kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message	Author	Files	Lines
2013-12-13	Merge branches 'doc.2013.12.03a', 'fixes.2013.12.12a', 'rcutorture.2013.12.03a' and 'sparse.2013.12.12a' into HEAD	Paul E. McKenney	125	-174/+3736
	doc.2013.12.03a: Topic branch for documentation changes. fixes.2013.12.12a: Topic branch for miscellaneous fixes. rcutorture.2013.12.03a: Topic branch for new rcutorture/KVM scripting. sparse.2013.12.12a: Topic branch for sparse-RCU changes.
2013-12-13	rcu: Remove "extern" from function declarations in kernel/rcu/rcu.h	Teodora Baluta	1	-1/+1
	Function prototypes don't need to have the "extern" keyword since this is the default behavior. Its explicit use is redundant. This commit therefore removes them. Signed-off-by: Teodora Baluta <teobaluta@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-12-13	rcu: Remove "extern" from function declarations in include/linux/*rcu*.h	Teodora Baluta	4	-61/+61
	Function prototypes don't need to have the "extern" keyword since this is the default behavior. Its explicit use is redundant. This commit therefore removes them. Signed-off-by: Teodora Baluta <teobaluta@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-12-13	rcu/torture: Dynamically allocate SRCU output buffer to avoid overflow	Chen Gang	1	-33/+34
	If the rcutorture SRCU output exceeds 4096 bytes, for example, if you have more than about 75 CPUs, it will overflow the current statically allocated buffer. This commit therefore replaces this static buffer with a dynamically allocated buffer whose size is based on the number of CPUs. Benefits: - Avoids both buffer overflow and output truncation. - Handles an arbitrarily large number of CPUs. - Straightforward implementation. Shortcomings: - Some memory is wasted: one CPU now consumes 50 - 60 bytes, and this patch provides 200 bytes. Therefore, for 1K CPUs, roughly 100KB of memory will be wasted. However, the memory is freed immediately after printing, so this wastage should not be a problem in practice. Testing (Fedora16 2 CPUs, 2GB RAM x86_64): - as module, with/without "torture_type=srcu". - built-in, not runnable at boot, with/without "torture_type=srcu". - built-in, runnable at boot, with/without "torture_type=srcu". Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
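The sizing rule in this commit message can be sketched in userspace C. This is only an illustration: the helper name format_per_cpu_stats() and the per-CPU entry format are hypothetical, the 200-bytes-per-CPU budget is the figure quoted above, and the real rcutorture code uses kernel allocators rather than malloc().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Budget taken from the commit message: 200 bytes of output per CPU. */
#define BYTES_PER_CPU 200

/*
 * Hypothetical helper: format one statistics entry per CPU into a buffer
 * whose size scales with the CPU count instead of being fixed at 4096.
 * Returns a malloc()ed string that the caller must free(), or NULL.
 */
char *format_per_cpu_stats(int nr_cpus, const long *counts)
{
	size_t size;
	size_t used = 0;
	char *buf;
	int cpu;

	assert(nr_cpus > 0);
	size = (size_t)nr_cpus * BYTES_PER_CPU;
	buf = malloc(size);
	if (!buf)
		return NULL;
	buf[0] = '\0';

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		int n = snprintf(buf + used, size - used, " %d(%ld)",
				 cpu, counts[cpu]);

		if (n < 0 || (size_t)n >= size - used) {
			buf[used] = '\0';	/* drop the partial entry */
			break;			/* never write past the end */
		}
		used += (size_t)n;
	}
	return buf;
}
```

Because the buffer is allocated just before printing and freed right after, the over-allocation per CPU only exists for the duration of one statistics dump.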
2013-12-13	rcu: Don't activate RCU core on NO_HZ_FULL CPUs	Paul E. McKenney	3	-0/+25
	Whenever a CPU receives a scheduling-clock interrupt, RCU checks to see if the RCU core needs anything from this CPU. If so, RCU raises RCU_SOFTIRQ to carry out any needed processing. This approach has worked well historically, but it is undesirable on NO_HZ_FULL CPUs. Such CPUs are expected to spend almost all of their time in userspace, so that scheduling-clock interrupts can be disabled while there is only one runnable task on the CPU in question. Unfortunately, raising any softirq has the potential to wake up ksoftirqd, which would provide the second runnable task on that CPU, preventing disabling of scheduling-clock interrupts. What is needed instead is for RCU to leave NO_HZ_FULL CPUs alone, relying on the grace-period kthreads' quiescent-state forcing to do any needed RCU work on behalf of those CPUs. This commit therefore refrains from raising RCU_SOFTIRQ on any NO_HZ_FULL CPUs during any grace periods that have been in effect for less than one second. The one-second limit handles the case where an inappropriate workload is running on a NO_HZ_FULL CPU that features lots of scheduling-clock interrupts, but no idle or userspace time. Reported-by: Mike Galbraith <bitbucket@online.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Mike Galbraith <bitbucket@online.de> Toasted-by: Frederic Weisbecker <fweisbec@gmail.com>
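The decision described above can be modeled as a small predicate. This is a userspace sketch under stated assumptions: the function name, its parameters, and the HZ value are made up for illustration, and the kernel's actual check lives in RCU's scheduling-clock path.

```c
#include <assert.h>
#include <stdbool.h>

#define HZ 1000UL	/* assumed tick rate for this sketch only */

static_assert(HZ > 0, "one second must be a nonzero number of ticks");

/*
 * Hypothetical predicate: should the scheduling-clock interrupt refrain
 * from raising RCU_SOFTIRQ on this CPU?  True only for NO_HZ_FULL CPUs
 * while the current grace period is less than one second old.
 */
bool rcu_skip_core_on_nohz_full(bool cpu_is_nohz_full,
				unsigned long now, unsigned long gp_start)
{
	if (!cpu_is_nohz_full)
		return false;
	/* Unsigned subtraction stays correct across counter wraparound. */
	return now - gp_start < HZ;
}
```

Once the grace period is a second old, the predicate returns false and the CPU is treated like any other, which is what covers the misbehaving-workload case mentioned in the message.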
2013-12-13	rcu: Warn on allegedly impossible rcu_read_unlock_special() from irq	Lai Jiangshan	1	-2/+6
	After commit #10f39bb1b2c1 (rcu: protect __rcu_read_unlock() against scheduler-using irq handlers), it is no longer possible to enter the main body of rcu_read_unlock_special() from an NMI, interrupt, or softirq handler. In theory, this implies that the check for "in_irq() || in_serving_softirq()" must always fail, so that in theory this check could be removed entirely. In practice, this commit wraps this condition with a WARN_ON_ONCE(). If this warning never triggers, then the condition will be removed entirely. [ paulmck: And one way of triggering the WARN_ON() is if a scheduling clock interrupt occurs in an RCU read-side critical section, setting RCU_READ_UNLOCK_NEED_QS, which is handled by rcu_read_unlock_special(). Updated this commit to return if only that bit was set. ] Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-12-13	rcu: Add an RCU_INITIALIZER for global RCU-protected pointers	Paul E. McKenney	1	-38/+42
	There is currently no way to initialize a global RCU-protected pointer without either putting up with sparse complaints or open-coding an obscure cast. This commit therefore creates RCU_INITIALIZER(), which is intended to be used as follows: struct foo __rcu *p = RCU_INITIALIZER(&my_rcu_structure); This commit also applies RCU_INITIALIZER() to eliminate repeated open-coded obscure casts in __rcu_assign_pointer(), RCU_INIT_POINTER(), and RCU_POINTER_INITIALIZER(). This commit also inlines __rcu_assign_pointer() into its only caller, rcu_assign_pointer(). Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
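The usage shown in this message can be tried outside the kernel with a small model. This sketch assumes that the sparse annotations __rcu and __force expand to nothing (as they do in non-sparse builds) and defines RCU_INITIALIZER() as a cast modeled on the one the commit describes; struct foo, my_rcu_structure, and check_initializer() are illustrative names only.

```c
#include <assert.h>
#include <stddef.h>

/* Sparse-only annotations; defined away here so the sketch compiles. */
#define __rcu
#define __force

/* Modeled on the commit's macro: cast to an __rcu pointer of v's type. */
#define RCU_INITIALIZER(v) (__typeof__(*(v)) __force __rcu *)(v)

struct foo {
	int a;
};

static struct foo my_rcu_structure = { .a = 42 };

/* Static initialization of a global RCU-protected pointer. */
struct foo __rcu *p = RCU_INITIALIZER(&my_rcu_structure);

/* Illustrative check that the initializer took effect. */
void check_initializer(void)
{
	assert(p == &my_rcu_structure);
	assert(p->a == 42);
}
```

The cast is an address constant, so it is usable in a file-scope initializer, which is the case that previously needed an open-coded cast.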
2013-12-13	rcu: Make rcu_assign_pointer's assignment volatile and type-safe	Josh Triplett	1	-1/+1
	The rcu_assign_pointer() primitive needs to use ACCESS_ONCE to make the assignment to the destination pointer volatile, to protect against compilers too clever for their own good. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-12-13	bonding: Use RCU_INIT_POINTER() for better overhead and for sparse	Paul E. McKenney	1	-1/+1
	Although rcu_assign_pointer() can be used to assign a constant NULL pointer, doing so gets you an unnecessary memory barrier and in some circumstances, sparse warnings. This commit therefore changes the rcu_assign_pointer() of NULL in __bond_release_one() to RCU_INIT_POINTER(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-12-13	rcu: Add comment on evaluate-once properties of rcu_assign_pointer().	Paul E. McKenney	1	-0/+8
	The rcu_assign_pointer() macro, as with most cpp macros, must not evaluate its argument more than once. And it in fact does not. But this might not be obvious to the casual observer, because one of the arguments appears no less than three times. However, one expansion is only visible to sparse (__CHECKER__), and one lives inside a typeof (where it will never be evaluated), so this is in fact safe. This commit therefore adds a comment making this explicit. Reported-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
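The typeof point can be checked with a small userspace experiment. The macro and function names below are hypothetical and much simpler than rcu_assign_pointer(); the sketch only shows that an argument appearing inside __typeof__ (a GCC/Clang extension) is not evaluated for ordinary, non-variably-modified types.

```c
#include <assert.h>

static int calls;	/* how many times the argument was evaluated */

static int *next_ptr(int *base)
{
	calls++;
	return base;
}

/*
 * The argument v appears twice in the expansion, but the first
 * appearance is inside __typeof__, so only the second is evaluated.
 */
#define CAST_SAME_TYPE(v) ((__typeof__(*(v)) *)(v))

/* Returns the number of times the macro evaluated its argument. */
int evaluations_of_argument(void)
{
	static int x;
	int *q;

	calls = 0;
	q = CAST_SAME_TYPE(next_ptr(&x));
	assert(q == &x);	/* the cast preserves the pointer value */
	return calls;
}
```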
2013-12-10	rcu: Provide better diagnostics for blocking in RCU callback functions	Paul E. McKenney	3	-0/+9
	Currently blocking in an RCU callback function will result in "scheduling while atomic", which could be triggered for any number of reasons. To aid debugging, this patch introduces a rcu_callback_map that is used to tie the inappropriate voluntary context switch back to the fact that the function is being invoked from within a callback. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-12-10	rcu: Improve SRCU's grace-period comments	Paul E. McKenney	1	-7/+49
	This commit documents the memory-barrier guarantees provided by synchronize_srcu() and call_srcu(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-12-10	rcu: Fix CONFIG_RCU_FANOUT_EXACT for odd fanout/leaf values	Paul E. McKenney	1	-2/+2
	Each element of the rcu_state structure's ->levelspread[] array is intended to contain the per-level fanout, where the zero-th element corresponds to the root of the rcu_node tree, and the last element corresponds to the leaves. In the CONFIG_RCU_FANOUT_EXACT case, this means that the last element should be filled in from CONFIG_RCU_FANOUT_LEAF (or from the rcu_fanout_leaf boot parameter, if provided) and that the remaining elements should be filled in from CONFIG_RCU_FANOUT. Unfortunately, the current code in rcu_init_levelspread() takes the opposite approach, placing CONFIG_RCU_FANOUT_LEAF in the zero-th element and CONFIG_RCU_FANOUT in the remaining elements. For typical power-of-two values, this generates odd but functional rcu_node trees. However, other values, for example CONFIG_RCU_FANOUT=3 and CONFIG_RCU_FANOUT_LEAF=2, generate trees that can leave some CPUs out of the grace-period computation, resulting in too-short grace periods and therefore a broken RCU implementation. This commit therefore fixes rcu_init_levelspread() to set the last ->levelspread[] array element from CONFIG_RCU_FANOUT_LEAF and the remaining elements from CONFIG_RCU_FANOUT, thus generating the intended rcu_node trees. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
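The corrected fill order can be written down as a short userspace model. The function and parameter names here are illustrative rather than the kernel's; the logic follows the description above, with index 0 as the root and the last index as the leaf level.

```c
#include <assert.h>

/*
 * Model of the fixed ordering: the last element gets the leaf fanout,
 * every other element gets the interior fanout.
 */
void init_levelspread_exact(int *levelspread, int num_lvls,
			    int fanout, int fanout_leaf)
{
	int i;

	assert(num_lvls >= 1);
	levelspread[num_lvls - 1] = fanout_leaf;	/* leaves */
	for (i = num_lvls - 2; i >= 0; i--)
		levelspread[i] = fanout;		/* root and interior */
}
```

With the message's example values (fanout 3, leaf fanout 2) and a three-level tree, this yields {3, 3, 2}, whereas the buggy ordering would have produced {2, 3, 3}.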
2013-12-10	rcu: Fix coccinelle warnings	Fengguang Wu	1	-1/+1
	This commit fixes the following coccinelle warning: kernel/rcu/tree.c:712:9-10: WARNING: return of 0/1 in function 'rcu_lockdep_current_cpu_online' with return type bool Return statements in functions returning bool should use true/false instead of 1/0. Generated by: coccinelle/misc/boolreturn.cocci Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-12-03	rcutorture: Stop tracking FSF's postal address	Paul E. McKenney	17	-32/+50
	All of the rcutorture scripts have the usual GPL header, which contains a long-obsolete postal address for FSF. To avoid the need to track the FSF office's movements, this commit substitutes the URL where GPL may be found. Reported-by: Greg KH <gregkh@linuxfoundation.org> Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-12-03	rcutorture: Move checkarg to functions.sh	Paul E. McKenney	2	-22/+27
	Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Flag errors and warnings with color coding	Paul E. McKenney	5	-12/+41
	The output of the rcutorture scripts often requires interpretation, so this commit simplifies this interpretation by tagging messages as BUGs (colored red) or WARNINGs (colored yellow). Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Record results from repeated runs of the same test scenario	Paul E. McKenney	1	-1/+15
	Repeatedly running a given test, for example, by repeating the name as in --configs "TREE08 TREE08 TREE08", records the results only of the last run of this test. This is because the earlier results are overwritten by the later results. This commit therefore checks for earlier results, using numbered file extensions to distinguish multiple runs. The earlier example would therefore create directories TREE08, TREE08.2, and TREE08.3. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Test summary at end of run with less chattiness	Paul E. McKenney	2	-1/+4
	This commit causes kvm.sh to invoke kvm-recheck.sh at the end of each run, and causes kvm-recheck.sh to print only the name of the test, not the full path to the corresponding Kconfig file. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Update comment in kvm.sh listing typical RCU trace events	Paul E. McKenney	1	-1/+1
	Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Add tracing-enabled version of TREE08	Paul E. McKenney	1	-0/+26
	The TREE08 Kconfig fragment does not enable tracing, which is appropriate for its test case. However, this can be inconvenient in cases where TREE08 locates RCU bugs. This commit therefore adds a TREE08-T that differs from TREE08 only in enabling CONFIG_RCU_TRACE. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Add --kmake-arg argument to kvm.sh	Paul E. McKenney	2	-1/+8
	This commit adds the --kmake-arg to kvm.sh, which allows passing in things like "V=1" to see the build commands, as well as enabling the CROSS_COMPILE= make macro used for cross-building. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Add --no-initrd argument to kvm.sh	Paul E. McKenney	2	-3/+8
	This commit adds the --no-initrd argument to kvm.sh, which permits initrd to be contained in a root partition specified by the --bootargs argument. Without --no-initrd, the kernel build expects an initrd directory in the same rcutorture directory that contains bin and configs. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Add --qemu-args argument to kvm.sh	Paul E. McKenney	1	-5/+11
	This commit adds the --qemu-args argument to kvm.sh that is required to pass boot devices down through to qemu. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Add --bootargs argument to specify additional boot arguments	Paul E. McKenney	1	-1/+7
	This commit allows easy specification of trace_event lists, among other things. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Add --buildonly dry-run capability	Paul E. McKenney	2	-0/+9
	This commit adds --buildonly, which does the builds specified by the --configs argument, but does not boot or test the resulting kernels. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Eliminate configdir argument from kvm-recheck.sh script	Paul E. McKenney	1	-4/+2
	Don't grab the configuration fragment from the configs directory because it might well have been changed since the test was run. Instead, use the ConfigFragment file that was placed in the results directory. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Allow Kconfig-related boot parameters to override	Paul E. McKenney	1	-4/+6
	As it stands, the default kernel boot parameters generated from the Kconfig fragment will override any supplied with the .boot file that can optionally accompany a Kconfig fragment. Rearrange ordering to permit the specific .boot arguments to override those generated by analyzing the Kconfig fragment. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Refactor to enable non-x86 architectures	Paul E. McKenney	3	-13/+102
	This commit expands the checks for what architecture is running to generate additional qemu-system- commands, then uses the resulting qemu-system- command name to choose different qemu arguments as needed for different architectures. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Eliminate --rcu-kvm argument	Paul E. McKenney	1	-14/+1
	The --rcu-kvm argument was intended to allow the scripts to live in an alternate location. Unfortunately, this prevents the kvm.sh script from using common functions until after it finished parsing arguments, because it doesn't know where to find them until then. However, "cp -a" and "ln -s" work pretty well, so lack of an --rcu-kvm argument can be easily worked around. This commit therefore removes this argument. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Remove decorative qemu argument	Paul E. McKenney	1	-1/+1
	The qemu -name argument doesn't seem to be useful in this environment, so this commit removes it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Abstract qemu-flavor identification	Paul E. McKenney	3	-6/+36
	The task of working out which flavor of qemu to use gets more complex as more types of CPUs are supported. Adding Power makes three in addition to 32-bit and 64-bit x86, so it is time to pull this out into a function. This commit therefore creates an identify_qemu function and also adds a --qemu-cmd command-line argument for the inevitable case where the identify_qemu cannot figure it out. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Eliminate duplicate .config-check code	Paul E. McKenney	1	-26/+1
	This commit uses configcheck.sh from within configinit.sh, replacing the imperfect inline expansion that was there before. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Make test output less chatty	Paul E. McKenney	3	-13/+16
	This commit drops no-longer-needed diagnostics from the output. Some of them are retained in logfiles, in case they are ever needed. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Refactor TINY_RCU test cases	Paul E. McKenney	8	-90/+68
	The TINY_RCU test cases were first put in place many years ago, and have been incrementally modified rather than being reworked. This commit therefore completes a long-overdue reworking of the TINY_RCU test cases. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Refactor TREE_RCU test cases	Paul E. McKenney	34	-480/+328
	The TREE_RCU test cases were first put in place many years ago, and have been incrementally modified rather than being reworked. This commit therefore completes a long-overdue reworking of the TREE_RCU test cases. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Add SRCU Kconfig-fragment files	Paul E. McKenney	4	-0/+18
	Use .boot facility to ease inclusion of SRCU into automated testing. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Add v3.12 version, which adds sysidle testing	Paul E. McKenney	23	-0/+521
	The v3.12 version of the kernel added the CONFIG_NO_HZ_FULL_SYSIDLE Kconfig parameter, so this commit adds a version transition at that point. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Add per-Kconfig fragment boot parameters	Paul E. McKenney	2	-0/+14
	Some Kconfig fragments require rcutorture module parameters to do optimal testing, for example, a configuration for SRCU would need rcutorture.torture_type=srcu. This commit therefore adds a per-Kconfig-fragment boot-parameter capability. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Add per-version default Kconfig fragments and module parameters	Paul E. McKenney	13	-32/+242
	Different Kconfig parameters apply to different kernel versions, as do different rcutorture module parameters. This commit allows the rcutorture test scripts to adjust for different kernel versions. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Add kernel-version argument	Paul E. McKenney	44	-1/+906
	Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Add datestamp argument to kvm.sh	Paul E. McKenney	1	-5/+11
	Allow datestamp to be specified to allow tests to be broken up and run in parallel. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcutorture: Add KVM-based test framework	Paul E. McKenney	41	-0/+1656
	This commit adds the test framework that I used to test RCU under KVM. This consists of a group of scripts and Kconfig fragments. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Greg KH <gregkh@linuxfoundation.org>
2013-12-03	rcu: Let the world know when RCU adjusts its geometry	Paul E. McKenney	1	-0/+2
	Some RCU bugs have been specific to the layout of the rcu_node tree, but RCU will silently adjust the tree at boot time if appropriate. This obscures valuable debugging information, so print a message when this happens. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-12-03	rcu: Fix srcu_barrier() docbook header	Paul E. McKenney	1	-0/+1
	The srcu_barrier() docbook header left out the "sp" argument, so this commit adds that argument's docbook text. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-12-03	rcu: Allow task-level idle entry/exit nesting	Paul E. McKenney	1	-6/+8
	The current task-level idle entry/exit code forces an entry/exit on each call, regardless of the nesting level. This commit therefore properly accounts for nesting. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
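Nesting-aware entry and exit can be modeled with a simple counter. This is a deliberately simplified sketch: the struct and function names are hypothetical, and the kernel's real ->dynticks_nesting bookkeeping is more involved than a plain integer.

```c
#include <assert.h>
#include <stdbool.h>

struct idle_tracker {
	int nesting;		/* > 0 while non-idle from RCU's viewpoint */
	int transitions;	/* real idle entries plus real idle exits */
};

/* Returns true only if this call actually entered idle. */
bool idle_enter(struct idle_tracker *t)
{
	assert(t->nesting > 0);
	if (--t->nesting > 0)
		return false;	/* still nested: only adjust the count */
	t->transitions++;
	return true;
}

/* Returns true only if this call actually left idle. */
bool idle_exit(struct idle_tracker *t)
{
	if (t->nesting++ > 0)
		return false;	/* already non-idle: only adjust the count */
	t->transitions++;
	return true;
}
```

Starting from a nesting level of 2, two idle_enter() calls followed by two idle_exit() calls produce exactly one real entry and one real exit, rather than a forced transition on every call.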
2013-12-03	rcu: Break call_rcu() deadlock involving scheduler and perf	Paul E. McKenney	5	-25/+86
	Dave Jones got the following lockdep splat:
	> ======================================================
	> [ INFO: possible circular locking dependency detected ]
	> 3.12.0-rc3+ #92 Not tainted
	> -------------------------------------------------------
	> trinity-child2/15191 is trying to acquire lock:
	>  (&rdp->nocb_wq){......}, at: [<ffffffff8108ff43>] __wake_up+0x23/0x50
	>
	> but task is already holding lock:
	>  (&ctx->lock){-.-...}, at: [<ffffffff81154c19>] perf_event_exit_task+0x109/0x230
	>
	> which lock already depends on the new lock.
	>
	>
	> the existing dependency chain (in reverse order) is:
	>
	> -> #3 (&ctx->lock){-.-...}:
	>        [<ffffffff810cc243>] lock_acquire+0x93/0x200
	>        [<ffffffff81733f90>] _raw_spin_lock+0x40/0x80
	>        [<ffffffff811500ff>] __perf_event_task_sched_out+0x2df/0x5e0
	>        [<ffffffff81091b83>] perf_event_task_sched_out+0x93/0xa0
	>        [<ffffffff81732052>] __schedule+0x1d2/0xa20
	>        [<ffffffff81732f30>] preempt_schedule_irq+0x50/0xb0
	>        [<ffffffff817352b6>] retint_kernel+0x26/0x30
	>        [<ffffffff813eed04>] tty_flip_buffer_push+0x34/0x50
	>        [<ffffffff813f0504>] pty_write+0x54/0x60
	>        [<ffffffff813e900d>] n_tty_write+0x32d/0x4e0
	>        [<ffffffff813e5838>] tty_write+0x158/0x2d0
	>        [<ffffffff811c4850>] vfs_write+0xc0/0x1f0
	>        [<ffffffff811c52cc>] SyS_write+0x4c/0xa0
	>        [<ffffffff8173d4e4>] tracesys+0xdd/0xe2
	>
	> -> #2 (&rq->lock){-.-.-.}:
	>        [<ffffffff810cc243>] lock_acquire+0x93/0x200
	>        [<ffffffff81733f90>] _raw_spin_lock+0x40/0x80
	>        [<ffffffff810980b2>] wake_up_new_task+0xc2/0x2e0
	>        [<ffffffff81054336>] do_fork+0x126/0x460
	>        [<ffffffff81054696>] kernel_thread+0x26/0x30
	>        [<ffffffff8171ff93>] rest_init+0x23/0x140
	>        [<ffffffff81ee1e4b>] start_kernel+0x3f6/0x403
	>        [<ffffffff81ee1571>] x86_64_start_reservations+0x2a/0x2c
	>        [<ffffffff81ee1664>] x86_64_start_kernel+0xf1/0xf4
	>
	> -> #1 (&p->pi_lock){-.-.-.}:
	>        [<ffffffff810cc243>] lock_acquire+0x93/0x200
	>        [<ffffffff8173419b>] _raw_spin_lock_irqsave+0x4b/0x90
	>        [<ffffffff810979d1>] try_to_wake_up+0x31/0x350
	>        [<ffffffff81097d62>] default_wake_function+0x12/0x20
	>        [<ffffffff81084af8>] autoremove_wake_function+0x18/0x40
	>        [<ffffffff8108ea38>] __wake_up_common+0x58/0x90
	>        [<ffffffff8108ff59>] __wake_up+0x39/0x50
	>        [<ffffffff8110d4f8>] __call_rcu_nocb_enqueue+0xa8/0xc0
	>        [<ffffffff81111450>] __call_rcu+0x140/0x820
	>        [<ffffffff81111b8d>] call_rcu+0x1d/0x20
	>        [<ffffffff81093697>] cpu_attach_domain+0x287/0x360
	>        [<ffffffff81099d7e>] build_sched_domains+0xe5e/0x10a0
	>        [<ffffffff81efa7fc>] sched_init_smp+0x3b7/0x47a
	>        [<ffffffff81ee1f4e>] kernel_init_freeable+0xf6/0x202
	>        [<ffffffff817200be>] kernel_init+0xe/0x190
	>        [<ffffffff8173d22c>] ret_from_fork+0x7c/0xb0
	>
	> -> #0 (&rdp->nocb_wq){......}:
	>        [<ffffffff810cb7ca>] __lock_acquire+0x191a/0x1be0
	>        [<ffffffff810cc243>] lock_acquire+0x93/0x200
	>        [<ffffffff8173419b>] _raw_spin_lock_irqsave+0x4b/0x90
	>        [<ffffffff8108ff43>] __wake_up+0x23/0x50
	>        [<ffffffff8110d4f8>] __call_rcu_nocb_enqueue+0xa8/0xc0
	>        [<ffffffff81111450>] __call_rcu+0x140/0x820
	>        [<ffffffff81111bb0>] kfree_call_rcu+0x20/0x30
	>        [<ffffffff81149abf>] put_ctx+0x4f/0x70
	>        [<ffffffff81154c3e>] perf_event_exit_task+0x12e/0x230
	>        [<ffffffff81056b8d>] do_exit+0x30d/0xcc0
	>        [<ffffffff8105893c>] do_group_exit+0x4c/0xc0
	>        [<ffffffff810589c4>] SyS_exit_group+0x14/0x20
	>        [<ffffffff8173d4e4>] tracesys+0xdd/0xe2
	>
	> other info that might help us debug this:
	>
	> Chain exists of:
	>   &rdp->nocb_wq --> &rq->lock --> &ctx->lock
	>
	>  Possible unsafe locking scenario:
	>
	>        CPU0                    CPU1
	>        ----                    ----
	>   lock(&ctx->lock);
	>                                lock(&rq->lock);
	>                                lock(&ctx->lock);
	>   lock(&rdp->nocb_wq);
	>
	>  *** DEADLOCK ***
	>
	> 1 lock held by trinity-child2/15191:
	>  #0:  (&ctx->lock){-.-...}, at: [<ffffffff81154c19>] perf_event_exit_task+0x109/0x230
	>
	> stack backtrace:
	> CPU: 2 PID: 15191 Comm: trinity-child2 Not tainted 3.12.0-rc3+ #92
	>  ffffffff82565b70 ffff880070c2dbf8 ffffffff8172a363 ffffffff824edf40
	>  ffff880070c2dc38 ffffffff81726741 ffff880070c2dc90 ffff88022383b1c0
	>  ffff88022383aac0 0000000000000000 ffff88022383b188 ffff88022383b1c0
	> Call Trace:
	>  [<ffffffff8172a363>] dump_stack+0x4e/0x82
	>  [<ffffffff81726741>] print_circular_bug+0x200/0x20f
	>  [<ffffffff810cb7ca>] __lock_acquire+0x191a/0x1be0
	>  [<ffffffff810c6439>] ? get_lock_stats+0x19/0x60
	>  [<ffffffff8100b2f4>] ? native_sched_clock+0x24/0x80
	>  [<ffffffff810cc243>] lock_acquire+0x93/0x200
	>  [<ffffffff8108ff43>] ? __wake_up+0x23/0x50
	>  [<ffffffff8173419b>] _raw_spin_lock_irqsave+0x4b/0x90
	>  [<ffffffff8108ff43>] ? __wake_up+0x23/0x50
	>  [<ffffffff8108ff43>] __wake_up+0x23/0x50
	>  [<ffffffff8110d4f8>] __call_rcu_nocb_enqueue+0xa8/0xc0
	>  [<ffffffff81111450>] __call_rcu+0x140/0x820
	>  [<ffffffff8109bc8f>] ? local_clock+0x3f/0x50
	>  [<ffffffff81111bb0>] kfree_call_rcu+0x20/0x30
	>  [<ffffffff81149abf>] put_ctx+0x4f/0x70
	>  [<ffffffff81154c3e>] perf_event_exit_task+0x12e/0x230
	>  [<ffffffff81056b8d>] do_exit+0x30d/0xcc0
	>  [<ffffffff810c9af5>] ? trace_hardirqs_on_caller+0x115/0x1e0
	>  [<ffffffff810c9bcd>] ? trace_hardirqs_on+0xd/0x10
	>  [<ffffffff8105893c>] do_group_exit+0x4c/0xc0
	>  [<ffffffff810589c4>] SyS_exit_group+0x14/0x20
	>  [<ffffffff8173d4e4>] tracesys+0xdd/0xe2
	The underlying problem is that perf is invoking call_rcu() with the scheduler locks held, but in NOCB mode, call_rcu() will with high probability invoke the scheduler -- which just might want to use its locks. The reason that call_rcu() needs to invoke the scheduler is to wake up the corresponding rcuo callback-offload kthread, which does the job of starting up a grace period and invoking the callbacks afterwards.
	One solution (championed on a related problem by Lai Jiangshan) is to simply defer the wakeup to some point where scheduler locks are no longer held. Since we don't want to unnecessarily incur the cost of such deferral, the task before us is threefold:
	1. Determine when it is likely that a relevant scheduler lock is held.
	2. Defer the wakeup in such cases.
	3. Ensure that all deferred wakeups eventually happen, preferably sooner rather than later.
	We use irqs_disabled_flags() as a proxy for relevant scheduler locks being held. This works because the relevant locks are always acquired with interrupts disabled. We may defer more often than needed, but that is at least safe.
	The wakeup deferral is tracked via a new field in the per-CPU and per-RCU-flavor rcu_data structure, namely ->nocb_defer_wakeup.
	This flag is checked by the RCU core processing. The __rcu_pending() function now checks this flag, which causes rcu_check_callbacks() to initiate RCU core processing at each scheduling-clock interrupt where this flag is set. Of course this is not sufficient because scheduling-clock interrupts are often turned off (the things we used to be able to count on!). So the flags are also checked on entry to any state that RCU considers to be idle, which includes both NO_HZ_IDLE idle state and NO_HZ_FULL user-mode-execution state.
	This approach should allow call_rcu() to be invoked regardless of what locks you might be holding, the key word being "should".
	Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org>
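The three-step deferral scheme from this commit message can be modeled in a few lines of userspace C. This is a sketch under stated assumptions: struct nocb_cpu and the function names are hypothetical, the defer_wakeup field stands in for ->nocb_defer_wakeup, and a plain boolean argument stands in for the irqs_disabled_flags() check.

```c
#include <assert.h>
#include <stdbool.h>

struct nocb_cpu {
	bool defer_wakeup;	/* models ->nocb_defer_wakeup */
	int wakeups_done;	/* wakeups actually delivered */
};

/* Stand-in for waking the rcuo callback-offload kthread. */
static void wake_offload_kthread(struct nocb_cpu *c)
{
	c->wakeups_done++;
}

/* Steps 1 and 2: on enqueue, defer if scheduler locks might be held. */
void enqueue_callback(struct nocb_cpu *c, bool irqs_disabled)
{
	if (irqs_disabled) {
		c->defer_wakeup = true;
		return;
	}
	wake_offload_kthread(c);
}

/* Step 3: run from RCU core processing and on entry to idle states. */
void do_deferred_wakeup(struct nocb_cpu *c)
{
	if (!c->defer_wakeup)
		return;
	c->defer_wakeup = false;
	wake_offload_kthread(c);
}
```

Deferring whenever interrupts are disabled is conservative: it may postpone wakeups that would have been safe, but it never performs one while a scheduler lock could be held.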
2013-12-03	rcu: Fix and comment ordering around wait_event()	Paul E. McKenney	3	-4/+13
	It is all too easy to forget that wait_event() does not necessarily imply a full memory barrier. The case where it does not is where the condition transitions to true just as wait_event() starts execution. This is actually a feature: The standard use of wait_event() involves locking, in which case the locks provide the needed ordering (you hold a lock across the wake_up() and acquire that same lock after wait_event() returns). Given that I did forget that wait_event() does not necessarily imply a full memory barrier in one case, this commit fixes that case. This commit also adds comments calling out the placement of existing memory barriers relied on by wait_event() calls. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-12-03	rcu: Kick CPU halfway to RCU CPU stall warning	Paul E. McKenney	2	-1/+27
	When an RCU CPU stall warning occurs, the CPU invokes resched_cpu() on itself. This can help move the grace period forward in some situations, but it would be even better to do this -before- the RCU CPU stall warning. This commit therefore causes resched_cpu() to be called every five jiffies once the system is halfway to an RCU CPU stall warning. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
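The timing rule can be expressed as a small predicate. This is an illustrative model only: the function name, its parameters, and the KICK_INTERVAL constant are hypothetical, with the five-jiffy interval and the halfway threshold taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

#define KICK_INTERVAL 5UL	/* jiffies between kicks, per the message */

/*
 * Hypothetical predicate: kick the CPU only once the grace period has
 * run for at least half the stall-warning timeout, and then at most
 * once every KICK_INTERVAL jiffies.
 */
bool should_kick_cpu(unsigned long now, unsigned long gp_start,
		     unsigned long stall_timeout, unsigned long last_kick)
{
	assert(stall_timeout > 0);
	if (now - gp_start < stall_timeout / 2)
		return false;	/* not yet halfway to a stall warning */
	return now - last_kick >= KICK_INTERVAL;
}
```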
2013-12-03	documentation: Update circular buffer for load-acquire/store-release	Paul E. McKenney	1	-18/+19
	This commit replaces full barriers by targeted use of load-acquire and store-release. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Restore comments as suggested by David Howells. ]
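The load-acquire/store-release pattern can be illustrated with a single-producer, single-consumer ring in userspace C11. This is not the kernel documentation's code: it uses <stdatomic.h> in place of the kernel's smp_load_acquire() and smp_store_release(), free-running indices instead of the CIRC_SPACE()/CIRC_CNT() helpers, and illustrative names throughout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8u

static_assert((RING_SIZE & (RING_SIZE - 1)) == 0,
	      "RING_SIZE must be a power of two");

struct ring {
	int items[RING_SIZE];
	atomic_uint head;	/* written only by the producer */
	atomic_uint tail;	/* written only by the consumer */
};

bool ring_produce(struct ring *r, int item)
{
	unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
	/* Acquire pairs with the consumer's release: the slot is free. */
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail >= RING_SIZE)
		return false;	/* full */
	r->items[head & (RING_SIZE - 1)] = item;
	/* Release: the item is visible before the new head is. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return true;
}

bool ring_consume(struct ring *r, int *item)
{
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	/* Acquire pairs with the producer's release: the item is readable. */
	unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (head == tail)
		return false;	/* empty */
	*item = r->items[tail & (RING_SIZE - 1)];
	/* Release: the read completes before the slot is handed back. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return true;
}
```

Each side reads its own index with relaxed ordering because it is the only writer of that index; ordering is needed only on the index owned by the other side.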