summaryrefslogtreecommitdiff
path: root/kernel/rcu/tree.c
AgeCommit message (Collapse)AuthorFilesLines
2015-10-06rcu: Eliminate panic when silly boot-time fanout specifiedPaul E. McKenney1-9/+11
This commit loosens rcutree.rcu_fanout_leaf range checks and replaces a panic() with a fallback to compile-time values. This fallback is accompanied by a WARN_ON(), and both occur when the rcutree.rcu_fanout_leaf value is too small to accommodate the number of CPUs. For example, given the current four-level limit for the rcu_node tree, a system with more than 16 CPUs built with CONFIG_FANOUT=2 must have rcutree.rcu_fanout_leaf larger than 2. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06rcu: Don't disable preemption for Tiny and Tree RCU readersBoqun Feng1-0/+9
Because preempt_disable() maps to barrier() for non-debug builds, it forces the compiler to spill and reload registers. Because Tree RCU and Tiny RCU now only appear in CONFIG_PREEMPT=n builds, these barrier() instances generate needless extra code for each instance of rcu_read_lock() and rcu_read_unlock(). This extra code slows down Tree RCU and bloats Tiny RCU. This commit therefore removes the preempt_disable() and preempt_enable() from the non-preemptible implementations of __rcu_read_lock() and __rcu_read_unlock(), respectively. However, for debug purposes, preempt_disable() and preempt_enable() are still invoked if CONFIG_PREEMPT_COUNT=y, because this allows detection of sleeping inside atomic sections in non-preemptible kernels. However, Tiny and Tree RCU operates by coalescing all RCU read-side critical sections on a given CPU that lie between successive quiescent states. It is therefore necessary to compensate for removing barriers from __rcu_read_lock() and __rcu_read_unlock() by adding them to a couple of the RCU functions invoked during quiescent states, namely to rcu_all_qs() and rcu_note_context_switch(). However, note that the latter is more paranoia than necessity, at least until link-time optimizations become more aggressive. This is based on an earlier patch by Paul E. McKenney, fixing a bug encountered in kernels built with CONFIG_PREEMPT=n and CONFIG_PREEMPT_COUNT=y. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-10-06rcu: Use rcu_callback_t in call_rcu*() and friendsBoqun Feng1-4/+4
As we now have rcu_callback_t typedefs as the type of rcu callbacks, we should use it in call_rcu*() and friends as the type of parameters. This could save us a few lines of code and make it clear which function requires an rcu callbacks rather than other callbacks as its argument. Besides, this can also help cscope to generate a better database for code reading. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-09-21rcu: Make ->cpu_no_qs be a union for aggregate ORPaul E. McKenney1-11/+11
This commit converts the rcu_data structure's ->cpu_no_qs field to a union. The bytewise side of this union allows individual access to indications as to whether this CPU needs to find a quiescent state for a normal (.norm) and/or expedited (.exp) grace period. The setwise side of the union allows testing whether or not a quiescent state is needed at all, for either type of grace period. For now, only .norm is used. A later commit will introduce the expedited usage. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-09-21rcu: Invert passed_quiesce and rename to cpu_no_qsPaul E. McKenney1-11/+11
This commit inverts the sense of the rcu_data structure's ->passed_quiesce field and renames it to ->cpu_no_qs. This will allow a later commit to use an "aggregate OR" operation to test expedited as well as normal grace periods without added overhead. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-09-21rcu: Rename qs_pending to core_needs_qsPaul E. McKenney1-7/+7
An upcoming commit needs to invert the sense of the ->passed_quiesce rcu_data structure field, so this commit is taking this opportunity to clarify things a bit by renaming ->qs_pending to ->core_needs_qs. So if !rdp->core_needs_qs, then this CPU need not concern itself with quiescent states, in particular, it need not acquire its leaf rcu_node structure's ->lock to check. Otherwise, it needs to report the next quiescent state. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-09-21rcu: Move synchronize_sched_expedited() to combining treePaul E. McKenney1-41/+82
Currently, synchronize_sched_expedited() uses a single global counter to track the number of remaining context switches that the current expedited grace period must wait on. This is problematic on large systems, where the resulting memory contention can be pathological. This commit therefore makes synchronize_sched_expedited() instead use the combining tree in the same manner as synchronize_rcu_expedited(), keeping memory contention down to a dull roar. This commit creates a temporary function sync_sched_exp_select_cpus() that is very similar to sync_rcu_exp_select_cpus(). A later commit will consolidate these two functions, which becomes possible when synchronize_sched_expedited() switches from stop_one_cpu_nowait() to smp_call_function_single(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-09-21rcu: Use single-stage IPI algorithm for RCU expedited grace periodPaul E. McKenney1-20/+55
The current preemptible-RCU expedited grace-period algorithm invokes synchronize_sched_expedited() to enqueue all tasks currently running in a preemptible-RCU read-side critical section, then waits for all the ->blkd_tasks lists to drain. This works, but results in both an IPI and a double context switch even on CPUs that do not happen to be running in a preemptible RCU read-side critical section. This commit implements a new algorithm that causes less OS jitter. This new algorithm IPIs all online CPUs that are not idle (from an RCU perspective), but refrains from self-IPIs. If a CPU receiving this IPI is not in a preemptible RCU read-side critical section (or is just now exiting one), it pushes quiescence up the rcu_node tree, otherwise, it sets a flag that will be handled by the upcoming outermost rcu_read_unlock(), which will then push quiescence up the tree. The expedited grace period must of course wait on any pre-existing blocked readers, and newly blocked readers must be queued carefully based on the state of both the normal and the expedited grace periods. This new queueing approach also avoids the need to update boost state, courtesy of the fact that blocked tasks are no longer ever migrated to the root rcu_node structure. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-09-21rcu: Consolidate tree setup for synchronize_rcu_expedited()Paul E. McKenney1-1/+85
This commit replaces sync_rcu_preempt_exp_init1(() and sync_rcu_preempt_exp_init2() with sync_exp_reset_tree_hotplug() and sync_exp_reset_tree(), which will also be used by synchronize_sched_expedited(), and sync_rcu_exp_select_nodes(), which contains code specific to synchronize_rcu_expedited(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-09-21rcu: Move rcu_report_exp_rnp() to allow consolidationPaul E. McKenney1-0/+66
This is a nearly pure code-movement commit, moving rcu_report_exp_rnp(), sync_rcu_preempt_exp_done(), and rcu_preempted_readers_exp() so that later commits can make synchronize_sched_expedited() use them. The non-code-movement portion of this commit tags rcu_report_exp_rnp() as __maybe_unused to avoid build errors when CONFIG_PREEMPT=n. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-09-21rcu: Use rsp->expedited_wq instead of sync_rcu_preempt_exp_wqPaul E. McKenney1-1/+1
Now that there is an ->expedited_wq waitqueue in each rcu_state structure, there is no need for the sync_rcu_preempt_exp_wq global variable. This commit therefore substitutes ->expedited_wq for sync_rcu_preempt_exp_wq. It also initializes ->expedited_wq only once at boot instead of at the start of each expedited grace period. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-09-21rcu: Suppress lockdep false positive for rcp->exp_funnel_mutexPaul E. McKenney1-0/+5
In kernels built with CONFIG_PREEMPT=y, synchronize_rcu_expedited() invokes synchronize_sched_expedited() while holding RCU-preempt's root rcu_node structure's ->exp_funnel_mutex, which is acquired after the rcu_data structure's ->exp_funnel_mutex. The first thing that synchronize_sched_expedited() will do is acquire RCU-sched's rcu_data structure's ->exp_funnel_mutex. There is no danger of an actual deadlock because the locking order is always from RCU-preempt's expedited mutexes to those of RCU-sched. Unfortunately, lockdep considers both rcu_data structures' ->exp_funnel_mutex to be in the same lock class and therefore reports a deadlock cycle. This commit silences this false positive by placing RCU-sched's rcu_data structures' ->exp_funnel_mutex locks into their own lock class. Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-08-04Merge branches 'fixes.2015.07.22a' and 'initexp.2015.08.04a' into HEADPaul E. McKenney1-262/+327
fixes.2015.07.22a: Miscellaneous fixes. initexp.2015.08.04a: Initialization and expedited updates. (Single branch due to conflicts.)
2015-08-04rcu: Silence lockdep false positive for expedited grace periodsPaul E. McKenney1-2/+10
In a CONFIG_PREEMPT=y kernel, synchronize_rcu_expedited() acquires the ->exp_funnel_mutex in rcu_preempt_state, then invokes synchronize_sched_expedited, which acquires the ->exp_funnel_mutex in rcu_sched_state. There can be no deadlock because rcu_preempt_state ->exp_funnel_mutex acquisition always precedes that of rcu_sched_state. But lockdep does not know that, so it gives false-positive splats. This commit therefore associates a separate lock_class_key structure with the rcu_sched_state structure's ->exp_funnel_mutex, allowing lockdep to see the lock ordering, avoiding the false positives. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-23rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()Paul E. McKenney1-14/+14
This commit renames rcu_lockdep_assert() to RCU_LOCKDEP_WARN() for consistency with the WARN() series of macros. This also requires inverting the sense of the conditional, which this commit also does. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Ingo Molnar <mingo@kernel.org>
2015-07-23rcu: Make rcu_is_watching() really notraceAlexei Starovoitov1-2/+2
Although rcu_is_watching() is marked notrace, it invokes preempt_disable() and preempt_enable(), both of which can be traced. This defeats the purpose of the notrace on rcu_is_watching(), so this commit substitutes preempt_disable_notrace() and preempt_enable_notrace(). Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org>
2015-07-23rcu: Add RCU-sched flavors of get-state and cond-syncPaul E. McKenney1-0/+52
The get_state_synchronize_rcu() and cond_synchronize_rcu() functions allow polling for grace-period completion, with an actual wait for a grace period occurring only when cond_synchronize_rcu() is called too soon after the corresponding get_state_synchronize_rcu(). However, these functions work only for vanilla RCU. This commit adds the get_state_synchronize_sched() and cond_synchronize_sched(), which provide the same capability for RCU-sched. Reported-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Add fastpath bypassing funnel lockingPaul E. McKenney1-0/+16
In the common case, there will be only one expedited grace period in the system at a given time, in which case it is not helpful to use funnel locking. This commit therefore adds a fastpath that bypasses funnel locking when the root ->exp_funnel_mutex is not held. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Rename RCU_GP_DONE_FQS to RCU_GP_DOING_FQSPaul E. McKenney1-1/+1
The grace-period kthread sleeps waiting to do a force-quiescent-state scan, and when awakened sets rsp->gp_state to RCU_GP_DONE_FQS. However, this is confusing because the kthread has not done the force-quiescent-state, but is instead just starting to do it. This commit therefore renames RCU_GP_DONE_FQS to RCU_GP_DOING_FQS in order to make things a bit easier on reviewers. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Pull out wait_event*() condition into helper functionPaul E. McKenney1-5/+21
The condition for the wait_event_interruptible_timeout() that waits to do the next force-quiescent-state scan is a bit ornate: ((gf = READ_ONCE(rsp->gp_flags)) & RCU_GP_FLAG_FQS) || (!READ_ONCE(rnp->qsmask) && !rcu_preempt_blocked_readers_cgp(rnp)) This commit therefore pulls this condition out into a helper function and comments its component conditions. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Add stall warnings to synchronize_sched_expedited()Paul E. McKenney1-4/+54
Although synchronize_sched_expedited() historically has no RCU CPU stall warnings, the availability of the rcupdate.rcu_expedited boot parameter invalidates the old assumption that synchronize_sched()'s stall warnings would suffice. This commit therefore adds RCU CPU stall warnings to synchronize_sched_expedited(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Extend expedited funnel locking to rcu_data structurePaul E. McKenney1-3/+16
The strictly rcu_node based funnel-locking scheme works well in many cases, but systems with CONFIG_RCU_FANOUT_LEAF=64 won't necessarily get all that much concurrency. This commit therefore extends the funnel locking into the per-CPU rcu_data structure, providing concurrency equal to the number of CPUs. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Consolidate last open-coded expedited memory barrierPaul E. McKenney1-1/+1
One of the requirements on RCU grace periods is that if there is a causal chain of operations that starts after one grace period and ends before another grace period, then the two grace periods must be serialized. There has been (and might still be) code that relies on this, for example, certain types of reference-counting code that does a call_rcu() within an RCU callback function. This requirement is why there is an smp_mb() at the end of both synchronize_sched_expedited() and synchronize_rcu_expedited(). However, this is the only smp_mb() in these functions, so it would be nicer to consolidate it into rcu_exp_gp_seq_end(). This commit does just that. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Apply rcu_seq operations to _rcu_barrier()Paul E. McKenney1-53/+19
The rcu_seq operations were open-coded in _rcu_barrier(), so this commit replaces the open-coding with the shiny new rcu_seq operations. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Use funnel locking for synchronize_rcu_expedited()'s polling loopPaul E. McKenney1-7/+8
This commit gets rid of synchronize_rcu_expedited()'s mutex_trylock() polling loop in favor of the funnel-locking scheme that was abstracted from synchronize_sched_expedited(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Fix synchronize_sched_expedited() type error for "s"Paul E. McKenney1-1/+1
The type of "s" has been "long" rather than the correct "unsigned long" for quite some time. This commit fixes this type error. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Abstract funnel locking from synchronize_sched_expedited()Paul E. McKenney1-33/+47
This commit abstracts funnel locking from synchronize_sched_expedited() so that it may be used by synchronize_rcu_expedited(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Abstract sequence counting from synchronize_sched_expedited()Paul E. McKenney1-10/+58
This commit creates rcu_exp_gp_seq_start() and rcu_exp_gp_seq_end() to bracket an expedited grace period, rcu_exp_gp_seq_snap() to snapshot the sequence counter, and rcu_exp_gp_seq_done() to check to see if a full expedited grace period has elapsed since the snapshot. These will be applied to synchronize_rcu_expedited(). These are defined in terms of underlying rcu_seq_start(), rcu_seq_end(), rcu_seq_snap(), rcu_seq_done(), which will be applied to _rcu_barrier(). One reason that this commit doesn't use the seqcount primitives themselves is that the smp_wmb() in those primitive is insufficient due to the fact that expedited grace periods do reads as well as writes. In addition, the read-side seqcount primitives detect a potentially partial change, where the expedited primitives instead need a guaranteed full change. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Make expedited GP CPU stoppage asynchronousPeter Zijlstra1-14/+17
Sequentially stopping the CPUs slows down expedited grace periods by at least a factor of two, based on rcutorture's grace-period-per-second rate. This is a conservative measure because rcutorture uses unusually long RCU read-side critical sections and because rcutorture periodically quiesces the system in order to test RCU's ability to ramp down to and up from the idle state. This commit therefore replaces the stop_one_cpu() with stop_one_cpu_nowait(), using an atomic-counter scheme to determine when all CPUs have passed through the stopped state. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Get rid of synchronize_sched_expedited()'s polling loopPaul E. McKenney1-55/+40
This commit gets rid of synchronize_sched_expedited()'s mutex_trylock() polling loop in favor of a funnel-locking scheme based on the rcu_node tree. The work-done check is done at each level of the tree, allowing high-contention situations to be resolved quickly with reasonable levels of mutex contention. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Rework synchronize_sched_expedited() counter handlingPaul E. McKenney1-68/+30
Now that synchronize_sched_expedited() have a mutex, it can use simpler work-already-done detection scheme. This commit simplifies this scheme by using something similar to the sequence-locking counter scheme. A counter is incremented before and after each grace period, so that the counter is odd in the midst of the grace period and even otherwise. So if the counter has advanced to the second even number that is greater than or equal to the snapshot, the required grace period has already happened. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Switch synchronize_sched_expedited() to stop_one_cpu()Peter Zijlstra1-27/+14
The synchronize_sched_expedited() currently invokes try_stop_cpus(), which schedules the stopper kthreads on each online non-idle CPU, and waits until all those kthreads are running before letting any of them stop. This is disastrous for real-time workloads, which get hit with a preemption that is as long as the longest scheduling latency on any CPU, including any non-realtime housekeeping CPUs. This commit therefore switches to using stop_one_cpu() on each CPU in turn. This avoids inflicting the worst-case scheduling latency on the worst-case CPU onto all other CPUs, and also simplifies the code a little bit. Follow-up commits will simplify the counter-snapshotting algorithm and convert a number of the counters that are now protected by the new ->expedited_mutex to non-atomic. Signed-off-by: Peter Zijlstra <peterz@infradead.org> [ paulmck: Kept stop_one_cpu(), dropped disabling of "guardrails". ] Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-18rcu: Reset rcu_fanout_leaf if out of boundsPaul E. McKenney1-0/+1
Currently if the rcu_fanout_leaf boot parameter is out of bounds (that is, less than RCU_FANOUT_LEAF or greater than the number of bits in an unsigned long), a warning is issued and execution continues with the out-of-bounds value. This can result in all manner of failures, so this patch resets rcu_fanout_leaf to RCU_FANOUT_LEAF when an out-of-bounds condition is detected. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-16rcu: Limit count of static data to the number of RCU levelsAlexander Gordeev1-17/+4
Although a number of RCU levels may be less than the current maximum of four, some static data associated with each level are allocated for all four levels. As result, the extra data never get accessed and just wast memory. This update limits count of allocated items to the number of used RCU levels. Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-16rcu: Remove unnecessary fields from rcu_state structureAlexander Gordeev1-12/+15
Members rcu_state::levelcnt[] and rcu_state::levelspread[] are only used at init. There is no reason to keep them afterwards. Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-16rcu: Limit rcu_capacity[] size to RCU_NUM_LVLS itemsAlexander Gordeev1-6/+6
Number of items in rcu_capacity[] array is defined by macro MAX_RCU_LVLS. However, that array is never accessed beyond RCU_NUM_LVLS index. Therefore, we can limit the array to RCU_NUM_LVLS items and eliminate MAX_RCU_LVLS. As result, in most cases the memory is conserved. Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-16rcu: Simplify rcu_init_geometry() capacity arithmeticsAlexander Gordeev1-10/+8
Current code suggests that introducing the extra level to rcu_capacity[] array makes some of the arithmetic easier. Well, in fact it appears rather confusing and unnecessary. Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-16rcu: Cleanup rcu_init_geometry() code and arithmeticsAlexander Gordeev1-14/+10
This update simplifies rcu_init_geometry() code flow and makes calculation of the total number of rcu_node structures more easy to read. The update relies on the fact num_rcu_lvl[] is never accessed beyond rcu_num_lvls index by the rest of the code. Therefore, there is no need initialize the whole num_rcu_lvl[]. Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-16rcu: Remove superfluous local variable in rcu_init_geometry()Alexander Gordeev1-7/+7
Local variable 'n' mimics 'nr_cpu_ids' while the both are used within one function. There is no reason for 'n' to exist whatsoever. Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-16rcu: Panic if RCU tree can not accommodate all CPUsAlexander Gordeev1-12/+17
Currently a condition when RCU tree is unable to accommodate the configured number of CPUs is not permitted and causes a fall back to compile-time values. However, the code has no means to exceed the RCU tree capacity neither at compile-time nor in run-time. Therefore, if the condition is met in run- time then it indicates a serios problem elsewhere and should be handled with a panic. Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-16rcu: Provide more diagnostics for stalled GP kthreadPaul E. McKenney1-2/+8
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-06rcu: Drop RCU_USER_QS in favor of NO_HZ_FULLPaul E. McKenney1-4/+4
The RCU_USER_QS Kconfig parameter is now just a synonym for NO_HZ_FULL, so this commit eliminates RCU_USER_QS, replacing all uses with NO_HZ_FULL. Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
2015-06-27Merge tag 'trace-v4.2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "This patch series contains several clean ups and even a new trace clock "monitonic raw". Also some enhancements to make the ring buffer even faster. But the biggest and most noticeable change is the renaming of the ftrace* files, structures and variables that have to deal with trace events. Over the years I've had several developers tell me about their confusion with what ftrace is compared to events. Technically, "ftrace" is the infrastructure to do the function hooks, which include tracing and also helps with live kernel patching. But the trace events are a separate entity altogether, and the files that affect the trace events should not be named "ftrace". These include: include/trace/ftrace.h -> include/trace/trace_events.h include/linux/ftrace_event.h -> include/linux/trace_events.h Also, functions that are specific for trace events have also been renamed: ftrace_print_*() -> trace_print_*() (un)register_ftrace_event() -> (un)register_trace_event() ftrace_event_name() -> trace_event_name() ftrace_trigger_soft_disabled() -> trace_trigger_soft_disabled() ftrace_define_fields_##call() -> trace_define_fields_##call() ftrace_get_offsets_##call() -> trace_get_offsets_##call() Structures have been renamed: ftrace_event_file -> trace_event_file ftrace_event_{call,class} -> trace_event_{call,class} ftrace_event_buffer -> trace_event_buffer ftrace_subsystem_dir -> trace_subsystem_dir ftrace_event_raw_##call -> trace_event_raw_##call ftrace_event_data_offset_##call-> trace_event_data_offset_##call ftrace_event_type_funcs_##call -> trace_event_type_funcs_##call And a few various variables and flags have also been updated. This has been sitting in linux-next for some time, and I have not heard a single complaint about this rename breaking anything. Mostly because these functions, variables and structures are mostly internal to the tracing system and are seldom (if ever) used by anything external to that" * tag 'trace-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits) ring_buffer: Allow to exit the ring buffer benchmark immediately ring-buffer-benchmark: Fix the wrong type ring-buffer-benchmark: Fix the wrong param in module_param ring-buffer: Add enum names for the context levels ring-buffer: Remove useless unused tracing_off_permanent() ring-buffer: Give NMIs a chance to lock the reader_lock ring-buffer: Add trace_recursive checks to ring_buffer_write() ring-buffer: Allways do the trace_recursive checks ring-buffer: Move recursive check to per_cpu descriptor ring-buffer: Add unlikelys to make fast path the default tracing: Rename ftrace_get_offsets_##call() to trace_event_get_offsets_##call() tracing: Rename ftrace_define_fields_##call() to trace_event_define_fields_##call() tracing: Rename ftrace_event_type_funcs_##call to trace_event_type_funcs_##call tracing: Rename ftrace_data_offset_##call to trace_event_data_offset_##call tracing: Rename ftrace_raw_##call event structures to trace_event_raw_##call tracing: Rename ftrace_trigger_soft_disabled() to trace_trigger_soft_disabled() tracing: Rename FTRACE_EVENT_FL_* flags to EVENT_FILE_FL_* tracing: Rename struct ftrace_subsystem_dir to trace_subsystem_dir tracing: Rename ftrace_event_name() to trace_event_name() tracing: Rename FTRACE_MAX_EVENT to TRACE_EVENT_TYPE_MAX ...
2015-05-27Merge branches 'array.2015.05.27a', 'doc.2015.05.27a', 'fixes.2015.05.27a', ↵Paul E. McKenney1-61/+120
'hotplug.2015.05.27a', 'init.2015.05.27a', 'tiny.2015.05.27a' and 'torture.2015.05.27a' into HEAD array.2015.05.27a: Remove all uses of RCU-protected array indexes. doc.2015.05.27a: Docuemntation updates. fixes.2015.05.27a: Miscellaneous fixes. hotplug.2015.05.27a: CPU-hotplug updates. init.2015.05.27a: Initialization/Kconfig updates. tiny.2015.05.27a: Updates to Tiny RCU. torture.2015.05.27a: Torture-testing updates.
2015-05-27rcu: Conditionally compile RCU's eqs warningsPaul E. McKenney1-8/+15
This commit applies some warning-omission micro-optimizations to RCU's various extended-quiescent-state functions, which are on the kernel/user hotpath for CONFIG_NO_HZ_FULL=y. Reported-by: Rik van Riel <riel@redhat.com> Reported by: Mike Galbraith <umgwanakikbuti@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-05-27rcu: Make RCU able to tolerate undefined CONFIG_RCU_KTHREAD_PRIOPaul E. McKenney1-0/+4
This commit updates the initialization of the kthread_prio boot parameter so that RCU will build even when CONFIG_RCU_KTHREAD_PRIO is undefined. The kthread_prio boot parameter is set to CONFIG_RCU_KTHREAD_PRIO if that is defined, otherwise to 1 if CONFIG_RCU_BOOST is defined and to zero otherwise. This commit then makes CONFIG_RCU_KTHREAD_PRIO depend on CONFIG_RCU_EXPERT, so that Kconfig users won't be asked about CONFIG_RCU_KTHREAD_PRIO unless they want to be. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2015-05-27rcu: Make RCU able to tolerate undefined CONFIG_RCU_FANOUT_LEAFPaul E. McKenney1-4/+4
This commit introduces an RCU_FANOUT_LEAF C-preprocessor macro so that RCU will build even when CONFIG_RCU_FANOUT_LEAF is undefined. The RCU_FANOUT_LEAF macro is set to the value of CONFIG_RCU_FANOUT_LEAF when defined, otherwise it is set to 32 for 32-bit systems and 64 for 64-bit systems. This commit then makes CONFIG_RCU_FANOUT_LEAF depend on CONFIG_RCU_EXPERT, so that Kconfig users won't be asked about CONFIG_RCU_FANOUT_LEAF unless they want to be. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2015-05-27rcu: Make RCU able to tolerate undefined CONFIG_RCU_FANOUTPaul E. McKenney1-2/+2
This commit introduces an RCU_FANOUT C-preprocessor macro so that RCU will build even when CONFIG_RCU_FANOUT is undefined. The RCU_FANOUT macro is set to the value of CONFIG_RCU_FANOUT when defined, otherwise it is set to 32 for 32-bit systems and 64 for 64-bit systems. This commit then makes CONFIG_RCU_FANOUT depend on CONFIG_RCU_EXPERT, so that Kconfig users won't be asked about CONFIG_RCU_FANOUT unless they want to be. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2015-05-27rcu: Enable diagnostic dump of rcu_node combining treePaul E. McKenney1-0/+27
The purpose of this commit is to make it easier to verify that RCU's combining tree is set up correctly, which is useful to have when making changes in how that tree is initialized. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com> [ paulmck: Fold fix found by Fengguang's 0-day test robot. ]
2015-05-27rcu: Convert CONFIG_RCU_FANOUT_EXACT to boot parameterPaul E. McKenney1-2/+5
The CONFIG_RCU_FANOUT_EXACT Kconfig parameter is used primarily (and perhaps only) by rcutorture to verify that RCU works correctly in specific rcu_node combining-tree configurations. It therefore does not make much sense have this as a question to people attempting to configure their kernels. So this commit creates an rcutree.rcu_fanout_exact= boot parameter that rcutorture can use, and eliminates the original CONFIG_RCU_FANOUT_EXACT Kconfig parameter. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>