diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2020-06-01 22:56:29 +0300 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2020-06-01 22:56:29 +0300 |
commit | 2227e5b21aec6c5f7f6491352f0c19fd02d19418 (patch) | |
tree | 098e17fe2c1ec5fc753c9ae7983d2f8466914c0e /Documentation | |
parent | 0bd957eb11cfeef23fcc240edde6dfe431731e69 (diff) | |
parent | cb3cb6733fbd8fd8d2c716095fdca42dadba2063 (diff) | |
download | linux-2227e5b21aec6c5f7f6491352f0c19fd02d19418.tar.xz |
Merge tag 'core-rcu-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
"The RCU updates for this cycle were:
- RCU-tasks update, including addition of RCU Tasks Trace for BPF use
and TASKS_RUDE_RCU
- kfree_rcu() updates.
- Remove scheduler locking restriction
- RCU CPU stall warning updates.
- Torture-test updates.
- Miscellaneous fixes and other updates"
* tag 'core-rcu-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits)
rcu: Allow for smp_call_function() running callbacks from idle
rcu: Provide rcu_irq_exit_check_preempt()
rcu: Abstract out rcu_irq_enter_check_tick() from rcu_nmi_enter()
rcu: Provide __rcu_is_watching()
rcu: Provide rcu_irq_exit_preempt()
rcu: Make RCU IRQ enter/exit functions rely on in_nmi()
rcu/tree: Mark the idle relevant functions noinstr
x86: Replace ist_enter() with nmi_enter()
x86/mce: Send #MC singal from task work
x86/entry: Get rid of ist_begin/end_non_atomic()
sched,rcu,tracing: Avoid tracing before in_nmi() is correct
sh/ftrace: Move arch_ftrace_nmi_{enter,exit} into nmi exception
lockdep: Always inline lockdep_{off,on}()
hardirq/nmi: Allow nested nmi_enter()
arm64: Prepare arch_nmi_enter() for recursion
printk: Disallow instrumenting print_nmi_enter()
printk: Prepare for nested printk_nmi_enter()
rcutorture: Convert ULONG_CMP_LT() to time_before()
torture: Add a --kasan argument
torture: Save a few lines by using config_override_param initially
...
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/RCU/Design/Requirements/Requirements.rst | 61 | ||||
-rw-r--r-- | Documentation/admin-guide/kernel-parameters.txt | 19 | ||||
-rw-r--r-- | Documentation/trace/ftrace-design.rst | 8 |
3 files changed, 35 insertions, 53 deletions
diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst index fd5e2cbc4935..75b8ca007a11 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.rst +++ b/Documentation/RCU/Design/Requirements/Requirements.rst @@ -1943,56 +1943,27 @@ invoked from a CPU-hotplug notifier. Scheduler and RCU ~~~~~~~~~~~~~~~~~ -RCU depends on the scheduler, and the scheduler uses RCU to protect some -of its data structures. The preemptible-RCU ``rcu_read_unlock()`` -implementation must therefore be written carefully to avoid deadlocks -involving the scheduler's runqueue and priority-inheritance locks. In -particular, ``rcu_read_unlock()`` must tolerate an interrupt where the -interrupt handler invokes both ``rcu_read_lock()`` and -``rcu_read_unlock()``. This possibility requires ``rcu_read_unlock()`` -to use negative nesting levels to avoid destructive recursion via -interrupt handler's use of RCU. - -This scheduler-RCU requirement came as a `complete -surprise <https://lwn.net/Articles/453002/>`__. - -As noted above, RCU makes use of kthreads, and it is necessary to avoid -excessive CPU-time accumulation by these kthreads. This requirement was -no surprise, but RCU's violation of it when running context-switch-heavy -workloads when built with ``CONFIG_NO_HZ_FULL=y`` `did come as a -surprise +RCU makes use of kthreads, and it is necessary to avoid excessive CPU-time +accumulation by these kthreads. This requirement was no surprise, but +RCU's violation of it when running context-switch-heavy workloads when +built with ``CONFIG_NO_HZ_FULL=y`` `did come as a surprise [PDF] <http://www.rdrop.com/users/paulmck/scalability/paper/BareMetal.2015.01.15b.pdf>`__. RCU has made good progress towards meeting this requirement, even for context-switch-heavy ``CONFIG_NO_HZ_FULL=y`` workloads, but there is room for further improvement. -It is forbidden to hold any of scheduler's runqueue or -priority-inheritance spinlocks across an ``rcu_read_unlock()`` unless -interrupts have been disabled across the entire RCU read-side critical -section, that is, up to and including the matching ``rcu_read_lock()``. -Violating this restriction can result in deadlocks involving these -scheduler spinlocks. There was hope that this restriction might be -lifted when interrupt-disabled calls to ``rcu_read_unlock()`` started -deferring the reporting of the resulting RCU-preempt quiescent state -until the end of the corresponding interrupts-disabled region. -Unfortunately, timely reporting of the corresponding quiescent state to -expedited grace periods requires a call to ``raise_softirq()``, which -can acquire these scheduler spinlocks. In addition, real-time systems -using RCU priority boosting need this restriction to remain in effect -because deferred quiescent-state reporting would also defer deboosting, -which in turn would degrade real-time latencies. - -In theory, if a given RCU read-side critical section could be guaranteed -to be less than one second in duration, holding a scheduler spinlock -across that critical section's ``rcu_read_unlock()`` would require only -that preemption be disabled across the entire RCU read-side critical -section, not interrupts. Unfortunately, given the possibility of vCPU -preemption, long-running interrupts, and so on, it is not possible in -practice to guarantee that a given RCU read-side critical section will -complete in less than one second. Therefore, as noted above, if -scheduler spinlocks are held across a given call to -``rcu_read_unlock()``, interrupts must be disabled across the entire RCU -read-side critical section. +There is no longer any prohibition against holding any of +scheduler's runqueue or priority-inheritance spinlocks across an +``rcu_read_unlock()``, even if interrupts and preemption were enabled +somewhere within the corresponding RCU read-side critical section. +Therefore, it is now perfectly legal to execute ``rcu_read_lock()`` +with preemption enabled, acquire one of the scheduler locks, and hold +that lock across the matching ``rcu_read_unlock()``. + +Similarly, the RCU flavor consolidation has removed the need for negative +nesting. The fact that interrupt-disabled regions of code act as RCU +read-side critical sections implicitly avoids earlier issues that used +to result in destructive recursion via interrupt handler's use of RCU. Tracing and RCU ~~~~~~~~~~~~~~~ diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 7bc83f3d9bdf..46df119c931a 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -4210,12 +4210,24 @@ Duration of CPU stall (s) to test RCU CPU stall warnings, zero to disable. + rcutorture.stall_cpu_block= [KNL] + Sleep while stalling if set. This will result + in warnings from preemptible RCU in addition + to any other stall-related activity. + rcutorture.stall_cpu_holdoff= [KNL] Time to wait (s) after boot before inducing stall. rcutorture.stall_cpu_irqsoff= [KNL] Disable interrupts while stalling if set. + rcutorture.stall_gp_kthread= [KNL] + Duration (s) of forced sleep within RCU + grace-period kthread to test RCU CPU stall + warnings, zero to disable. If both stall_cpu + and stall_gp_kthread are specified, the + kthread is starved first, then the CPU. + rcutorture.stat_interval= [KNL] Time (s) between statistics printk()s. @@ -4286,6 +4298,13 @@ only normal grace-period primitives. No effect on CONFIG_TINY_RCU kernels. + rcupdate.rcu_task_ipi_delay= [KNL] + Set time in jiffies during which RCU tasks will + avoid sending IPIs, starting with the beginning + of a given grace period. Setting a large + number avoids disturbing real-time workloads, + but lengthens grace periods. + rcupdate.rcu_task_stall_timeout= [KNL] Set timeout in jiffies for RCU task stall warning messages. Disable with a value less than or equal diff --git a/Documentation/trace/ftrace-design.rst b/Documentation/trace/ftrace-design.rst index a8e22e0db63c..6893399157f0 100644 --- a/Documentation/trace/ftrace-design.rst +++ b/Documentation/trace/ftrace-design.rst @@ -229,14 +229,6 @@ Adding support for it is easy: just define the macro in asm/ftrace.h and pass the return address pointer as the 'retp' argument to ftrace_push_return_trace(). -HAVE_FTRACE_NMI_ENTER ---------------------- - -If you can't trace NMI functions, then skip this option. - -<details to be filled> - - HAVE_SYSCALL_TRACEPOINTS ------------------------ |