diff options
author | Paul E. McKenney <paulmck@linux.vnet.ibm.com> | 2012-10-07 19:36:12 +0400 |
---|---|---|
committer | Paul E. McKenney <paulmck@linux.vnet.ibm.com> | 2012-10-08 20:06:38 +0400 |
commit | a4fbe35a124526e6759be07bd9c7ea796ba1e00d (patch) | |
tree | cb5c5a1608fcff588ed9a204ea67d5891adb18fb /kernel/rcutree.h | |
parent | cb349ca95407cbc11424d5e9fc7c8e700709041b (diff) | |
download | linux-a4fbe35a124526e6759be07bd9c7ea796ba1e00d.tar.xz |
rcu: Grace-period initialization excludes only RCU notifier
Kirill noted the following deadlock cycle on shutdown involving padata:
> With commit 755609a9087fa983f567dc5452b2fa7b089b591f I've got deadlock on
> poweroff.
>
> It guess it happens because of race for cpu_hotplug.lock:
>
> CPU A CPU B
> disable_nonboot_cpus()
> _cpu_down()
> cpu_hotplug_begin()
> mutex_lock(&cpu_hotplug.lock);
> __cpu_notify()
> padata_cpu_callback()
> __padata_remove_cpu()
> padata_replace()
> synchronize_rcu()
> rcu_gp_kthread()
> get_online_cpus();
> mutex_lock(&cpu_hotplug.lock);
It would of course be good to eliminate grace-period delays from
CPU-hotplug notifiers, but that is a separate issue. Deadlock is
not an appropriate diagnostic for excessive CPU-hotplug latency.
Fortunately, grace-period initialization does not actually need to
exclude all of the CPU-hotplug operation, but rather only RCU's own
CPU_UP_PREPARE and CPU_DEAD CPU-hotplug notifiers. This commit therefore
introduces a new per-rcu_state onoff_mutex that provides the required
concurrency control in place of the get_online_cpus() that was previously
in rcu_gp_init().
Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Kirill A. Shutemov <kirill@shutemov.name>
Diffstat (limited to 'kernel/rcutree.h')
-rw-r--r-- | kernel/rcutree.h | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/kernel/rcutree.h b/kernel/rcutree.h index 5faf05d68326..a240f032848e 100644 --- a/kernel/rcutree.h +++ b/kernel/rcutree.h @@ -394,11 +394,17 @@ struct rcu_state { struct rcu_head **orphan_donetail; /* Tail of above. */ long qlen_lazy; /* Number of lazy callbacks. */ long qlen; /* Total number of callbacks. */ + /* End of fields guarded by onofflock. */ + + struct mutex onoff_mutex; /* Coordinate hotplug & GPs. */ + struct mutex barrier_mutex; /* Guards barrier fields. */ atomic_t barrier_cpu_count; /* # CPUs waiting on. */ struct completion barrier_completion; /* Wake at barrier end. */ unsigned long n_barrier_done; /* ++ at start and end of */ /* _rcu_barrier(). */ + /* End of fields guarded by barrier_mutex. */ + unsigned long jiffies_force_qs; /* Time at which to invoke */ /* force_quiescent_state(). */ unsigned long n_force_qs; /* Number of calls to */ |