diff options
author | Thomas Gleixner <tglx@linutronix.de> | 2017-07-11 01:50:09 +0300 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2017-07-11 02:32:33 +0300 |
commit | 3f906ba23689a3f824424c50f3ae937c2c70f676 (patch) | |
tree | dbf8731ebb49d4cca6dd227e6ff5aaf828f6abf0 /mm/page_alloc.c | |
parent | a47fed5b5b014f5a13878b90ef2c3a7dc294189f (diff) | |
download | linux-3f906ba23689a3f824424c50f3ae937c2c70f676.tar.xz |
mm/memory-hotplug: switch locking to a percpu rwsem
Andrey reported a potential deadlock with the memory hotplug lock and
the cpu hotplug lock.
The reason is that memory hotplug takes the memory hotplug lock and then
calls stop_machine() which calls get_online_cpus(). That's the reverse
lock order to get_online_cpus(); get_online_mems(); in mm/slub_common.c
The problem has been there forever. The reason why this was never
reported is that the cpu hotplug locking had this homebrewn recursive
reader writer semaphore construct which due to the recursion evaded the
full lock dep coverage. The memory hotplug code copied that construct
verbatim and therefor has similar issues.
Three steps to fix this:
1) Convert the memory hotplug locking to a per cpu rwsem so the
potential issues get reported proper by lockdep.
2) Lock the online cpus in mem_hotplug_begin() before taking the memory
hotplug rwsem and use stop_machine_cpuslocked() in the page_alloc
code to avoid recursive locking.
3) The cpu hotpluck locking in #2 causes a recursive locking of the cpu
hotplug lock via __offline_pages() -> lru_add_drain_all(). Solve this
by invoking lru_add_drain_all_cpuslocked() instead.
Link: http://lkml.kernel.org/r/20170704093421.506836322@linutronix.de
Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm/page_alloc.c')
-rw-r--r-- | mm/page_alloc.c | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/mm/page_alloc.c b/mm/page_alloc.c index d90c31951b90..64b7d82a9b1a 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5278,7 +5278,7 @@ void __ref build_all_zonelists(pg_data_t *pgdat, struct zone *zone) #endif /* we have to stop all cpus to guarantee there is no user of zonelist */ - stop_machine(__build_all_zonelists, pgdat, NULL); + stop_machine_cpuslocked(__build_all_zonelists, pgdat, NULL); /* cpuset refresh routine should be here */ } vm_total_pages = nr_free_pagecache_pages(); |