block: don't release bdi while request_queue has live references

bdi's are initialized in two steps, bdi_init() and bdi_register(), but destroyed in a single step by bdi_destroy() which, for a bdi embedded in a request_queue, is called during blk_cleanup_queue() which makes the queue invisible and starts the draining of remaining usages. A request_queue's user can access the congestion state of the embedded bdi as long as it holds a reference to the queue. As such, it may access the congested state of a queue which finished blk_cleanup_queue() but hasn't reached blk_release_queue() yet. Because the congested state was embedded in backing_dev_info which in turn is embedded in request_queue, accessing the congested state after bdi_destroy() was called was fine. The bdi was destroyed but the memory region for the congested state remained accessible till the queue got released. a13f35e87140 ("writeback: don't embed root bdi_writeback_congested in bdi_writeback") changed the situation. Now, the root congested state which is expected to be pinned while request_queue remains accessible is separately reference counted and the base ref is put during bdi_destroy(). This means that the root congested state may go away prematurely while the queue is between bdi_dstroy() and blk_cleanup_queue(), which was detected by Andrey's KASAN tests. The root cause of this problem is that bdi doesn't distinguish the two steps of destruction, unregistration and release, and now the root congested state actually requires a separate release step. To fix the issue, this patch separates out bdi_unregister() and bdi_exit() from bdi_destroy(). bdi_unregister() is called from blk_cleanup_queue() and bdi_exit() from blk_release_queue(). bdi_destroy() is now just a simple wrapper calling the two steps back-to-back. While at it, the prototype of bdi_destroy() is moved right below bdi_setup_and_register() so that the counterpart operations are located together. Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: a13f35e87140 ("writeback: don't embed root bdi_writeback_congested in bdi_writeback") Cc: stable@vger.kernel.org # v4.2+ Reported-and-tested-by: Andrey Konovalov <andreyknvl@google.com> Link: http://lkml.kernel.org/g/CAAeHK+zUJ74Zn17=rOyxacHU18SgCfC6bsYW=6kCY5GXJBwGfQ@mail.gmail.com Reviewed-by: Jan Kara <jack@suse.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
author: Tejun Heo <tj@kernel.org> 2015-09-08 19:20:22 +0300
committer: Jens Axboe <axboe@fb.com> 2015-10-15 18:53:28 +0300
commit: b02176f30cd30acccd3b633ab7d9aed8b5da52ff (patch)
tree: 96861c4612e0aae3b020c2293d05b7afb67f2557 /block/blk-sysfs.c
parent: 81c04b943872e0332872df18cec1dec89b178b4d (diff)
download: linux-b02176f30cd30acccd3b633ab7d9aed8b5da52ff.tar.xz
1 files changed, 1 insertions, 0 deletions
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 3e44a9da2a13..07b42f5ad797 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -540,6 +540,7 @@ static void blk_release_queue(struct kobject *kobj)
 	struct request_queue *q =
 		container_of(kobj, struct request_queue, kobj);
 
+	bdi_exit(&q->backing_dev_info);
 	blkcg_exit_queue(q);
 
 	if (q->elevator) {
author	Tejun Heo <tj@kernel.org>	2015-09-08 19:20:22 +0300
committer	Jens Axboe <axboe@fb.com>	2015-10-15 18:53:28 +0300
commit	b02176f30cd30acccd3b633ab7d9aed8b5da52ff (patch)
tree	96861c4612e0aae3b020c2293d05b7afb67f2557 /block/blk-sysfs.c
parent	81c04b943872e0332872df18cec1dec89b178b4d (diff)
download	linux-b02176f30cd30acccd3b633ab7d9aed8b5da52ff.tar.xz