diff options
author | Coly Li <colyli@suse.de> | 2022-05-28 15:45:50 +0300 |
---|---|---|
committer | Jens Axboe <axboe@kernel.dk> | 2022-05-28 15:48:26 +0300 |
commit | a1a2d8f0162b27e85e7ce0ae6a35c96a490e0559 (patch) | |
tree | 5eac047e8f21b317acddfd009b7ad4577520a87c | |
parent | 1243172d5894e2d8f277ee3c278180792de5c521 (diff) | |
download | linux-a1a2d8f0162b27e85e7ce0ae6a35c96a490e0559.tar.xz |
bcache: avoid unnecessary soft lockup in kworker update_writeback_rate()
The kworker routine update_writeback_rate() is schedued to update the
writeback rate in every 5 seconds by default. Before calling
__update_writeback_rate() to do real job, semaphore dc->writeback_lock
should be held by the kworker routine.
At the same time, bcache writeback thread routine bch_writeback_thread()
also needs to hold dc->writeback_lock before flushing dirty data back
into the backing device. If the dirty data set is large, it might be
very long time for bch_writeback_thread() to scan all dirty buckets and
releases dc->writeback_lock. In such case update_writeback_rate() can be
starved for long enough time so that kernel reports a soft lockup warn-
ing started like:
watchdog: BUG: soft lockup - CPU#246 stuck for 23s! [kworker/246:31:179713]
Such soft lockup condition is unnecessary, because after the writeback
thread finishes its job and releases dc->writeback_lock, the kworker
update_writeback_rate() may continue to work and everything is fine
indeed.
This patch avoids the unnecessary soft lockup by the following method,
- Add new member to struct cached_dev
- dc->rate_update_retry (0 by default)
- In update_writeback_rate() call down_read_trylock(&dc->writeback_lock)
firstly, if it fails then lock contention happens.
- If dc->rate_update_retry <= BCH_WBRATE_UPDATE_MAX_SKIPS (15), doesn't
acquire the lock and reschedules the kworker for next try.
- If dc->rate_update_retry > BCH_WBRATE_UPDATE_MAX_SKIPS, no retry
anymore and call down_read(&dc->writeback_lock) to wait for the lock.
By the above method, at worst case update_writeback_rate() may retry for
1+ minutes before blocking on dc->writeback_lock by calling down_read().
For a 4TB cache device with 1TB dirty data, 90%+ of the unnecessary soft
lockup warning message can be avoided.
When retrying to acquire dc->writeback_lock in update_writeback_rate(),
of course the writeback rate cannot be updated. It is fair, because when
the kworker is blocked on the lock contention of dc->writeback_lock, the
writeback rate cannot be updated neither.
This change follows Jens Axboe's suggestion to a more clear and simple
version.
Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20220528124550.32834-2-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-rw-r--r-- | drivers/md/bcache/bcache.h | 7 | ||||
-rw-r--r-- | drivers/md/bcache/writeback.c | 31 |
2 files changed, 28 insertions, 10 deletions
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h index 9ed9c955add7..2acda9cea0f9 100644 --- a/drivers/md/bcache/bcache.h +++ b/drivers/md/bcache/bcache.h @@ -395,6 +395,13 @@ struct cached_dev { atomic_t io_errors; unsigned int error_limit; unsigned int offline_seconds; + + /* + * Retry to update writeback_rate if contention happens for + * down_read(dc->writeback_lock) in update_writeback_rate() + */ +#define BCH_WBRATE_UPDATE_MAX_SKIPS 15 + unsigned int rate_update_retry; }; enum alloc_reserve { diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c index d138a2d73240..3f0ff3aab6f2 100644 --- a/drivers/md/bcache/writeback.c +++ b/drivers/md/bcache/writeback.c @@ -235,19 +235,27 @@ static void update_writeback_rate(struct work_struct *work) return; } - if (atomic_read(&dc->has_dirty) && dc->writeback_percent) { - /* - * If the whole cache set is idle, set_at_max_writeback_rate() - * will set writeback rate to a max number. Then it is - * unncessary to update writeback rate for an idle cache set - * in maximum writeback rate number(s). - */ - if (!set_at_max_writeback_rate(c, dc)) { - down_read(&dc->writeback_lock); + /* + * If the whole cache set is idle, set_at_max_writeback_rate() + * will set writeback rate to a max number. Then it is + * unncessary to update writeback rate for an idle cache set + * in maximum writeback rate number(s). + */ + if (atomic_read(&dc->has_dirty) && dc->writeback_percent && + !set_at_max_writeback_rate(c, dc)) { + do { + if (!down_read_trylock((&dc->writeback_lock))) { + dc->rate_update_retry++; + if (dc->rate_update_retry <= + BCH_WBRATE_UPDATE_MAX_SKIPS) + break; + down_read(&dc->writeback_lock); + dc->rate_update_retry = 0; + } __update_writeback_rate(dc); update_gc_after_writeback(c); up_read(&dc->writeback_lock); - } + } while (0); } @@ -1006,6 +1014,9 @@ void bch_cached_dev_writeback_init(struct cached_dev *dc) dc->writeback_rate_fp_term_high = 1000; dc->writeback_rate_i_term_inverse = 10000; + /* For dc->writeback_lock contention in update_writeback_rate() */ + dc->rate_update_retry = 0; + WARN_ON(test_and_clear_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags)); INIT_DELAYED_WORK(&dc->writeback_rate_update, update_writeback_rate); } |