<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/md/bcache/writeback.c, branch linux-5.11.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-5.11.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-5.11.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2020-12-07T20:25:16+00:00</updated>
<entry>
<title>bcache: fix race between setting bdev state to none and new write request direct to backing</title>
<updated>2020-12-07T20:25:16+00:00</updated>
<author>
<name>Dongsheng Yang</name>
<email>dongsheng.yang@easystack.cn</email>
</author>
<published>2020-12-07T16:39:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=df4ad53242158f9f1f97daf4feddbb4f8b77f080'/>
<id>urn:sha1:df4ad53242158f9f1f97daf4feddbb4f8b77f080</id>
<content type='text'>
There is a race condition in detaching as below:
A. detaching			B. Write request
(1) writing back
(2) write back done, set bdev
    state to clean.
(3) cached_dev_put() and
    schedule_work(&amp;dc-&gt;detach);
				(4) write data [0 - 4K] directly
				    into backing and ack to user.
(5) power-failure...

When we restart this bcache device, this bdev is clean but not detached,
and read [0 - 4K], we will get unexpected old data from cache device.

To fix this problem, set the bdev state to none when we writeback done
in detaching, and then if power-failure happened as above, the data in
cache will not be used in next bcache device starting, it's detached, we
will read the correct data from backing derectly.

Signed-off-by: Dongsheng Yang &lt;dongsheng.yang@easystack.cn&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>bcache: remove embedded struct cache_sb from struct cache_set</title>
<updated>2020-10-02T20:25:30+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2020-10-01T06:50:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4a784266c6a75f375e08915b35e909df19eff17f'/>
<id>urn:sha1:4a784266c6a75f375e08915b35e909df19eff17f</id>
<content type='text'>
Since bcache code was merged into mainline kerrnel, each cache set only
as one single cache in it. The multiple caches framework is here but the
code is far from completed. Considering the multiple copies of cached
data can also be stored on e.g. md raid1 devices, it is unnecessary to
support multiple caches in one cache set indeed.

The previous preparation patches fix the dependencies of explicitly
making a cache set only have single cache. Now we don't have to maintain
an embedded partial super block in struct cache_set, the in-memory super
block can be directly referenced from struct cache.

This patch removes the embedded struct cache_sb from struct cache_set,
and fixes all locations where the superb lock was referenced from this
removed super block by referencing the in-memory super block of struct
cache.

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>bcache: fix overflow in offset_to_stripe()</title>
<updated>2020-07-25T13:38:20+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2020-07-25T12:00:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7a1481267999c02abf4a624515c1b5c7c1fccbd6'/>
<id>urn:sha1:7a1481267999c02abf4a624515c1b5c7c1fccbd6</id>
<content type='text'>
offset_to_stripe() returns the stripe number (in type unsigned int) from
an offset (in type uint64_t) by the following calculation,
	do_div(offset, d-&gt;stripe_size);
For large capacity backing device (e.g. 18TB) with small stripe size
(e.g. 4KB), the result is 4831838208 and exceeds UINT_MAX. The actual
returned value which caller receives is 536870912, due to the overflow.

Indeed in bcache_device_init(), bcache_device-&gt;nr_stripes is limited in
range [1, INT_MAX]. Therefore all valid stripe numbers in bcache are
in range [0, bcache_dev-&gt;nr_stripes - 1].

This patch adds a upper limition check in offset_to_stripe(): the max
valid stripe number should be less than bcache_device-&gt;nr_stripes. If
the calculated stripe number from do_div() is equal to or larger than
bcache_device-&gt;nr_stripe, -EINVAL will be returned. (Normally nr_stripes
is less than INT_MAX, exceeding upper limitation doesn't mean overflow,
therefore -EOVERFLOW is not used as error code.)

This patch also changes nr_stripes' type of struct bcache_device from
'unsigned int' to 'int', and return value type of offset_to_stripe()
from 'unsigned int' to 'int', to match their exact data ranges.

All locations where bcache_device-&gt;nr_stripes and offset_to_stripe() are
referenced also get updated for the above type change.

Reported-and-tested-by: Ken Raeburn &lt;raeburn@redhat.com&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Cc: stable@vger.kernel.org
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1783075
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>bcache: Use struct_size() in kzalloc()</title>
<updated>2020-07-25T13:38:20+00:00</updated>
<author>
<name>Gustavo A. R. Silva</name>
<email>gustavoars@kernel.org</email>
</author>
<published>2020-07-25T12:00:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=29f1d5caced9c2db96f169422bade787ff45d584'/>
<id>urn:sha1:29f1d5caced9c2db96f169422bade787ff45d584</id>
<content type='text'>
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.

This code was detected with the help of Coccinelle and, audited and
fixed manually.

Signed-off-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>bcache: writeback: Remove unneeded variable i</title>
<updated>2020-07-25T13:38:20+00:00</updated>
<author>
<name>Xu Wang</name>
<email>vulab@iscas.ac.cn</email>
</author>
<published>2020-07-25T12:00:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7236657c6b3abf25f8b0a2465cd8f51728d0196b'/>
<id>urn:sha1:7236657c6b3abf25f8b0a2465cd8f51728d0196b</id>
<content type='text'>
Remove unneeded variable i in bch_dirty_init_thread().

Signed-off-by: Xu Wang &lt;vulab@iscas.ac.cn&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>bcache: Convert pr_&lt;level&gt; uses to a more typical style</title>
<updated>2020-05-27T11:19:36+00:00</updated>
<author>
<name>Joe Perches</name>
<email>joe@perches.com</email>
</author>
<published>2020-05-27T04:01:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=46f5aa8806e34f2e48de852cc7db2c74c3a5cd8d'/>
<id>urn:sha1:46f5aa8806e34f2e48de852cc7db2c74c3a5cd8d</id>
<content type='text'>
Remove the trailing newline from the define of pr_fmt and add newlines
to the uses.

Miscellanea:

o Convert bch_bkey_dump from multiple uses of pr_err to pr_cont
  as the earlier conversion was inappropriate done causing multiple
  lines to be emitted where only a single output line was desired
o Use vsprintf extension %pV in bch_cache_set_error to avoid multiple
  line output where only a single line output was desired
o Coalesce formats

Fixes: 6ae63e3501c4 ("bcache: replace printk() by pr_*() routines")

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>bcache: optimize barrier usage for atomic operations</title>
<updated>2020-03-22T16:06:57+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2020-03-22T06:03:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=eb9b6666d6ca6f3d9f218fa23ec6135eee1ac3a7'/>
<id>urn:sha1:eb9b6666d6ca6f3d9f218fa23ec6135eee1ac3a7</id>
<content type='text'>
The idea of this patch is from Davidlohr Bueso, he posts a patch
for bcache to optimize barrier usage for read-modify-write atomic
bitops. Indeed such optimization can also apply on other locations
where smp_mb() is used before or after an atomic operation.

This patch replaces smp_mb() with smp_mb__before_atomic() or
smp_mb__after_atomic() in btree.c and writeback.c,  where it is used
to synchronize memory cache just earlier on other cores. Although
the locations are not on hot code path, it is always not bad to mkae
things a little better.

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>bcache: optimize barrier usage for Rmw atomic bitops</title>
<updated>2020-03-22T16:06:57+00:00</updated>
<author>
<name>Davidlohr Bueso</name>
<email>dave@stgolabs.net</email>
</author>
<published>2020-03-22T06:03:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b004aa867c48b3232835b61ed9d44b572e29498e'/>
<id>urn:sha1:b004aa867c48b3232835b61ed9d44b572e29498e</id>
<content type='text'>
We can avoid the unnecessary barrier on non LL/SC architectures,
such as x86. Instead, use the smp_mb__after_atomic().

Signed-off-by: Davidlohr Bueso &lt;dbueso@suse.de&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>bcache: make bch_sectors_dirty_init() to be multithreaded</title>
<updated>2020-03-22T16:06:57+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2020-03-22T06:03:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b144e45fc57649e15cbc79ff2d32a942af1d91d5'/>
<id>urn:sha1:b144e45fc57649e15cbc79ff2d32a942af1d91d5</id>
<content type='text'>
When attaching a cached device (a.k.a backing device) to a cache
device, bch_sectors_dirty_init() is called to count dirty sectors
and stripes (see what bcache_dev_sectors_dirty_add() does) on the
cache device.

The counting is done by a single thread recursive function
bch_btree_map_keys() to iterate all the bcache btree nodes.
If the btree has huge number of nodes, bch_sectors_dirty_init() will
take quite long time. In my testing, if the registering cache set has
a existed UUID which matches a already registered cached device, the
automatical attachment during the registration may take more than
55 minutes. This is too long for waiting the bcache to work in real
deployment.

Fortunately when bch_sectors_dirty_init() is called, no other thread
will access the btree yet, it is safe to do a read-only parallelized
dirty sectors counting by multiple threads.

This patch tries to create multiple threads, and each thread tries to
one-by-one count dirty sectors from the sub-tree indexed by a root
node key which the thread fetched. After the sub-tree is counted, the
counting thread will continue to fetch another root node key, until
the fetched key is NULL. How many threads in parallel depends on
the number of keys from the btree root node, and the number of online
CPU core. The thread number will be the less number but no more than
BCH_DIRTY_INIT_THRD_MAX. If there are only 2 keys in root node, it
can only be 2x times faster by this patch. But if there are 10 keys
in the root node, with this patch it can be 10x times faster.

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>bcache: add idle_max_writeback_rate sysfs interface</title>
<updated>2019-11-13T22:42:50+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2019-11-13T08:03:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c5fcdedcee4e6ae15c0eb5e0fbe25467e57d2963'/>
<id>urn:sha1:c5fcdedcee4e6ae15c0eb5e0fbe25467e57d2963</id>
<content type='text'>
For writeback mode, if there is no regular I/O request for a while,
the writeback rate will be set to the maximum value (1TB/s for now).
This is good for most of the storage workload, but there are still
people don't what the maximum writeback rate in I/O idle time.

This patch adds a sysfs interface file idle_max_writeback_rate to
permit people to disable maximum writeback rate. Then the minimum
writeback rate can be advised by writeback_rate_minimum in the
bcache device's sysfs interface.

Reported-by: Christian Balzer &lt;chibi@gol.com&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
