<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/block/bio.c, branch v5.4.201</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.4.201</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.4.201'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2022-06-14T16:11:50+00:00</updated>
<entry>
<title>block: fix bio_clone_blkg_association() to associate with proper blkcg_gq</title>
<updated>2022-06-14T16:11:50+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2022-06-07T09:15:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5f62b21b7c93ad0c0b77585685ed906c919c437f'/>
<id>urn:sha1:5f62b21b7c93ad0c0b77585685ed906c919c437f</id>
<content type='text'>
commit 22b106e5355d6e7a9c3b5cb5ed4ef22ae585ea94 upstream.

Commit d92c370a16cb ("block: really clone the block cgroup in
bio_clone_blkg_association") changed bio_clone_blkg_association() to
just clone bio-&gt;bi_blkg reference from source to destination bio. This
is however wrong if the source and destination bios are against
different block devices because struct blkcg_gq is different for each
bdev-blkcg pair. This will result in IOs being accounted (and throttled
as a result) multiple times against the same device (src bdev) while
throttling of the other device (dst bdev) is ignored. In case of BFQ the
inconsistency can even result in crashes in bfq_bic_update_cgroup().
Fix the problem by looking up correct blkcg_gq for the cloned bio.

Reported-by: Logan Gunthorpe &lt;logang@deltatee.com&gt;
Reported-and-tested-by: Donald Buczek &lt;buczek@molgen.mpg.de&gt;
Fixes: d92c370a16cb ("block: really clone the block cgroup in bio_clone_blkg_association")
CC: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20220602081242.7731-1-jack@suse.cz
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>block-map: add __GFP_ZERO flag for alloc_page in function bio_copy_kern</title>
<updated>2022-05-12T10:23:48+00:00</updated>
<author>
<name>Haimin Zhang</name>
<email>tcs.kernel@gmail.com</email>
</author>
<published>2022-02-16T08:40:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c7337efd1d11acb6f84c68ffee57d3f312e87b24'/>
<id>urn:sha1:c7337efd1d11acb6f84c68ffee57d3f312e87b24</id>
<content type='text'>
commit cc8f7fe1f5eab010191aa4570f27641876fa1267 upstream.

Add __GFP_ZERO flag for alloc_page in function bio_copy_kern to initialize
the buffer of a bio.

Signed-off-by: Haimin Zhang &lt;tcs.kernel@gmail.com&gt;
Reviewed-by: Chaitanya Kulkarni &lt;kch@nvidia.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/20220216084038.15635-1-tcs.kernel@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
[nobelbarakat: Backported to 5.4: Manually added __GFP_ZERO flag]
Signed-off-by: Nobel Barakat &lt;nobelbarakat@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>block: Fix wrong offset in bio_truncate()</title>
<updated>2022-02-01T16:24:39+00:00</updated>
<author>
<name>OGAWA Hirofumi</name>
<email>hirofumi@mail.parknet.co.jp</email>
</author>
<published>2022-01-09T09:36:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6cbf4c731d7812518cd857c2cfc3da9fd120f6ae'/>
<id>urn:sha1:6cbf4c731d7812518cd857c2cfc3da9fd120f6ae</id>
<content type='text'>
commit 3ee859e384d453d6ac68bfd5971f630d9fa46ad3 upstream.

bio_truncate() clears the buffer outside of last block of bdev, however
current bio_truncate() is using the wrong offset of page. So it can
return the uninitialized data.

This happened when both of truncated/corrupted FS and userspace (via
bdev) are trying to read the last of bdev.

Reported-by: syzbot+ac94ae5f68b84197f41c@syzkaller.appspotmail.com
Signed-off-by: OGAWA Hirofumi &lt;hirofumi@mail.parknet.co.jp&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Link: https://lore.kernel.org/r/875yqt1c9g.fsf@mail.parknet.co.jp
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>block: only update parent bi_status when bio fail</title>
<updated>2021-04-16T09:46:38+00:00</updated>
<author>
<name>Yufen Yu</name>
<email>yuyufen@huawei.com</email>
</author>
<published>2021-03-31T11:53:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f5b60f26e36be6ae266d7731c41bf01c5b35019c'/>
<id>urn:sha1:f5b60f26e36be6ae266d7731c41bf01c5b35019c</id>
<content type='text'>
[ Upstream commit 3edf5346e4f2ce2fa0c94651a90a8dda169565ee ]

For multiple split bios, if one of the bio is fail, the whole
should return error to application. But we found there is a race
between bio_integrity_verify_fn and bio complete, which return
io success to application after one of the bio fail. The race as
following:

split bio(READ)          kworker

nvme_complete_rq
blk_update_request //split error=0
  bio_endio
    bio_integrity_endio
      queue_work(kintegrityd_wq, &amp;bip-&gt;bip_work);

                         bio_integrity_verify_fn
                         bio_endio //split bio
                          __bio_chain_endio
                             if (!parent-&gt;bi_status)

                               &lt;interrupt entry&gt;
                               nvme_irq
                                 blk_update_request //parent error=7
                                 req_bio_endio
                                    bio-&gt;bi_status = 7 //parent bio
                               &lt;interrupt exit&gt;

                               parent-&gt;bi_status = 0
                        parent-&gt;bi_end_io() // return bi_status=0

The bio has been split as two: split and parent. When split
bio completed, it depends on kworker to do endio, while
bio_integrity_verify_fn have been interrupted by parent bio
complete irq handler. Then, parent bio-&gt;bi_status which have
been set in irq handler will overwrite by kworker.

In fact, even without the above race, we also need to conside
the concurrency beteen mulitple split bio complete and update
the same parent bi_status. Normally, multiple split bios will
be issued to the same hctx and complete from the same irq
vector. But if we have updated queue map between multiple split
bios, these bios may complete on different hw queue and different
irq vector. Then the concurrency update parent bi_status may
cause the final status error.

Suggested-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Yufen Yu &lt;yuyufen@huawei.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Link: https://lore.kernel.org/r/20210331115359.1125679-1-yuyufen@huawei.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block/diskstats: more accurate approximation of io_ticks for slow disks</title>
<updated>2020-10-07T06:01:29+00:00</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@yandex-team.ru</email>
</author>
<published>2020-03-25T13:07:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2334b2d5a2bd1a4e9877623ef532b114d83bdf94'/>
<id>urn:sha1:2334b2d5a2bd1a4e9877623ef532b114d83bdf94</id>
<content type='text'>
commit 2b8bd423614c595540eaadcfbc702afe8e155e50 upstream.

Currently io_ticks is approximated by adding one at each start and end of
requests if jiffies counter has changed. This works perfectly for requests
shorter than a jiffy or if one of requests starts/ends at each jiffy.

If disk executes just one request at a time and they are longer than two
jiffies then only first and last jiffies will be accounted.

Fix is simple: at the end of request add up into io_ticks jiffies passed
since last update rather than just one jiffy.

Example: common HDD executes random read 4k requests around 12ms.

fio --name=test --filename=/dev/sdb --rw=randread --direct=1 --runtime=30 &amp;
iostat -x 10 sdb

Note changes of iostat's "%util" 8,43% -&gt; 99,99% before/after patch:

Before:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0,00     0,00   82,60    0,00   330,40     0,00     8,00     0,96   12,09   12,09    0,00   1,02   8,43

After:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0,00     0,00   82,50    0,00   330,00     0,00     8,00     1,00   12,10   12,10    0,00  12,12  99,99

Now io_ticks does not loose time between start and end of requests, but
for queue-depth &gt; 1 some I/O time between adjacent starts might be lost.

For load estimation "%util" is not as useful as average queue length,
but it clearly shows how often disk queue is completely empty.

Fixes: 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting")
Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
From: "Banerjee, Debabrata" &lt;dbanerje@akamai.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>block: Set same_page to false in __bio_try_merge_page if ret is false</title>
<updated>2020-09-17T11:47:44+00:00</updated>
<author>
<name>Ritesh Harjani</name>
<email>riteshh@linux.ibm.com</email>
</author>
<published>2020-09-09T03:14:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6736317f350a14a5cfe953f69685b1cb0b3607f7'/>
<id>urn:sha1:6736317f350a14a5cfe953f69685b1cb0b3607f7</id>
<content type='text'>
[ Upstream commit 2cd896a5e86fc326bda8614b96c0401dcc145868 ]

If we hit the UINT_MAX limit of bio-&gt;bi_iter.bi_size and so we are anyway
not merging this page in this bio, then it make sense to make same_page
also as false before returning.

Without this patch, we hit below WARNING in iomap.
This mostly happens with very large memory system and / or after tweaking
vm dirty threshold params to delay writeback of dirty data.

WARNING: CPU: 18 PID: 5130 at fs/iomap/buffered-io.c:74 iomap_page_release+0x120/0x150
 CPU: 18 PID: 5130 Comm: fio Kdump: loaded Tainted: G        W         5.8.0-rc3 #6
 Call Trace:
  __remove_mapping+0x154/0x320 (unreliable)
  iomap_releasepage+0x80/0x180
  try_to_release_page+0x94/0xe0
  invalidate_inode_page+0xc8/0x110
  invalidate_mapping_pages+0x1dc/0x540
  generic_fadvise+0x3c8/0x450
  xfs_file_fadvise+0x2c/0xe0 [xfs]
  vfs_fadvise+0x3c/0x60
  ksys_fadvise64_64+0x68/0xe0
  sys_fadvise64+0x28/0x40
  system_call_exception+0xf8/0x1c0
  system_call_common+0xf0/0x278

Fixes: cc90bc68422 ("block: fix "check bi_size overflow before merge"")
Reported-by: Shivaprasad G Bhat &lt;sbhat@linux.ibm.com&gt;
Suggested-by: Christoph Hellwig &lt;hch@infradead.org&gt;
Signed-off-by: Anju T Sudhakar &lt;anju@linux.vnet.ibm.com&gt;
Signed-off-by: Ritesh Harjani &lt;riteshh@linux.ibm.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: Fix page_is_mergeable() for compound pages</title>
<updated>2020-09-03T09:26:54+00:00</updated>
<author>
<name>Matthew Wilcox (Oracle)</name>
<email>willy@infradead.org</email>
</author>
<published>2020-08-17T19:52:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2295664518c381ca422e95f2ca3a23f4bd06a711'/>
<id>urn:sha1:2295664518c381ca422e95f2ca3a23f4bd06a711</id>
<content type='text'>
[ Upstream commit d81665198b83e55a28339d1f3e4890ed8a434556 ]

If we pass in an offset which is larger than PAGE_SIZE, then
page_is_mergeable() thinks it's not mergeable with the previous bio_vec,
leading to a large number of bio_vecs being used.  Use a slightly more
obvious test that the two pages are compatible with each other.

Fixes: 52d52d1c98a9 ("block: only allow contiguous page structs in a bio_vec")
Signed-off-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: fix memleak of bio integrity data</title>
<updated>2020-01-26T09:01:09+00:00</updated>
<author>
<name>Justin Tee</name>
<email>justin.tee@broadcom.com</email>
</author>
<published>2019-12-05T02:09:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ccbc5d03c2bd83a07be78351571483bc22c0c63e'/>
<id>urn:sha1:ccbc5d03c2bd83a07be78351571483bc22c0c63e</id>
<content type='text'>
[ Upstream commit ece841abbed2da71fa10710c687c9ce9efb6bf69 ]

7c20f11680a4 ("bio-integrity: stop abusing bi_end_io") moves
bio_integrity_free from bio_uninit() to bio_integrity_verify_fn()
and bio_endio(). This way looks wrong because bio may be freed
without calling bio_endio(), for example, blk_rq_unprep_clone() is
called from dm_mq_queue_rq() when the underlying queue of dm-mpath
is busy.

So memory leak of bio integrity data is caused by commit 7c20f11680a4.

Fixes this issue by re-adding bio_integrity_free() to bio_uninit().

Fixes: 7c20f11680a4 ("bio-integrity: stop abusing bi_end_io")
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by Justin Tee &lt;justin.tee@broadcom.com&gt;

Add commit log, and simplify/fix the original patch wroten by Justin.

Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>fs: move guard_bio_eod() after bio_set_op_attrs</title>
<updated>2020-01-17T18:48:21+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2020-01-05T01:41:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3fe209c8432b829a605c5b7c932d39469c25277d'/>
<id>urn:sha1:3fe209c8432b829a605c5b7c932d39469c25277d</id>
<content type='text'>
commit 83c9c547168e8b914ea6398430473a4de68c52cc upstream.

Commit 85a8ce62c2ea ("block: add bio_truncate to fix guard_bio_eod")
adds bio_truncate() for handling bio EOD. However, bio_truncate()
doesn't use the passed 'op' parameter from guard_bio_eod's callers.

So bio_trunacate() may retrieve wrong 'op', and zering pages may
not be done for READ bio.

Fixes this issue by moving guard_bio_eod() after bio_set_op_attrs()
in submit_bh_wbc() so that bio_truncate() can always retrieve correct
op info.

Meantime remove the 'op' parameter from guard_bio_eod() because it isn't
used any more.

Cc: Carlos Maiolino &lt;cmaiolino@redhat.com&gt;
Cc: linux-fsdevel@vger.kernel.org
Fixes: 85a8ce62c2ea ("block: add bio_truncate to fix guard_bio_eod")
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

Fold in kerneldoc and bio_op() change.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;

</content>
</entry>
<entry>
<title>block: add bio_truncate to fix guard_bio_eod</title>
<updated>2020-01-09T09:19:54+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-12-27T23:05:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=943cd69efac437d82a7aea0659fccbcc071730de'/>
<id>urn:sha1:943cd69efac437d82a7aea0659fccbcc071730de</id>
<content type='text'>
[ Upstream commit 85a8ce62c2eabe28b9d76ca4eecf37922402df93 ]

Some filesystem, such as vfat, may send bio which crosses device boundary,
and the worse thing is that the IO request starting within device boundaries
can contain more than one segment past EOD.

Commit dce30ca9e3b6 ("fs: fix guard_bio_eod to check for real EOD errors")
tries to fix this issue by returning -EIO for this situation. However,
this way lets fs user code lose chance to handle -EIO, then sync_inodes_sb()
may hang for ever.

Also the current truncating on last segment is dangerous by updating the
last bvec, given bvec table becomes not immutable any more, and fs bio
users may not retrieve the truncated pages via bio_for_each_segment_all() in
its .end_io callback.

Fixes this issue by supporting multi-segment truncating. And the
approach is simpler:

- just update bio size since block layer can make correct bvec with
the updated bio size. Then bvec table becomes really immutable.

- zero all truncated segments for read bio

Cc: Carlos Maiolino &lt;cmaiolino@redhat.com&gt;
Cc: linux-fsdevel@vger.kernel.org
Fixed-by: dce30ca9e3b6 ("fs: fix guard_bio_eod to check for real EOD errors")
Reported-by: syzbot+2b9e54155c8c25d8d165@syzkaller.appspotmail.com
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
