<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/md/md.c, branch v6.13.6</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.13.6</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.13.6'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-02-17T10:36:15+00:00</updated>
<entry>
<title>md: reintroduce md-linear</title>
<updated>2025-02-17T10:36:15+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2025-01-02T11:28:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6a5f0160d5c1f27b183aa689c1f63dd8a7c8607a'/>
<id>urn:sha1:6a5f0160d5c1f27b183aa689c1f63dd8a7c8607a</id>
<content type='text'>
commit 127186cfb184eaccdfe948e6da66940cfa03efc5 upstream.

THe md-linear is removed by commit 849d18e27be9 ("md: Remove deprecated
CONFIG_MD_LINEAR") because it has been marked as deprecated for a long
time.

However, md-linear is used widely for underlying disks with different size,
sadly we didn't know this until now, and it's true useful to create
partitions and assemble multiple raid and then append one to the other.

People have to use dm-linear in this case now, however, they will prefer
to minimize the number of involved modules.

Fixes: 849d18e27be9 ("md: Remove deprecated CONFIG_MD_LINEAR")
Cc: stable@vger.kernel.org
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Acked-by: Coly Li &lt;colyli@kernel.org&gt;
Acked-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Link: https://lore.kernel.org/r/20250102112841.1227111-1-yukuai1@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>md/md-bitmap: Synchronize bitmap_get_stats() with bitmap lifetime</title>
<updated>2025-02-08T09:02:25+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2025-01-24T09:20:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4e9316eee3885bfb311b4759513f2ccf37891c09'/>
<id>urn:sha1:4e9316eee3885bfb311b4759513f2ccf37891c09</id>
<content type='text'>
commit 8d28d0ddb986f56920ac97ae704cc3340a699a30 upstream.

After commit ec6bb299c7c3 ("md/md-bitmap: add 'sync_size' into struct
md_bitmap_stats"), following panic is reported:

Oops: general protection fault, probably for non-canonical address
RIP: 0010:bitmap_get_stats+0x2b/0xa0
Call Trace:
 &lt;TASK&gt;
 md_seq_show+0x2d2/0x5b0
 seq_read_iter+0x2b9/0x470
 seq_read+0x12f/0x180
 proc_reg_read+0x57/0xb0
 vfs_read+0xf6/0x380
 ksys_read+0x6c/0xf0
 do_syscall_64+0x82/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Root cause is that bitmap_get_stats() can be called at anytime if mddev
is still there, even if bitmap is destroyed, or not fully initialized.
Deferenceing bitmap in this case can crash the kernel. Meanwhile, the
above commit start to deferencing bitmap-&gt;storage, make the problem
easier to trigger.

Fix the problem by protecting bitmap_get_stats() with bitmap_info.mutex.

Cc: stable@vger.kernel.org # v6.12+
Fixes: 32a7627cf3a3 ("[PATCH] md: optimised resync using Bitmap based intent logging")
Reported-and-tested-by: Harshit Mogalapalli &lt;harshit.m.mogalapalli@oracle.com&gt;
Closes: https://lore.kernel.org/linux-raid/ca3a91a2-50ae-4f68-b317-abd9889f3907@oracle.com/T/#m6e5086c95201135e4941fe38f9efa76daf9666c5
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Link: https://lore.kernel.org/r/20250124092055.4050195-1-yukuai1@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>md/md-bitmap: move bitmap_{start, end}write to md upper layer</title>
<updated>2025-02-08T09:02:20+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2025-01-09T01:51:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2fe16993ddf378d6be17f4a39491f54f660e6171'/>
<id>urn:sha1:2fe16993ddf378d6be17f4a39491f54f660e6171</id>
<content type='text'>
commit cd5fc653381811f1e0ba65f5d169918cab61476f upstream.

There are two BUG reports that raid5 will hang at
bitmap_startwrite([1],[2]), root cause is that bitmap start write and end
write is unbalanced, it's not quite clear where, and while reviewing raid5
code, it's found that bitmap operations can be optimized. For example,
for a 4 disks raid5, with chunksize=8k, if user issue a IO (0 + 48k) to
the array:

┌────────────────────────────────────────────────────────────┐
│chunk 0                                                     │
│      ┌────────────┬─────────────┬─────────────┬────────────┼
│  sh0 │A0: 0 + 4k  │A1: 8k + 4k  │A2: 16k + 4k │A3: P       │
│      ┼────────────┼─────────────┼─────────────┼────────────┼
│  sh1 │B0: 4k + 4k │B1: 12k + 4k │B2: 20k + 4k │B3: P       │
┼──────┴────────────┴─────────────┴─────────────┴────────────┼
│chunk 1                                                     │
│      ┌────────────┬─────────────┬─────────────┬────────────┤
│  sh2 │C0: 24k + 4k│C1: 32k + 4k │C2: P        │C3: 40k + 4k│
│      ┼────────────┼─────────────┼─────────────┼────────────┼
│  sh3 │D0: 28k + 4k│D1: 36k + 4k │D2: P        │D3: 44k + 4k│
└──────┴────────────┴─────────────┴─────────────┴────────────┘

Before this patch, 4 stripe head will be used, and each sh will attach
bio for 3 disks, and each attached bio will trigger
bitmap_startwrite() once, which means total 12 times.
 - 3 times (0 + 4k), for (A0, A1 and A2)
 - 3 times (4 + 4k), for (B0, B1 and B2)
 - 3 times (8 + 4k), for (C0, C1 and C3)
 - 3 times (12 + 4k), for (D0, D1 and D3)

After this patch, md upper layer will calculate that IO range (0 + 48k)
is corresponding to the bitmap (0 + 16k), and call bitmap_startwrite()
just once.

Noted that this patch will align bitmap ranges to the chunks, for example,
if user issue a IO (0 + 4k) to array:

- Before this patch, 1 time (0 + 4k), for A0;
- After this patch, 1 time (0 + 8k) for chunk 0;

Usually, one bitmap bit will represent more than one disk chunk, and this
doesn't have any difference. And even if user really created a array
that one chunk contain multiple bits, the overhead is that more data
will be recovered after power failure.

Also remove STRIPE_BITMAP_PENDING since it's not used anymore.

[1] https://lore.kernel.org/all/CAJpMwyjmHQLvm6zg1cmQErttNNQPDAAXPKM3xgTjMhbfts986Q@mail.gmail.com/
[2] https://lore.kernel.org/all/ADF7D720-5764-4AF3-B68E-1845988737AA@flyingcircus.io/

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Link: https://lore.kernel.org/r/20250109015145.158868-6-yukuai1@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Signed-off-by: Yu Kuai &lt;yukuai1@huaweicloud.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-6.13/block-20241118' of git://git.kernel.dk/linux</title>
<updated>2024-11-19T00:50:08+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-11-19T00:50:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=77a0cfafa9af9c0d5b43534eb90d530c189edca1'/>
<id>urn:sha1:77a0cfafa9af9c0d5b43534eb90d530c189edca1</id>
<content type='text'>
Pull block updates from Jens Axboe:

 - NVMe updates via Keith:
      - Use uring_cmd helper (Pavel)
      - Host Memory Buffer allocation enhancements (Christoph)
      - Target persistent reservation support (Guixin)
      - Persistent reservation tracing (Guixen)
      - NVMe 2.1 specification support (Keith)
      - Rotational Meta Support (Matias, Wang, Keith)
      - Volatile cache detection enhancment (Guixen)

 - MD updates via Song:
      - Maintainers update
      - raid5 sync IO fix
      - Enhance handling of faulty and blocked devices
      - raid5-ppl atomic improvement
      - md-bitmap fix

 - Support for manually defining embedded partition tables

 - Zone append fixes and cleanups

 - Stop sending the queued requests in the plug list to the driver
   -&gt;queue_rqs() handle in reverse order.

 - Zoned write plug cleanups

 - Cleanups disk stats tracking and add support for disk stats for
   passthrough IO

 - Add preparatory support for file system atomic writes

 - Add lockdep support for queue freezing. Already found a bunch of
   issues, and some fixes for that are in here. More will be coming.

 - Fix race between queue stopping/quiescing and IO queueing

 - ublk recovery improvements

 - Fix ublk mmap for 64k pages

 - Various fixes and cleanups

* tag 'for-6.13/block-20241118' of git://git.kernel.dk/linux: (118 commits)
  MAINTAINERS: Update git tree for mdraid subsystem
  block: make struct rq_list available for !CONFIG_BLOCK
  block/genhd: use seq_put_decimal_ull for diskstats decimal values
  block: don't reorder requests in blk_mq_add_to_batch
  block: don't reorder requests in blk_add_rq_to_plug
  block: add a rq_list type
  block: remove rq_list_move
  virtio_blk: reverse request order in virtio_queue_rqs
  nvme-pci: reverse request order in nvme_queue_rqs
  btrfs: validate queue limits
  block: export blk_validate_limits
  nvmet: add tracing of reservation commands
  nvme: parse reservation commands's action and rtype to string
  nvmet: report ns's vwc not present
  md/raid5: Increase r5conf.cache_name size
  block: remove the ioprio field from struct request
  block: remove the write_hint field from struct request
  nvme: check ns's volatile write cache not present
  nvme: add rotational support
  nvme: use command set independent id ns if available
  ...
</content>
</entry>
<entry>
<title>md: don't record new badblocks for faulty rdev</title>
<updated>2024-11-06T00:08:38+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2024-10-31T03:31:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=29967332ced51a15a22f11381eeebbc500ba1858'/>
<id>urn:sha1:29967332ced51a15a22f11381eeebbc500ba1858</id>
<content type='text'>
Faulty will be checked before issuing IO to the rdev, however, rdev can
be faulty at any time, hence it's possible that rdev_set_badblocks()
will be called for faulty rdev. In this case, mddev-&gt;sb_flags will be
set and some other path can be blocked by updating super block.

Since faulty rdev will not be accesed anymore, there is no need to
record new babblocks for faulty rdev and forcing updating super block.

Noted this is not a bugfix, just prevent updating superblock in some
corner cases, and will help to slice a bug related to external
metadata[1], testing also shows that devices are removed faster in the
case IO error.

[1] https://lore.kernel.org/all/f34452df-810b-48b2-a9b4-7f925699a9e7@linux.intel.com/

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Tested-by: Mariusz Tkaczyk &lt;mariusz.tkaczyk@linux.intel.com&gt;
Link: https://lore.kernel.org/r/20241031033114.3845582-4-yukuai1@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
</entry>
<entry>
<title>md: don't wait faulty rdev in md_wait_for_blocked_rdev()</title>
<updated>2024-11-06T00:08:38+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2024-10-31T03:31:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=50e8274855e7ab5499ff8296e09802874a3f03b1'/>
<id>urn:sha1:50e8274855e7ab5499ff8296e09802874a3f03b1</id>
<content type='text'>
md_wait_for_blocked_rdev() is called for write IO while rdev is
blocked, howerver, rdev can be faulty after choosing this rdev to write,
and faulty rdev should never be accessed anymore, hence there is no point
to wait for faulty rdev to be unblocked.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Tested-by: Mariusz Tkaczyk &lt;mariusz.tkaczyk@linux.intel.com&gt;
Link: https://lore.kernel.org/r/20241031033114.3845582-3-yukuai1@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
</entry>
<entry>
<title>md: ensure child flush IO does not affect origin bio-&gt;bi_status</title>
<updated>2024-10-17T18:35:36+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2024-09-19T06:30:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=62ce0782bbacd32ec10292b9bdd127330e9b6968'/>
<id>urn:sha1:62ce0782bbacd32ec10292b9bdd127330e9b6968</id>
<content type='text'>
When a flush is issued to an RAID array, a child flush IO is created and
issued for each member disk in the RAID array. Since commit b75197e86e6d
("md: Remove flush handling"), each child flush IO has been chained with
the original bio. As a result, the failure of any child IO could modify
the bi_status of the original bio, potentially impacting the upper-layer
filesystem.

Fix the issue by preventing child flush IO from altering the original
bio-&gt;bi_status as before. However, this design introduces a known
issue: in the event of a power failure, if a flush IO on a member
disk fails, the upper layers may not be informed. This issue is not easy
to fix and will not be addressed for the time being in this issue.

Fixes: b75197e86e6d ("md: Remove flush handling")
Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Link: https://lore.kernel.org/r/20240919063048.2887579-1-linan666@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
</entry>
<entry>
<title>md: Add new_level sysfs interface</title>
<updated>2024-09-06T17:31:12+00:00</updated>
<author>
<name>Xiao Ni</name>
<email>xni@redhat.com</email>
</author>
<published>2024-09-04T23:54:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d981ed8419303ed12351eea8541ad6cb76455fe3'/>
<id>urn:sha1:d981ed8419303ed12351eea8541ad6cb76455fe3</id>
<content type='text'>
Now reshape supports two ways: with backup file or without backup file.
For the situation without backup file, it needs to change data offset.
It doesn't need systemd service mdadm-grow-continue. So it can finish
the reshape job in one process environment. It can know the new level
from mdadm --grow command and can change to new level after reshape
finishes.

For the situation with backup file, it needs systemd service
mdadm-grow-continue to monitor reshape progress. So there are two process
envolved. One is mdadm --grow command whick kicks off reshape and wakes
up mdadm-grow-continue service. The second process is the service, which
doesn't know the new level from the first process.

In kernel space mddev-&gt;new_level is used to record the new level when
doing reshape. This patch adds a new interface to help mdadm update
new_level and sync it to metadata. Then mdadm-grow-continue can read the
right new_level.

Commit log revised by Song Liu. Please refer to the link for more details.

Signed-off-by: Xiao Ni &lt;xni@redhat.com&gt;
Link: https://lore.kernel.org/r/20240904235453.99120-1-xni@redhat.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
</entry>
<entry>
<title>md: Report failed arrays as broken in mdstat</title>
<updated>2024-09-04T21:52:45+00:00</updated>
<author>
<name>Mateusz Kusiak</name>
<email>mateusz.kusiak@intel.com</email>
</author>
<published>2024-09-03T14:29:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2d2b3bc145b9d5b5c6f07d22291723ddb024ca76'/>
<id>urn:sha1:2d2b3bc145b9d5b5c6f07d22291723ddb024ca76</id>
<content type='text'>
Depending on if array has personality, it is either reported as active or
inactive. This patch adds third status "broken" for arrays with
personality that became inoperative. The reason is end users tend to
assume that "active" indicates array is operational.

Add "broken" state for inoperative arrays with personality and refactor
the code.

Signed-off-by: Mateusz Kusiak &lt;mateusz.kusiak@intel.com&gt;
Link: https://lore.kernel.org/r/20240903142949.53628-1-mateusz.kusiak@intel.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'md-6.12-bitmap' into md-6.12</title>
<updated>2024-08-28T21:55:57+00:00</updated>
<author>
<name>Song Liu</name>
<email>song@kernel.org</email>
</author>
<published>2024-08-28T21:55:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7f67fdae3372c323ad02601ab65502c9e488a521'/>
<id>urn:sha1:7f67fdae3372c323ad02601ab65502c9e488a521</id>
<content type='text'>
From Yu Kuai (with minor changes by Song Liu):

The background is that currently bitmap is using a global spin_lock,
causing lock contention and huge IO performance degradation for all raid
levels.

However, it's impossible to implement a new lock free bitmap with
current situation that md-bitmap exposes the internal implementation
with lots of exported apis. Hence bitmap_operations is invented, to
describe bitmap core implementation, and a new bitmap can be introduced
with a new bitmap_operations, we only need to switch to the new one
during initialization.

And with this we can build bitmap as kernel module, but that's not
our concern for now.

This version was tested with mdadm tests and lvm2 tests. This set does
not introduce new errors in these tests.

* md-6.12-bitmap: (42 commits)
  md/md-bitmap: make in memory structure internal
  md/md-bitmap: merge md_bitmap_enabled() into bitmap_operations
  md/md-bitmap: merge md_bitmap_wait_behind_writes() into bitmap_operations
  md/md-bitmap: merge md_bitmap_free() into bitmap_operations
  md/md-bitmap: merge md_bitmap_set_pages() into struct bitmap_operations
  md/md-bitmap: merge md_bitmap_copy_from_slot() into struct bitmap_operation.
  md/md-bitmap: merge get_bitmap_from_slot() into bitmap_operations
  md/md-bitmap: merge md_bitmap_resize() into bitmap_operations
  md/md-bitmap: pass in mddev directly for md_bitmap_resize()
  md/md-bitmap: merge md_bitmap_daemon_work() into bitmap_operations
  md/md-bitmap: merge bitmap_unplug() into bitmap_operations
  md/md-bitmap: merge md_bitmap_unplug_async() into md_bitmap_unplug()
  md/md-bitmap: merge md_bitmap_sync_with_cluster() into bitmap_operations
  md/md-bitmap: merge md_bitmap_cond_end_sync() into bitmap_operations
  md/md-bitmap: merge md_bitmap_close_sync() into bitmap_operations
  md/md-bitmap: merge md_bitmap_end_sync() into bitmap_operations
  md/md-bitmap: remove the parameter 'aborted' for md_bitmap_end_sync()
  md/md-bitmap: merge md_bitmap_start_sync() into bitmap_operations
  md/md-bitmap: merge md_bitmap_endwrite() into bitmap_operations
  md/md-bitmap: merge md_bitmap_startwrite() into bitmap_operations
  ...

Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
</entry>
</feed>
