<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/nvme/host/sysfs.c, branch v6.18.22</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.22</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.22'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-05-26T18:39:36+00:00</updated>
<entry>
<title>Merge tag 'for-6.16/block-20250523' of git://git.kernel.dk/linux</title>
<updated>2025-05-26T18:39:36+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-05-26T18:39:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6f59de9bc0d576eb5a5edfea470527902315e924'/>
<id>urn:sha1:6f59de9bc0d576eb5a5edfea470527902315e924</id>
<content type='text'>
Pull block updates from Jens Axboe:

 - ublk updates:
      - Add support for updating the size of a ublk instance
      - Zero-copy improvements
      - Auto-registering of buffers for zero-copy
      - Series simplifying and improving GET_DATA and request lookup
      - Series adding quiesce support
      - Lots of selftests additions
      - Various cleanups

 - NVMe updates via Christoph:
      - add per-node DMA pools and use them for PRP/SGL allocations
        (Caleb Sander Mateos, Keith Busch)
      - nvme-fcloop refcounting fixes (Daniel Wagner)
      - support delayed removal of the multipath node and optionally
        support the multipath node for private namespaces (Nilay Shroff)
      - support shared CQs in the PCI endpoint target code (Wilfred
        Mallawa)
      - support admin-queue only authentication (Hannes Reinecke)
      - use the crc32c library instead of the crypto API (Eric Biggers)
      - misc cleanups (Christoph Hellwig, Marcelo Moreira, Hannes
        Reinecke, Leon Romanovsky, Gustavo A. R. Silva)

 - MD updates via Yu:
      - Fix that normal IO can be starved by sync IO, found by mkfs on
        newly created large raid5, with some clean up patches for bdev
        inflight counters

 - Clean up brd, getting rid of atomic kmaps and bvec poking

 - Add loop driver specifically for zoned IO testing

 - Eliminate blk-rq-qos calls with a static key, if not enabled

 - Improve hctx locking for when a plug has IO for multiple queues
   pending

 - Remove block layer bouncing support, which in turn means we can
   remove the per-node bounce stat as well

 - Improve blk-throttle support

 - Improve delay support for blk-throttle

 - Improve brd discard support

 - Unify IO scheduler switching. This should also fix a bunch of lockdep
   warnings we've been seeing, after enabling lockdep support for queue
   freezing/unfreezeing

 - Add support for block write streams via FDP (flexible data placement)
   on NVMe

 - Add a bunch of block helpers, facilitating the removal of a bunch of
   duplicated boilerplate code

 - Remove obsolete BLK_MQ pci and virtio Kconfig options

 - Add atomic/untorn write support to blktrace

 - Various little cleanups and fixes

* tag 'for-6.16/block-20250523' of git://git.kernel.dk/linux: (186 commits)
  selftests: ublk: add test for UBLK_F_QUIESCE
  ublk: add feature UBLK_F_QUIESCE
  selftests: ublk: add test case for UBLK_U_CMD_UPDATE_SIZE
  traceevent/block: Add REQ_ATOMIC flag to block trace events
  ublk: run auto buf unregisgering in same io_ring_ctx with registering
  io_uring: add helper io_uring_cmd_ctx_handle()
  ublk: remove io argument from ublk_auto_buf_reg_fallback()
  ublk: handle ublk_set_auto_buf_reg() failure correctly in ublk_fetch()
  selftests: ublk: add test for covering UBLK_AUTO_BUF_REG_FALLBACK
  selftests: ublk: support UBLK_F_AUTO_BUF_REG
  ublk: support UBLK_AUTO_BUF_REG_FALLBACK
  ublk: register buffer to local io_uring with provided buf index via UBLK_F_AUTO_BUF_REG
  ublk: prepare for supporting to register request buffer automatically
  ublk: convert to refcount_t
  selftests: ublk: make IO &amp; device removal test more stressful
  nvme: rename nvme_mpath_shutdown_disk to nvme_mpath_remove_disk
  nvme: introduce multipath_always_on module param
  nvme-multipath: introduce delayed removal of the multipath head node
  nvme-pci: derive and better document max segments limits
  nvme-pci: use struct_size for allocation struct nvme_dev
  ...
</content>
</entry>
<entry>
<title>nvme: avoid creating multipath sysfs group under namespace path devices</title>
<updated>2025-05-21T12:55:46+00:00</updated>
<author>
<name>Nilay Shroff</name>
<email>nilay@linux.ibm.com</email>
</author>
<published>2025-05-21T12:41:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=49b9f86a594a5403641e6e60508788a7310fd293'/>
<id>urn:sha1:49b9f86a594a5403641e6e60508788a7310fd293</id>
<content type='text'>
Commit 4dbd2b2ebe4c ("nvme-multipath: Add visibility for round-robin
io-policy") introduced the creation of the multipath sysfs group under
the NVMe head gendisk device node. However, it also inadvertently added
the same sysfs group under each namespace path device which head node
refers to and that is incorrect.

The multipath sysfs group should only be exposed through the namespace
head gendisk node. This is sufficient, as the head device already
provides symbolic links to the individual namespace paths it manages.

This patch fixes the issue by preventing the creation of the multipath
sysfs group under namespace path devices, ensuring it only appears under
the head disk node.

Fixes: 4dbd2b2ebe4c ("nvme-multipath: Add visibility for round-robin io-policy")
Signed-off-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
</entry>
<entry>
<title>nvme-multipath: introduce delayed removal of the multipath head node</title>
<updated>2025-05-20T03:34:27+00:00</updated>
<author>
<name>Nilay Shroff</name>
<email>nilay@linux.ibm.com</email>
</author>
<published>2025-05-14T13:03:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=62188639ec160eb77cb758c3a95947ebc918e76f'/>
<id>urn:sha1:62188639ec160eb77cb758c3a95947ebc918e76f</id>
<content type='text'>
Currently, the multipath head node of an NVMe disk is removed
immediately as soon as all paths of the disk are removed. However,
this can cause issues in scenarios where:

- The disk hot-removal followed by re-addition.
- Transient PCIe link failures that trigger re-enumeration,
  temporarily removing and then restoring the disk.

In these cases, removing the head node prematurely may lead to a head
disk node name change upon re-addition, requiring applications to
reopen their handles if they were performing I/O during the failure.

To address this, introduce a delayed removal mechanism of head disk
node. During transient failure, instead of immediate removal of head
disk node, the system waits for a configurable timeout, allowing the
disk to recover.

During transient disk failure, if application sends any IO then we
queue it instead of failing such IO immediately. If the disk comes back
online within the timeout, the queued IOs are resubmitted to the disk
ensuring seamless operation. In case disk couldn't recover from the
failure then queued IOs are failed to its completion and application
receives the error.

So this way, if disk comes back online within the configured period,
the head node remains unchanged, ensuring uninterrupted workloads
without requiring applications to reopen device handles.

A new sysfs attribute, named "delayed_removal_secs" is added under head
disk blkdev for user who wish to configure time for the delayed removal
of head disk node. The default value of this attribute is set to zero
second ensuring no behavior change unless explicitly configured.

Link: https://lore.kernel.org/linux-nvme/Y9oGTKCFlOscbPc2@infradead.org/
Link: https://lore.kernel.org/linux-nvme/Y+1aKcQgbskA2tra@kbusch-mbp.dhcp.thefacebook.com/
Suggested-by: Keith Busch &lt;kbusch@kernel.org&gt;
Suggested-by: Christoph Hellwig &lt;hch@infradead.org&gt;
[nilay: reworked based on the original idea/POC from Christoph and Keith]
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Signed-off-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
</entry>
<entry>
<title>nvme-multipath: Add visibility for queue-depth io-policy</title>
<updated>2025-03-20T23:53:55+00:00</updated>
<author>
<name>Nilay Shroff</name>
<email>nilay@linux.ibm.com</email>
</author>
<published>2025-01-12T12:41:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7cbafa3ff0187cdfa922aa7eb3d578a93999b3a9'/>
<id>urn:sha1:7cbafa3ff0187cdfa922aa7eb3d578a93999b3a9</id>
<content type='text'>
This patch helps add nvme native multipath visibility for queue-depth
io-policy. It adds a new attribute file named "queue_depth" under
namespace device path node which would print the number of active/
in-flight I/O requests currently queued for the given path.

For instance, if we have a shared namespace accessible from two different
controllers/paths then accessing head block node of the shared namespace
would show the following output:

$ ls -l /sys/block/nvme1n1/multipath/
nvme1c1n1 -&gt; ../../../../../pci052e:78/052e:78:00.0/nvme/nvme1/nvme1c1n1
nvme1c3n1 -&gt; ../../../../../pci058e:78/058e:78:00.0/nvme/nvme3/nvme1c3n1

In the above example, nvme1n1 is head gendisk node created for a shared
namespace and the namespace is accessible from nvme1c1n1 and nvme1c3n1
paths. For queue-depth io-policy we can then refer the "queue_depth"
attribute file created under each namespace path:

$ cat /sys/block/nvme1n1/multipath/nvme1c1n1/queue_depth
518

$cat /sys/block/nvme1n1/multipath/nvme1c3n1/queue_depth
504

&gt;From the above output, we can infer that I/O workload targeted at nvme1n1
uses two paths nvme1c1n1 and nvme1c3n1 and the current queue depth of each
path is 518 and 504 respectively. Reading "queue_depth" file when
configured io-policy is anything but queue-depth would show no output.

Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Signed-off-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-multipath: Add visibility for numa io-policy</title>
<updated>2025-03-20T23:53:54+00:00</updated>
<author>
<name>Nilay Shroff</name>
<email>nilay@linux.ibm.com</email>
</author>
<published>2025-01-12T12:41:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6546cc4a56618fda55e76c92ff7aea31d8f9fb88'/>
<id>urn:sha1:6546cc4a56618fda55e76c92ff7aea31d8f9fb88</id>
<content type='text'>
This patch helps add nvme native multipath visibility for numa io-policy.
It adds a new attribute file named "numa_nodes" under namespace gendisk
device path node which prints the list of numa nodes preferred by the
given namespace path. The numa nodes value is comma delimited list of
nodes or A-B range of nodes.

For instance, if we have a shared namespace accessible from two different
controllers/paths then accessing head node of the shared namespace would
show the following output:

$ ls -l /sys/block/nvme1n1/multipath/
nvme1c1n1 -&gt; ../../../../../pci052e:78/052e:78:00.0/nvme/nvme1/nvme1c1n1
nvme1c3n1 -&gt; ../../../../../pci058e:78/058e:78:00.0/nvme/nvme3/nvme1c3n1

In the above example, nvme1n1 is head gendisk node created for a shared
namespace and this namespace is accessible from nvme1c1n1 and nvme1c3n1
paths. For numa io-policy we can then refer the "numa_nodes" attribute
file created under each namespace path:

$ cat /sys/block/nvme1n1/multipath/nvme1c1n1/numa_nodes
0-1

$ cat /sys/block/nvme1n1/multipath/nvme1c3n1/numa_nodes
2-3

&gt;From the above output, we infer that I/O workload targeted at nvme1n1
and running on numa nodes 0 and 1 would prefer using path nvme1c1n1.
Similarly, I/O workload running on numa nodes 2 and 3 would prefer
using path nvme1c3n1. Reading "numa_nodes" file when configured
io-policy is anything but numa would show no output.

Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Signed-off-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-multipath: Add visibility for round-robin io-policy</title>
<updated>2025-03-20T23:53:54+00:00</updated>
<author>
<name>Nilay Shroff</name>
<email>nilay@linux.ibm.com</email>
</author>
<published>2025-01-12T12:41:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4dbd2b2ebe4cc5f101881e2c091a70ccd38db7ee'/>
<id>urn:sha1:4dbd2b2ebe4cc5f101881e2c091a70ccd38db7ee</id>
<content type='text'>
This patch helps add nvme native multipath visibility for round-robin
io-policy. It creates a "multipath" sysfs directory under head gendisk
device node directory and then from "multipath" directory it adds a link
to each namespace path device the head node refers.

For instance, if we have a shared namespace accessible from two different
controllers/paths then we create a soft link to each path device from head
disk node as shown below:

$ ls -l /sys/block/nvme1n1/multipath/
nvme1c1n1 -&gt; ../../../../../pci052e:78/052e:78:00.0/nvme/nvme1/nvme1c1n1
nvme1c3n1 -&gt; ../../../../../pci058e:78/058e:78:00.0/nvme/nvme3/nvme1c3n1

In the above example, nvme1n1 is head gendisk node created for a shared
namespace and the namespace is accessible from nvme1c1n1 and nvme1c3n1
paths.

For round-robin I/O policy, we could easily infer from the above output
that I/O workload targeted to nvme1n1 would toggle across paths nvme1c1n1
and nvme1c3n1.

Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Signed-off-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-tcp: request secure channel concatenation</title>
<updated>2025-03-20T23:53:54+00:00</updated>
<author>
<name>Hannes Reinecke</name>
<email>hare@kernel.org</email>
</author>
<published>2025-02-24T12:38:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e88a7595b57f2a04f1be796419444b4a14a55d18'/>
<id>urn:sha1:e88a7595b57f2a04f1be796419444b4a14a55d18</id>
<content type='text'>
Add a fabrics option 'concat' to request secure channel concatenation as
specified the NVME Base Specification v2.1, section 8.3.4.3: Secure Channel
Concatenation.
When secure channel concatenation is enabled a 'generated PSK' is inserted
into the keyring such that it's available after reset.

Signed-off-by: Hannes Reinecke &lt;hare@kernel.org&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme: make nvme_tls_attrs_group static</title>
<updated>2025-01-31T18:14:27+00:00</updated>
<author>
<name>Keith Busch</name>
<email>kbusch@kernel.org</email>
</author>
<published>2025-01-28T15:22:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2d1a2dab95cdc6f2e0c6af3c0514b0bea94af482'/>
<id>urn:sha1:2d1a2dab95cdc6f2e0c6af3c0514b0bea94af482</id>
<content type='text'>
To suppress the compiler "warning: symbol 'nvme_tls_attrs_group' was not
declared. Should it be static?"

Fixes: 1e48b34c9bc79a ("nvme: split off TLS sysfs attributes into a separate group")
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme: null terminate nvme_tls_attrs</title>
<updated>2024-09-25T06:34:13+00:00</updated>
<author>
<name>Shin'ichiro Kawasaki</name>
<email>shinichiro.kawasaki@wdc.com</email>
</author>
<published>2024-09-24T09:01:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=83340d9c6178107df581c3ebbae0e28d0b15e879'/>
<id>urn:sha1:83340d9c6178107df581c3ebbae0e28d0b15e879</id>
<content type='text'>
Commit 1e48b34c9bc7 ("nvme: split off TLS sysfs attributes into a
separate group") introduced the struct attribute array nvme_tls_attrs.
However, the array was not null terminated and caused BUG KASAN global-
out-of-bounds. To avoid the BUG, null terminate the array.

Reported-by: Yi Zhang &lt;yi.zhang@redhat.com&gt;
Closes: https://lore.kernel.org/linux-nvme/jhllwfxcedrcxcnbajwl4x2l2ujcqowqcd4ps574zrafrqhjna@f4icvecutekm/
Fixes: 1e48b34c9bc7 ("nvme: split off TLS sysfs attributes into a separate group")
Signed-off-by: Shin'ichiro Kawasaki &lt;shinichiro.kawasaki@wdc.com&gt;
Tested-by: Yi Zhang &lt;yi.zhang@redhat.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-sysfs: add 'tls_keyring' attribute</title>
<updated>2024-08-22T20:25:11+00:00</updated>
<author>
<name>Hannes Reinecke</name>
<email>hare@kernel.org</email>
</author>
<published>2024-07-22T12:02:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=02a3688c53d648e027ca9c27423bf864a189d7c7'/>
<id>urn:sha1:02a3688c53d648e027ca9c27423bf864a189d7c7</id>
<content type='text'>
Add a 'tls_keyring' attribute to display the contents of the
--keyring option from the connect string. Adding this attribute
allows us to recreate the original connect string from sysfs
settings.

Signed-off-by: Hannes Reinecke &lt;hare@kernel.org&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
</content>
</entry>
</feed>
