<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/nvme, branch v6.6.39</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.39</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.39'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-07-11T10:49:21+00:00</updated>
<entry>
<title>nvmet: fix a possible leak when destroy a ctrl during qp establishment</title>
<updated>2024-07-11T10:49:21+00:00</updated>
<author>
<name>Sagi Grimberg</name>
<email>sagi@grimberg.me</email>
</author>
<published>2024-05-27T19:38:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5502c1f1d0d7472706cc1f201aecf1c935d302d1'/>
<id>urn:sha1:5502c1f1d0d7472706cc1f201aecf1c935d302d1</id>
<content type='text'>
[ Upstream commit c758b77d4a0a0ed3a1292b3fd7a2aeccd1a169a4 ]

In nvmet_sq_destroy we capture sq-&gt;ctrl early and if it is non-NULL we
know that a ctrl was allocated (in the admin connect request handler)
and we need to release pending AERs, clear ctrl-&gt;sqs and sq-&gt;ctrl
(for nvme-loop primarily), and drop the final reference on the ctrl.

However, a small window is possible where nvmet_sq_destroy starts (as
a result of the client giving up and disconnecting) concurrently with
the nvme admin connect cmd (which may be in an early stage). But *before*
kill_and_confirm of sq-&gt;ref (i.e. the admin connect managed to get an sq
live reference). In this case, sq-&gt;ctrl was allocated however after it was
captured in a local variable in nvmet_sq_destroy.
This prevented the final reference drop on the ctrl.

Solve this by re-capturing the sq-&gt;ctrl after all inflight request has
completed, where for sure sq-&gt;ctrl reference is final, and move forward
based on that.

This issue was observed in an environment with many hosts connecting
multiple ctrls simoutanuosly, creating a delay in allocating a ctrl
leading up to this race window.

Reported-by: Alex Turin &lt;alex@vastdata.com&gt;
Signed-off-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme: adjust multiples of NVME_CTRL_PAGE_SIZE in offset</title>
<updated>2024-07-11T10:49:20+00:00</updated>
<author>
<name>Kundan Kumar</name>
<email>kundan.kumar@samsung.com</email>
</author>
<published>2024-05-23T11:31:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b955b47905ed22e447b1958cfc0637ef8de7a657'/>
<id>urn:sha1:b955b47905ed22e447b1958cfc0637ef8de7a657</id>
<content type='text'>
[ Upstream commit 1bd293fcf3af84674e82ed022c049491f3768840 ]

bio_vec start offset may be relatively large particularly when large
folio gets added to the bio. A bigger offset will result in avoiding the
single-segment mapping optimization and end up using expensive
mempool_alloc further.

Rather than using absolute value, adjust bv_offset by
NVME_CTRL_PAGE_SIZE while checking if segment can be fitted into one/two
PRP entries.

Suggested-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Kundan Kumar &lt;kundan.kumar@samsung.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-multipath: find NUMA path only for online numa-node</title>
<updated>2024-07-11T10:49:20+00:00</updated>
<author>
<name>Nilay Shroff</name>
<email>nilay@linux.ibm.com</email>
</author>
<published>2024-05-16T12:13:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e6e1eda06b70d087230ffaa4a904c3e8d6ece87d'/>
<id>urn:sha1:e6e1eda06b70d087230ffaa4a904c3e8d6ece87d</id>
<content type='text'>
[ Upstream commit d3a043733f25d743f3aa617c7f82dbcb5ee2211a ]

In current native multipath design when a shared namespace is created,
we loop through each possible numa-node, calculate the NUMA distance of
that node from each nvme controller and then cache the optimal IO path
for future reference while sending IO. The issue with this design is that
we may refer to the NUMA distance table for an offline node which may not
be populated at the time and so we may inadvertently end up finding and
caching a non-optimal path for IO. Then latter when the corresponding
numa-node becomes online and hence the NUMA distance table entry for that
node is created, ideally we should re-calculate the multipath node distance
for the newly added node however that doesn't happen unless we rescan/reset
the controller. So essentially, we may keep using non-optimal IO path for a
node which is made online after namespace is created.
This patch helps fix this issue ensuring that when a shared namespace is
created, we calculate the multipath node distance for each online numa-node
instead of each possible numa-node. Then latter when a node becomes online
and we receive any IO on that newly added node, we would calculate the
multipath node distance for newly added node but this time NUMA distance
table would have been already populated for newly added node. Hence we
would be able to correctly calculate the multipath node distance and choose
the optimal path for the IO.

Signed-off-by: Nilay Shroff &lt;nilay@linux.ibm.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvmet-passthru: propagate status from id override functions</title>
<updated>2024-06-21T12:38:35+00:00</updated>
<author>
<name>Daniel Wagner</name>
<email>dwagner@suse.de</email>
</author>
<published>2024-06-12T14:02:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9a3eb4816ab9af25dd2357783e591ef66d5fe616'/>
<id>urn:sha1:9a3eb4816ab9af25dd2357783e591ef66d5fe616</id>
<content type='text'>
[ Upstream commit d76584e53f4244dbc154bec447c3852600acc914 ]

The id override functions return a status which is not propagated to the
caller.

Fixes: c1fef73f793b ("nvmet: add passthru code to process commands")
Signed-off-by: Daniel Wagner &lt;dwagner@suse.de&gt;
Reviewed-by: Chaitanya Kulkarni &lt;kch@nvidia.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme: fix nvme_pr_* status code parsing</title>
<updated>2024-06-21T12:38:29+00:00</updated>
<author>
<name>Weiwen Hu</name>
<email>huweiwen@linux.alibaba.com</email>
</author>
<published>2024-05-30T06:16:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ca060e25579457d0fedf92f7ac8cac8c28e307ac'/>
<id>urn:sha1:ca060e25579457d0fedf92f7ac8cac8c28e307ac</id>
<content type='text'>
[ Upstream commit b1a1fdd7096dd2d67911b07f8118ff113d815db4 ]

Fix the parsing if extra status bits (e.g. MORE) is present.

Fixes: 7fb42780d06c ("nvme: Convert NVMe errors to PR errors")
Signed-off-by: Weiwen Hu &lt;huweiwen@linux.alibaba.com&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvmet: fix ns enable/disable possible hang</title>
<updated>2024-06-12T09:12:53+00:00</updated>
<author>
<name>Sagi Grimberg</name>
<email>sagi@grimberg.me</email>
</author>
<published>2024-05-21T20:20:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ca3b4293dccaa29e879df64809f08f8e6db69837'/>
<id>urn:sha1:ca3b4293dccaa29e879df64809f08f8e6db69837</id>
<content type='text'>
[ Upstream commit f97914e35fd98b2b18fb8a092e0a0799f73afdfe ]

When disabling an nvmet namespace, there is a period where the
subsys-&gt;lock is released, as the ns disable waits for backend IO to
complete, and the ns percpu ref to be properly killed. The original
intent was to avoid taking the subsystem lock for a prolong period as
other processes may need to acquire it (for example new incoming
connections).

However, it opens up a window where another process may come in and
enable the ns, (re)intiailizing the ns percpu_ref, causing the disable
sequence to hang.

Solve this by taking the global nvmet_config_sem over the entire configfs
enable/disable sequence.

Fixes: a07b4970f464 ("nvmet: add a generic NVMe target")
Signed-off-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Chaitanya Kulkarni &lt;kch@nvidia.com&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-multipath: fix io accounting on failover</title>
<updated>2024-06-12T09:12:53+00:00</updated>
<author>
<name>Keith Busch</name>
<email>kbusch@kernel.org</email>
</author>
<published>2024-05-21T18:02:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=36989c682505d872ee618f63fa551d0c7aa67317'/>
<id>urn:sha1:36989c682505d872ee618f63fa551d0c7aa67317</id>
<content type='text'>
[ Upstream commit a2e4c5f5f68dbd206f132bc709b98dea64afc3b8 ]

There are io stats accounting that needs to be handled, so don't call
blk_mq_end_request() directly. Use the existing nvme_end_req() helper
that already handles everything.

Fixes: d4d957b53d91ee ("nvme-multipath: support io stats on the mpath device")
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvmet: prevent sprintf() overflow in nvmet_subsys_nsid_exists()</title>
<updated>2024-06-12T09:11:30+00:00</updated>
<author>
<name>Dan Carpenter</name>
<email>dan.carpenter@linaro.org</email>
</author>
<published>2024-05-08T07:43:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=396bc5e54b4fd26d1ca06045452c089e9672a8fc'/>
<id>urn:sha1:396bc5e54b4fd26d1ca06045452c089e9672a8fc</id>
<content type='text'>
[ Upstream commit d15dcd0f1a4753b57e66c64c8dc2a9779ff96aab ]

The nsid value is a u32 that comes from nvmet_req_find_ns().  It's
endian data and we're on an error path and both of those raise red
flags.  So let's make this safer.

1) Make the buffer large enough for any u32.
2) Remove the unnecessary initialization.
3) Use snprintf() instead of sprintf() for even more safety.
4) The sprintf() function returns the number of bytes printed, not
   counting the NUL terminator. It is impossible for the return value to
   be &lt;= 0 so delete that.

Fixes: 505363957fad ("nvmet: fix nvme status code when namespace is disabled")
Signed-off-by: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvmet: fix nvme status code when namespace is disabled</title>
<updated>2024-06-12T09:11:30+00:00</updated>
<author>
<name>Sagi Grimberg</name>
<email>sagi@grimberg.me</email>
</author>
<published>2024-04-28T09:25:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=71de5fc303a7cb1ab67bb1d50a6aac06841cf9cf'/>
<id>urn:sha1:71de5fc303a7cb1ab67bb1d50a6aac06841cf9cf</id>
<content type='text'>
[ Upstream commit 505363957fad35f7aed9a2b0d8dad73451a80fb5 ]

If the user disabled a nvmet namespace, it is removed from the subsystem
namespaces list. When nvmet processes a command directed to an nsid that
was disabled, it cannot differentiate between a nsid that is disabled
vs. a non-existent namespace, and resorts to return NVME_SC_INVALID_NS
with the dnr bit set.

This translates to a non-retryable status for the host, which translates
to a user error. We should expect disabled namespaces to not cause an
I/O error in a multipath environment.

Address this by searching a configfs item for the namespace nvmet failed
to find, and if we found one, conclude that the namespace is disabled
(perhaps temporarily). Return NVME_SC_INTERNAL_PATH_ERROR in this case
and keep DNR bit cleared.

Reported-by: Jirong Feng &lt;jirong.feng@easystack.cn&gt;
Tested-by: Jirong Feng &lt;jirong.feng@easystack.cn&gt;
Signed-off-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvmet-tcp: fix possible memory leak when tearing down a controller</title>
<updated>2024-06-12T09:11:30+00:00</updated>
<author>
<name>Sagi Grimberg</name>
<email>sagi@grimberg.me</email>
</author>
<published>2024-04-28T08:49:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ae451994ba9c0390ebca4b8bd90a64aabad60396'/>
<id>urn:sha1:ae451994ba9c0390ebca4b8bd90a64aabad60396</id>
<content type='text'>
[ Upstream commit 6825bdde44340c5a9121f6d6fa25cc885bd9e821 ]

When we teardown the controller, we wait for pending I/Os to complete
(sq-&gt;ref on all queues to drop to zero) and then we go over the commands,
and free their command buffers in case they are still fetching data from
the host (e.g. processing nvme writes) and have yet to take a reference
on the sq.

However, we may miss the case where commands have failed before executing
and are queued for sending a response, but will never occur because the
queue socket is already down. In this case we may miss deallocating command
buffers.

Solve this by freeing all commands buffers as nvmet_tcp_free_cmd_buffers is
idempotent anyways.

Reported-by: Yi Zhang &lt;yi.zhang@redhat.com&gt;
Tested-by: Yi Zhang &lt;yi.zhang@redhat.com&gt;
Signed-off-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
