<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/nvme, branch v4.14.263</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.14.263</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.14.263'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2021-09-22T09:45:17+00:00</updated>
<entry>
<title>nvme-rdma: don't update queue count when failing to set io queues</title>
<updated>2021-09-22T09:45:17+00:00</updated>
<author>
<name>Ruozhu Li</name>
<email>liruozhu@huawei.com</email>
</author>
<published>2021-07-28T09:41:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8ee21cd94d3d50e034661525a646b29dfdb9854b'/>
<id>urn:sha1:8ee21cd94d3d50e034661525a646b29dfdb9854b</id>
<content type='text'>
[ Upstream commit 85032874f80ba17bf187de1d14d9603bf3f582b8 ]

We update ctrl-&gt;queue_count and schedule another reconnect when the io
queue count is zero. But we will never try to create any io queue in the
next reconnection, because ctrl-&gt;queue_count is already set to zero. We
will end up having an admin-only session in Live state, which is exactly
what the original patch tried to avoid.
Update ctrl-&gt;queue_count after the queue count zero check to fix it.
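
Below is a minimal userspace sketch of the ordering this change establishes;
the struct and function names are illustrative stand-ins for the reconnect
path, not the driver's actual symbols:

  #include &lt;stdio.h&gt;

  struct ctrl { int queue_count; };

  /* Model of the fixed ordering: check the freshly negotiated count
   * first, and only then store it, so a failed negotiation does not
   * clobber ctrl-&gt;queue_count and poison the next reconnect attempt. */
  static int set_io_queues(struct ctrl *c, int nr_io_queues)
  {
      if (nr_io_queues == 0) {
          fprintf(stderr, "unable to set any I/O queues, retrying\n");
          return -1;                          /* schedule another reconnect */
      }
      c-&gt;queue_count = nr_io_queues + 1;      /* admin queue + io queues */
      return 0;
  }

  int main(void)
  {
      struct ctrl c = { .queue_count = 5 };

      if (set_io_queues(&amp;c, 0))   /* transient failure: zero io queues */
          printf("queue_count stays %d for the next attempt\n", c.queue_count);
      return 0;
  }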

Signed-off-by: Ruozhu Li &lt;liruozhu@huawei.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-loop: check for NVME_LOOP_Q_LIVE in nvme_loop_destroy_admin_queue()</title>
<updated>2021-06-30T12:48:47+00:00</updated>
<author>
<name>Hannes Reinecke</name>
<email>hare@suse.de</email>
</author>
<published>2021-05-26T15:23:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4b2e0ac757bc0deb1f1115ea84ebb6928bbd9c55'/>
<id>urn:sha1:4b2e0ac757bc0deb1f1115ea84ebb6928bbd9c55</id>
<content type='text'>
[ Upstream commit 4237de2f73a669e4f89ac0aa2b44fb1a1d9ec583 ]

We need to check the NVME_LOOP_Q_LIVE flag in
nvme_loop_destroy_admin_queue() to protect against duplicate
invocations, e.g. during concurrent reset and remove calls.
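
A small userspace model of the duplicate-invocation guard; in the driver this
is a test on the NVME_LOOP_Q_LIVE bit, while the plain flag and names below
only illustrate the control flow:

  #include &lt;stdbool.h&gt;
  #include &lt;stdio.h&gt;

  struct loop_ctrl { bool admin_q_live; };

  /* Only the caller that finds the LIVE flag still set proceeds with the
   * teardown, so reset and remove cannot both destroy the admin queue. */
  static void destroy_admin_queue(struct loop_ctrl *ctrl)
  {
      if (!ctrl-&gt;admin_q_live)
          return;                     /* already torn down by the other path */
      ctrl-&gt;admin_q_live = false;
      printf("admin queue destroyed\n");
  }

  int main(void)
  {
      struct loop_ctrl ctrl = { .admin_q_live = true };

      destroy_admin_queue(&amp;ctrl);     /* e.g. the reset path */
      destroy_admin_queue(&amp;ctrl);     /* e.g. the remove path: now a no-op */
      return 0;
  }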

Signed-off-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Chaitanya Kulkarni &lt;chaitanya.kulkarni@wdc.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-loop: clear NVME_LOOP_Q_LIVE when nvme_loop_configure_admin_queue() fails</title>
<updated>2021-06-30T12:48:46+00:00</updated>
<author>
<name>Hannes Reinecke</name>
<email>hare@suse.de</email>
</author>
<published>2021-05-26T15:23:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=089970144e0390bd3a246b5305966045bca14891'/>
<id>urn:sha1:089970144e0390bd3a246b5305966045bca14891</id>
<content type='text'>
[ Upstream commit 1c5f8e882a05de5c011e8c3fbeceb0d1c590eb53 ]

When the call to nvme_enable_ctrl() in nvme_loop_configure_admin_queue()
fails, the NVME_LOOP_Q_LIVE flag is not cleared.
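
A minimal sketch of the corresponding error path; enable_ctrl() below is a
stub standing in for nvme_enable_ctrl(), and all names are illustrative:

  #include &lt;errno.h&gt;
  #include &lt;stdbool.h&gt;
  #include &lt;stdio.h&gt;

  struct loop_ctrl { bool admin_q_live; };

  static int enable_ctrl(struct loop_ctrl *ctrl) { (void)ctrl; return -EIO; }

  static int configure_admin_queue(struct loop_ctrl *ctrl)
  {
      int ret;

      ctrl-&gt;admin_q_live = true;          /* set while bringing the queue up */
      ret = enable_ctrl(ctrl);
      if (ret) {
          ctrl-&gt;admin_q_live = false;     /* the fix: clear the flag on failure */
          return ret;
      }
      return 0;
  }

  int main(void)
  {
      struct loop_ctrl ctrl = { false };

      if (configure_admin_queue(&amp;ctrl))
          printf("configure failed, live flag is %d\n", ctrl.admin_q_live);
      return 0;
  }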

Signed-off-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Chaitanya Kulkarni &lt;chaitanya.kulkarni@wdc.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-loop: reset queue count to 1 in nvme_loop_destroy_io_queues()</title>
<updated>2021-06-30T12:48:46+00:00</updated>
<author>
<name>Hannes Reinecke</name>
<email>hare@suse.de</email>
</author>
<published>2021-05-26T15:23:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=41e4946a5b4405b264243d2119a0aee313d0c1af'/>
<id>urn:sha1:41e4946a5b4405b264243d2119a0aee313d0c1af</id>
<content type='text'>
[ Upstream commit a6c144f3d2e230f2b3ac5ed8c51e0f0391556197 ]

The queue count is increased in nvme_loop_init_io_queues(), so we
need to reset it to 1 at the end of nvme_loop_destroy_io_queues().
Otherwise the function is not re-entrant safe, and a crash will happen
during concurrent reset and remove calls.
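
A userspace sketch of the re-entrancy property described above (the count of
1 accounts for the admin queue); the names are illustrative:

  #include &lt;stdio.h&gt;

  struct loop_ctrl { int queue_count; };

  static void init_io_queues(struct loop_ctrl *ctrl, int nr_io_queues)
  {
      ctrl-&gt;queue_count = 1 + nr_io_queues;   /* admin queue + io queues */
  }

  /* After tearing the io queues down, drop the count back to 1 so a
   * second, concurrent teardown finds nothing left to destroy instead
   * of walking queues that no longer exist. */
  static void destroy_io_queues(struct loop_ctrl *ctrl)
  {
      for (int i = ctrl-&gt;queue_count - 1; i &gt;= 1; i--)
          printf("destroying io queue %d\n", i);
      ctrl-&gt;queue_count = 1;
  }

  int main(void)
  {
      struct loop_ctrl ctrl;

      init_io_queues(&amp;ctrl, 4);
      destroy_io_queues(&amp;ctrl);   /* e.g. reset */
      destroy_io_queues(&amp;ctrl);   /* e.g. remove: the loop body never runs */
      return 0;
  }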

Signed-off-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Chaitanya Kulkarni &lt;chaitanya.kulkarni@wdc.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-rdma: fix possible hang when failing to set io queues</title>
<updated>2021-03-24T10:05:01+00:00</updated>
<author>
<name>Sagi Grimberg</name>
<email>sagi@grimberg.me</email>
</author>
<published>2021-03-15T21:04:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2fd9191081769137d08b15e423cfbd2c3859c991'/>
<id>urn:sha1:2fd9191081769137d08b15e423cfbd2c3859c991</id>
<content type='text'>
[ Upstream commit c4c6df5fc84659690d4391d1fba155cd94185295 ]

We only set up io queues for nvme controllers, and it makes absolutely no
sense to allow a controller (re)connect without any I/O queues.  If we
happen to fail setting the queue count for any reason, we should not treat
this as a successful reconnect, as I/O has no chance of going through.
Instead, just fail and schedule another reconnect.
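
A minimal sketch of the behaviour this patch enforces, with illustrative
names: a zero io-queue count is propagated as an error so the caller
schedules another reconnect instead of reporting a queue-less connect as
success:

  #include &lt;errno.h&gt;
  #include &lt;stdio.h&gt;

  static int configure_io_queues(int nr_io_queues)
  {
      if (nr_io_queues == 0) {
          fprintf(stderr, "unable to set any I/O queues\n");
          return -ENOMEM;             /* fail, so the reconnect is retried */
      }
      printf("allocating %d io queues\n", nr_io_queues);
      return 0;
  }

  int main(void)
  {
      /* Before the fix a zero count was treated as a successful reconnect;
       * now the connect attempt fails and is scheduled again. */
      return configure_io_queues(0) ? 1 : 0;
  }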

Reported-by: Chao Leng &lt;lengchao@huawei.com&gt;
Fixes: 711023071960 ("nvme-rdma: add a NVMe over Fabrics RDMA host driver")
Signed-off-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Reviewed-by: Chao Leng &lt;lengchao@huawei.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvmet: don't check iosqes,iocqes for discovery controllers</title>
<updated>2021-03-24T10:05:00+00:00</updated>
<author>
<name>Sagi Grimberg</name>
<email>sagi@grimberg.me</email>
</author>
<published>2021-03-15T22:34:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=09537286477e32b47bd920d810f8056989bd04da'/>
<id>urn:sha1:09537286477e32b47bd920d810f8056989bd04da</id>
<content type='text'>
commit d218a8a3003e84ab136e69a4e30dd4ec7dab2d22 upstream.

From the base spec, Figure 78:

  "Controller Configuration, these fields are defined as parameters to
   configure an "I/O Controller (IOC)" and not to configure a "Discovery
   Controller (DC).

   ...
   If the controller does not support I/O queues, then this field shall
   be read-only with a value of 0h

Just perform this check for I/O controllers.
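
A small userspace model of the relaxed check; the required entry sizes match
the NVMe-defined I/O SQ/CQ entry sizes (64 and 16 bytes), and the names below
are illustrative:

  #include &lt;stdbool.h&gt;
  #include &lt;stdio.h&gt;

  #define IOSQES_REQUIRED 6   /* 2^6 = 64-byte I/O SQ entries */
  #define IOCQES_REQUIRED 4   /* 2^4 = 16-byte I/O CQ entries */

  struct target_ctrl { bool is_discovery; unsigned iosqes, iocqes; };

  /* Discovery controllers have no I/O queues, so the iosqes/iocqes
   * fields of Controller Configuration are only validated for I/O
   * controllers. */
  static bool cc_queue_sizes_valid(const struct target_ctrl *ctrl)
  {
      if (ctrl-&gt;is_discovery)
          return true;
      return ctrl-&gt;iosqes == IOSQES_REQUIRED &amp;&amp;
             ctrl-&gt;iocqes == IOCQES_REQUIRED;
  }

  int main(void)
  {
      struct target_ctrl dc = { .is_discovery = true };   /* fields left 0h */

      printf("discovery controller enable %s\n",
             cc_queue_sizes_valid(&amp;dc) ? "accepted" : "rejected");
      return 0;
  }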

Fixes: a07b4970f464 ("nvmet: add a generic NVMe target")
Reported-by: Belanger, Martin &lt;Martin.Belanger@dell.com&gt;
Signed-off-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Reviewed-by: Chaitanya Kulkarni &lt;chaitanya.kulkarni@wdc.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs</title>
<updated>2021-02-10T08:12:09+00:00</updated>
<author>
<name>Thorsten Leemhuis</name>
<email>linux@leemhuis.info</email>
</author>
<published>2021-01-29T05:24:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=70a813ab11f647bcfccc6e535001423db9592d19'/>
<id>urn:sha1:70a813ab11f647bcfccc6e535001423db9592d19</id>
<content type='text'>
commit 538e4a8c571efdf131834431e0c14808bcfb1004 upstream.

Some Kingston A2000 NVMe SSDs sooner or later get confused and stop
working when they use the deepest APST sleep while running Linux. The
system then crashes and one has to cold boot it to get the SSD working
again.

Kingston seems to have known about this since at least mid-September 2020:
https://bbs.archlinux.org/viewtopic.php?pid=1926994#p1926994

Someone working for a German company representing Kingston to the German
press confirmed to me that Kingston engineering is aware of the issue and
is investigating; the person stated that, to their current knowledge, only
the deepest APST sleep state causes trouble. Therefore, make Linux avoid
it for now by applying the NVME_QUIRK_NO_DEEPEST_PS quirk to this SSD.
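
A userspace sketch of how such a quirk feeds into APST configuration; the
vendor id 0x2646 comes from the id-ctrl dump below, while the device id and
flag value here are placeholders, not the driver's actual table entry:

  #include &lt;stdio.h&gt;

  #define QUIRK_NO_DEEPEST_PS (1u &lt;&lt; 0)   /* illustrative flag value */

  struct quirk_entry { unsigned short vid, did; unsigned quirks; };

  static const struct quirk_entry quirks[] = {
      { 0x2646, 0x0000, QUIRK_NO_DEEPEST_PS },    /* device id: placeholder */
  };

  static unsigned lookup_quirks(unsigned short vid, unsigned short did)
  {
      for (size_t i = 0; i &lt; sizeof(quirks) / sizeof(quirks[0]); i++)
          if (quirks[i].vid == vid &amp;&amp; quirks[i].did == did)
              return quirks[i].quirks;
      return 0;
  }

  int main(void)
  {
      int nr_power_states = 5;    /* ps 0..4, as in the dump below */
      unsigned q = lookup_quirks(0x2646, 0x0000);

      /* With the quirk set, APST setup stops one state short of ps 4. */
      int deepest = nr_power_states - 1 - !!(q &amp; QUIRK_NO_DEEPEST_PS);
      printf("deepest APST state allowed: ps %d\n", deepest);
      return 0;
  }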

I have two such SSDs, but it seems the problem doesn't occur with them.
I hence couldn't verify if this patch really fixes the problem, but all
the data in front of me suggests it should.

This patch can easily be reverted or improved upon if a better solution
surfaces.

FWIW, there are many reports about the issue scattered around the web;
most of the users disabled APST completely to make things work, some
just made Linux avoid the deepest sleep state:

https://bugzilla.kernel.org/show_bug.cgi?id=195039#c65
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c73
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c74
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c78
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c79
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c80
https://askubuntu.com/questions/1222049/nvmekingston-a2000-sometimes-stops-giving-response-in-ubuntu-18-04dell-inspir
https://community.acer.com/en/discussion/604326/m-2-nvme-ssd-aspire-517-51g-issue-compatibility-kingston-a2000-linux-ubuntu

For the record, some data from 'nvme id-ctrl /dev/nvme0'

NVME Identify Controller:
vid       : 0x2646
ssvid     : 0x2646
mn        : KINGSTON SA2000M81000G
fr        : S5Z42105
[...]
ps    0 : mp:9.00W operational enlat:0 exlat:0 rrt:0 rrl:0
          rwt:0 rwl:0 idle_power:- active_power:-
ps    1 : mp:4.60W operational enlat:0 exlat:0 rrt:1 rrl:1
          rwt:1 rwl:1 idle_power:- active_power:-
ps    2 : mp:3.80W operational enlat:0 exlat:0 rrt:2 rrl:2
          rwt:2 rwl:2 idle_power:- active_power:-
ps    3 : mp:0.0450W non-operational enlat:2000 exlat:2000 rrt:3 rrl:3
          rwt:3 rwl:3 idle_power:- active_power:-
ps    4 : mp:0.0040W non-operational enlat:15000 exlat:15000 rrt:4 rrl:4
          rwt:4 rwl:4 idle_power:- active_power:-

Cc: stable@vger.kernel.org # 4.14+
Signed-off-by: Thorsten Leemhuis &lt;linux@leemhuis.info&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>nvme: free sq/cq dbbuf pointers when dbbuf set fails</title>
<updated>2020-12-02T07:34:41+00:00</updated>
<author>
<name>Minwoo Im</name>
<email>minwoo.im.dev@gmail.com</email>
</author>
<published>2020-11-05T14:28:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=930bb3092fe606baa23d57ae59b70b291d67a8af'/>
<id>urn:sha1:930bb3092fe606baa23d57ae59b70b291d67a8af</id>
<content type='text'>
[ Upstream commit 0f0d2c876c96d4908a9ef40959a44bec21bdd6cf ]

If the Doorbell Buffer Config command fails even though
'dev-&gt;dbbuf_dbs != NULL', which means OACS indicates that
NVME_CTRL_OACS_DBBUF_SUPP is set, nvme_dbbuf_update_and_check_event()
will check the event even though it has not been successfully set up.

This patch fixes the mismatch among the sq/cq dbbuf pointers in case the
dbbuf command fails.
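
A userspace model of the cleanup this patch adds; the allocation and the
failing command are stand-ins, and the names are illustrative:

  #include &lt;stdio.h&gt;
  #include &lt;stdlib.h&gt;

  struct dev { unsigned *dbbuf_dbs, *dbbuf_eis; };

  static void dbbuf_dma_free(struct dev *d)
  {
      free(d-&gt;dbbuf_dbs);  d-&gt;dbbuf_dbs = NULL;
      free(d-&gt;dbbuf_eis);  d-&gt;dbbuf_eis = NULL;
  }

  /* If the Doorbell Buffer Config command fails, free the buffers again
   * so dbbuf_dbs goes back to NULL and the completion path never
   * consults a shadow doorbell the device was never told about. */
  static void dbbuf_set(struct dev *d, int cmd_status)
  {
      if (cmd_status) {
          fprintf(stderr, "unable to set dbbuf: %d\n", cmd_status);
          dbbuf_dma_free(d);
      }
  }

  int main(void)
  {
      struct dev d = { calloc(64, 4), calloc(64, 4) };

      dbbuf_set(&amp;d, -5);          /* pretend the admin command failed */
      printf("dbs=%p eis=%p\n", (void *)d.dbbuf_dbs, (void *)d.dbbuf_eis);
      return 0;
  }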

Signed-off-by: Minwoo Im &lt;minwoo.im.dev@gmail.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-rdma: fix crash when connect rejected</title>
<updated>2020-11-05T10:06:58+00:00</updated>
<author>
<name>Chao Leng</name>
<email>lengchao@huawei.com</email>
</author>
<published>2020-10-12T08:10:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f1750073adfee2cc0c27440114f8f62a599d1aad'/>
<id>urn:sha1:f1750073adfee2cc0c27440114f8f62a599d1aad</id>
<content type='text'>
[ Upstream commit 43efdb8e870ee0f58633fd579aa5b5185bf5d39e ]

A crash can happen when a connect is rejected.  The host establishes the
connection after receiving a ConnectReply, and then continues to send the
fabrics Connect command.  If the controller does not receive the
ReadyToUse capsule, the host may receive a ConnectReject reply.

nvme_rdma_destroy_queue_ib is called after the host receives the
RDMA_CM_EVENT_REJECTED event.  Then, when the fabrics Connect command
times out, nvme_rdma_timeout calls nvme_rdma_complete_rq to fail the
request.  A crash happens due to a use-after-free in
nvme_rdma_complete_rq.

Calling nvme_rdma_destroy_queue_ib when handling the
RDMA_CM_EVENT_REJECTED event is redundant, as nvme_rdma_destroy_queue_ib
is already called in the connection failure handler.
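
A compact userspace model of the resulting flow, with illustrative names:
the REJECTED branch only reports the error, and the IB resources are torn
down in one place, by the connection-failure handling:

  #include &lt;errno.h&gt;
  #include &lt;stdio.h&gt;

  enum cm_event { EVT_ESTABLISHED, EVT_REJECTED, EVT_DISCONNECTED };

  static void destroy_queue_ib(void) { printf("queue IB resources freed\n"); }

  static int cm_handler(enum cm_event evt)
  {
      switch (evt) {
      case EVT_REJECTED:
          return -ECONNRESET;     /* no extra destroy_queue_ib() here */
      case EVT_DISCONNECTED:
          return -ENOTCONN;
      default:
          return 0;
      }
  }

  /* Single teardown point: a later Connect timeout therefore cannot run
   * into queue state that was already freed by the event handler. */
  static void connect_failure(int err)
  {
      if (err)
          destroy_queue_ib();
  }

  int main(void)
  {
      connect_failure(cm_handler(EVT_REJECTED));
      return 0;
  }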

Signed-off-by: Chao Leng &lt;lengchao@huawei.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvmet: fix uninitialized work for zero kato</title>
<updated>2020-10-29T08:07:13+00:00</updated>
<author>
<name>zhenwei pi</name>
<email>pizhenwei@bytedance.com</email>
</author>
<published>2020-10-15T01:51:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=477dfe7d985a2c21e3e55c5bf5989a629f296ae1'/>
<id>urn:sha1:477dfe7d985a2c21e3e55c5bf5989a629f296ae1</id>
<content type='text'>
[ Upstream commit 85bd23f3dc09a2ae9e56885420e52c54bf983713 ]

When connecting a controller with a zero kato value using the following
command line

   nvme connect -t tcp -n NQN -a ADDR -s PORT --keep-alive-tmo=0

the warning below can be reproduced:

WARNING: CPU: 1 PID: 241 at kernel/workqueue.c:1627 __queue_delayed_work+0x6d/0x90
with trace:
  mod_delayed_work_on+0x59/0x90
  nvmet_update_cc+0xee/0x100 [nvmet]
  nvmet_execute_prop_set+0x72/0x80 [nvmet]
  nvmet_tcp_try_recv_pdu+0x2f7/0x770 [nvmet_tcp]
  nvmet_tcp_io_work+0x63f/0xb2d [nvmet_tcp]
  ...

This is caused by queuing up an uninitialized work item.  Although the
keep-alive timer is disabled while allocating the controller (fixed in
0d3b6a8d213a), ka_work still has a chance to run (called by
nvmet_start_ctrl).
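
A userspace model of the resulting behaviour, with illustrative names: the
keep-alive work is initialized unconditionally when the controller is
allocated, and kato is only checked when deciding whether to queue it:

  #include &lt;stdio.h&gt;

  struct delayed_work { void (*fn)(void); };
  struct target_ctrl { unsigned kato; struct delayed_work ka_work; };

  static void keep_alive_timer(void) { printf("keep-alive timer fired\n"); }

  /* Initialize ka_work at allocation time, regardless of kato, so later
   * paths can never queue a work item whose function was never set. */
  static void alloc_ctrl(struct target_ctrl *ctrl, unsigned kato)
  {
      ctrl-&gt;kato = kato;
      ctrl-&gt;ka_work.fn = keep_alive_timer;
  }

  static void start_keep_alive_timer(struct target_ctrl *ctrl)
  {
      if (!ctrl-&gt;kato)
          return;                 /* kato == 0: keep-alive stays disabled */
      ctrl-&gt;ka_work.fn();         /* stands in for queuing the work */
  }

  int main(void)
  {
      struct target_ctrl ctrl;

      alloc_ctrl(&amp;ctrl, 0);             /* --keep-alive-tmo=0 */
      start_keep_alive_timer(&amp;ctrl);    /* safely a no-op */
      return 0;
  }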

Fixes: 0d3b6a8d213a ("nvmet: Disable keep-alive timer when kato is cleared to 0h")
Signed-off-by: zhenwei pi &lt;pizhenwei@bytedance.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
