<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/nvme/host, branch v4.14.263</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.14.263</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.14.263'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2021-09-22T09:45:17+00:00</updated>
<entry>
<title>nvme-rdma: don't update queue count when failing to set io queues</title>
<updated>2021-09-22T09:45:17+00:00</updated>
<author>
<name>Ruozhu Li</name>
<email>liruozhu@huawei.com</email>
</author>
<published>2021-07-28T09:41:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8ee21cd94d3d50e034661525a646b29dfdb9854b'/>
<id>urn:sha1:8ee21cd94d3d50e034661525a646b29dfdb9854b</id>
<content type='text'>
[ Upstream commit 85032874f80ba17bf187de1d14d9603bf3f582b8 ]

We update ctrl-&gt;queue_count and schedule another reconnect when io queue
count is zero.But we will never try to create any io queue in next reco-
nnection, because ctrl-&gt;queue_count already set to zero.We will end up
having an admin-only session in Live state, which is exactly what we try
to avoid in the original patch.
Update ctrl-&gt;queue_count after queue_count zero checking to fix it.

Signed-off-by: Ruozhu Li &lt;liruozhu@huawei.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-rdma: fix possible hang when failing to set io queues</title>
<updated>2021-03-24T10:05:01+00:00</updated>
<author>
<name>Sagi Grimberg</name>
<email>sagi@grimberg.me</email>
</author>
<published>2021-03-15T21:04:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2fd9191081769137d08b15e423cfbd2c3859c991'/>
<id>urn:sha1:2fd9191081769137d08b15e423cfbd2c3859c991</id>
<content type='text'>
[ Upstream commit c4c6df5fc84659690d4391d1fba155cd94185295 ]

We only setup io queues for nvme controllers, and it makes absolutely no
sense to allow a controller (re)connect without any I/O queues.  If we
happen to fail setting the queue count for any reason, we should not allow
this to be a successful reconnect as I/O has no chance in going through.
Instead just fail and schedule another reconnect.

Reported-by: Chao Leng &lt;lengchao@huawei.com&gt;
Fixes: 711023071960 ("nvme-rdma: add a NVMe over Fabrics RDMA host driver")
Signed-off-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Reviewed-by: Chao Leng &lt;lengchao@huawei.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs</title>
<updated>2021-02-10T08:12:09+00:00</updated>
<author>
<name>Thorsten Leemhuis</name>
<email>linux@leemhuis.info</email>
</author>
<published>2021-01-29T05:24:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=70a813ab11f647bcfccc6e535001423db9592d19'/>
<id>urn:sha1:70a813ab11f647bcfccc6e535001423db9592d19</id>
<content type='text'>
commit 538e4a8c571efdf131834431e0c14808bcfb1004 upstream.

Some Kingston A2000 NVMe SSDs sooner or later get confused and stop
working when they use the deepest APST sleep while running Linux. The
system then crashes and one has to cold boot it to get the SSD working
again.

Kingston seems to known about this since at least mid-September 2020:
https://bbs.archlinux.org/viewtopic.php?pid=1926994#p1926994

Someone working for a German company representing Kingston to the German
press confirmed to me Kingston engineering is aware of the issue and
investigating; the person stated that to their current knowledge only
the deepest APST sleep state causes trouble. Therefore, make Linux avoid
it for now by applying the NVME_QUIRK_NO_DEEPEST_PS to this SSD.

I have two such SSDs, but it seems the problem doesn't occur with them.
I hence couldn't verify if this patch really fixes the problem, but all
the data in front of me suggests it should.

This patch can easily be reverted or improved upon if a better solution
surfaces.

FWIW, there are many reports about the issue scattered around the web;
most of the users disabled APST completely to make things work, some
just made Linux avoid the deepest sleep state:

https://bugzilla.kernel.org/show_bug.cgi?id=195039#c65
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c73
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c74
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c78
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c79
https://bugzilla.kernel.org/show_bug.cgi?id=195039#c80
https://askubuntu.com/questions/1222049/nvmekingston-a2000-sometimes-stops-giving-response-in-ubuntu-18-04dell-inspir
https://community.acer.com/en/discussion/604326/m-2-nvme-ssd-aspire-517-51g-issue-compatibility-kingston-a2000-linux-ubuntu

For the record, some data from 'nvme id-ctrl /dev/nvme0'

NVME Identify Controller:
vid       : 0x2646
ssvid     : 0x2646
mn        : KINGSTON SA2000M81000G
fr        : S5Z42105
[...]
ps    0 : mp:9.00W operational enlat:0 exlat:0 rrt:0 rrl:0
          rwt:0 rwl:0 idle_power:- active_power:-
ps    1 : mp:4.60W operational enlat:0 exlat:0 rrt:1 rrl:1
          rwt:1 rwl:1 idle_power:- active_power:-
ps    2 : mp:3.80W operational enlat:0 exlat:0 rrt:2 rrl:2
          rwt:2 rwl:2 idle_power:- active_power:-
ps    3 : mp:0.0450W non-operational enlat:2000 exlat:2000 rrt:3 rrl:3
          rwt:3 rwl:3 idle_power:- active_power:-
ps    4 : mp:0.0040W non-operational enlat:15000 exlat:15000 rrt:4 rrl:4
          rwt:4 rwl:4 idle_power:- active_power:-

Cc: stable@vger.kernel.org # 4.14+
Signed-off-by: Thorsten Leemhuis &lt;linux@leemhuis.info&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>nvme: free sq/cq dbbuf pointers when dbbuf set fails</title>
<updated>2020-12-02T07:34:41+00:00</updated>
<author>
<name>Minwoo Im</name>
<email>minwoo.im.dev@gmail.com</email>
</author>
<published>2020-11-05T14:28:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=930bb3092fe606baa23d57ae59b70b291d67a8af'/>
<id>urn:sha1:930bb3092fe606baa23d57ae59b70b291d67a8af</id>
<content type='text'>
[ Upstream commit 0f0d2c876c96d4908a9ef40959a44bec21bdd6cf ]

If Doorbell Buffer Config command fails even 'dev-&gt;dbbuf_dbs != NULL'
which means OACS indicates that NVME_CTRL_OACS_DBBUF_SUPP is set,
nvme_dbbuf_update_and_check_event() will check event even it's not been
successfully set.

This patch fixes mismatch among dbbuf for sq/cqs in case that dbbuf
command fails.

Signed-off-by: Minwoo Im &lt;minwoo.im.dev@gmail.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-rdma: fix crash when connect rejected</title>
<updated>2020-11-05T10:06:58+00:00</updated>
<author>
<name>Chao Leng</name>
<email>lengchao@huawei.com</email>
</author>
<published>2020-10-12T08:10:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f1750073adfee2cc0c27440114f8f62a599d1aad'/>
<id>urn:sha1:f1750073adfee2cc0c27440114f8f62a599d1aad</id>
<content type='text'>
[ Upstream commit 43efdb8e870ee0f58633fd579aa5b5185bf5d39e ]

A crash can happened when a connect is rejected.   The host establishes
the connection after received ConnectReply, and then continues to send
the fabrics Connect command.  If the controller does not receive the
ReadyToUse capsule, host may receive a ConnectReject reply.

Call nvme_rdma_destroy_queue_ib after the host received the
RDMA_CM_EVENT_REJECTED event.  Then when the fabrics Connect command
times out, nvme_rdma_timeout calls nvme_rdma_complete_rq to fail the
request.  A crash happenes due to use after free in
nvme_rdma_complete_rq.

nvme_rdma_destroy_queue_ib is redundant when handling the
RDMA_CM_EVENT_REJECTED event as nvme_rdma_destroy_queue_ib is already
called in connection failure handler.

Signed-off-by: Chao Leng &lt;lengchao@huawei.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-fc: fail new connections to a deleted host or remote port</title>
<updated>2020-10-14T07:51:08+00:00</updated>
<author>
<name>James Smart</name>
<email>james.smart@broadcom.com</email>
</author>
<published>2020-09-17T20:33:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d32430f927a04c541df2a03b809e0b989feb5204'/>
<id>urn:sha1:d32430f927a04c541df2a03b809e0b989feb5204</id>
<content type='text'>
[ Upstream commit 9e0e8dac985d4bd07d9e62922b9d189d3ca2fccf ]

The lldd may have made calls to delete a remote port or local port and
the delete is in progress when the cli then attempts to create a new
controller. Currently, this proceeds without error although it can't be
very successful.

Fix this by validating that both the host port and remote port are
present when a new controller is to be created.

Signed-off-by: James Smart &lt;james.smart@broadcom.com&gt;
Reviewed-by: Himanshu Madhani &lt;himanshu.madhani@oracle.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-fc: cancel async events before freeing event struct</title>
<updated>2020-09-23T08:46:34+00:00</updated>
<author>
<name>David Milburn</name>
<email>dmilburn@redhat.com</email>
</author>
<published>2020-09-02T22:42:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=972998788aae86e12338f3b750984ad135ffac7a'/>
<id>urn:sha1:972998788aae86e12338f3b750984ad135ffac7a</id>
<content type='text'>
[ Upstream commit e126e8210e950bb83414c4f57b3120ddb8450742 ]

Cancel async event work in case async event has been queued up, and
nvme_fc_submit_async_event() runs after event has been freed.

Signed-off-by: David Milburn &lt;dmilburn@redhat.com&gt;
Reviewed-by: Keith Busch &lt;kbusch@kernel.org&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-fc: Fix wrong return value in __nvme_fc_init_request()</title>
<updated>2020-09-03T09:22:28+00:00</updated>
<author>
<name>Tianjia Zhang</name>
<email>tianjia.zhang@linux.alibaba.com</email>
</author>
<published>2020-08-02T11:15:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b94544a0ba3373e87a85861df6c7809a551b0e1f'/>
<id>urn:sha1:b94544a0ba3373e87a85861df6c7809a551b0e1f</id>
<content type='text'>
[ Upstream commit f34448cd0dc697723fb5f4118f8431d9233b370d ]

On an error exit path, a negative error code should be returned
instead of a positive return value.

Fixes: e399441de9115 ("nvme-fabrics: Add host support for FC transport")
Cc: James Smart &lt;jsmart2021@gmail.com&gt;
Signed-off-by: Tianjia Zhang &lt;tianjia.zhang@linux.alibaba.com&gt;
Reviewed-by: Chaitanya Kulkarni &lt;chaitanya.kulkarni@wdc.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme-rdma: assign completion vector correctly</title>
<updated>2020-07-22T07:22:16+00:00</updated>
<author>
<name>Max Gurtovoy</name>
<email>maxg@mellanox.com</email>
</author>
<published>2020-06-23T14:55:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c8b306dcc57625e7bcb310fc25e2a125356ec60f'/>
<id>urn:sha1:c8b306dcc57625e7bcb310fc25e2a125356ec60f</id>
<content type='text'>
[ Upstream commit 032a9966a22a3596addf81dacf0c1736dfedc32a ]

The completion vector index that is given during CQ creation can't
exceed the number of support vectors by the underlying RDMA device. This
violation currently can accure, for example, in case one will try to
connect with N regular read/write queues and M poll queues and the sum
of N + M &gt; num_supported_vectors. This will lead to failure in establish
a connection to remote target. Instead, in that case, share a completion
vector between queues.

Signed-off-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nvme: refine the Qemu Identify CNS quirk</title>
<updated>2020-06-20T08:25:11+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-04-04T08:11:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d5e0f8093735c16d25a626cf4b5b9d7f84144c8a'/>
<id>urn:sha1:d5e0f8093735c16d25a626cf4b5b9d7f84144c8a</id>
<content type='text'>
[ Upstream commit b9a5c3d4c34d8bd9fd75f7f28d18a57cb68da237 ]

Add a helper to check if we can use Identify CNS values &gt; 1, and refine
the Qemu quirk to not apply to reported versions larger than 1.1, as the
Qemu implementation had been fixed by then.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Keith Busch &lt;kbusch@kernel.org&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
