<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/nvme, branch v4.14.85</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.14.85</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.14.85'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2018-11-21T08:24:17+00:00</updated>
<entry>
<title>nvme-loop: fix kernel oops in case of unhandled command</title>
<updated>2018-11-21T08:24:17+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2018-04-12T15:16:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a8ba76fc043e1a34e98055f6e818c67ddc3c05c2'/>
<id>urn:sha1:a8ba76fc043e1a34e98055f6e818c67ddc3c05c2</id>
<content type='text'>
commit 11d9ea6f2ca69237d35d6c55755beba3e006b106 upstream.

When nvmet_req_init() fails, __nvmet_req_complete() is called
to handle the target request via .queue_response(), so
nvme_loop_queue_response() shouldn't be called again for
handling the failure.

This patch fixes this case by the following way:

- move blk_mq_start_request() before nvmet_req_init(), so
nvme_loop_queue_response() may work well to complete this
host request

- don't call nvme_cleanup_cmd() which is done in nvme_loop_complete_rq()

- don't call nvme_loop_queue_response() which is done via
.queue_response()

Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
[trimmed changelog]
Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sudip Mukherjee &lt;sudipm.mukherjee@gmail.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>nvme_fc: fix ctrl create failures racing with workq items</title>
<updated>2018-10-13T07:27:28+00:00</updated>
<author>
<name>James Smart</name>
<email>jsmart2021@gmail.com</email>
</author>
<published>2018-03-13T16:48:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0f6e2f4e06be4da35c1b9e52d638218bafa91e25'/>
<id>urn:sha1:0f6e2f4e06be4da35c1b9e52d638218bafa91e25</id>
<content type='text'>
commit cf25809bec2c7df4b45df5b2196845d9a4a3c89b upstream.

If there are errors during initial controller create, the transport
will teardown the partially initialized controller struct and free
the ctlr memory.  Trouble is - most of those errors can occur due
to asynchronous events happening such io timeouts and subsystem
connectivity failures. Those failures invoke async workq items to
reset the controller and attempt reconnect.  Those may be in progress
as the main thread frees the ctrl memory, resulting in NULL ptr oops.

Prevent this from happening by having the main ctrl failure thread
changing state to DELETING followed by synchronously cancelling any
pending queued work item. The change of state will prevent the
scheduling of resets or reconnect events.

Signed-off-by: James Smart &lt;james.smart@broadcom.com&gt;
Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Amit Pundir &lt;amit.pundir@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>nvmet-rdma: fix possible bogus dereference under heavy load</title>
<updated>2018-10-10T06:54:24+00:00</updated>
<author>
<name>Sagi Grimberg</name>
<email>sagi@grimberg.me</email>
</author>
<published>2018-09-03T10:47:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f36f3ebdf1e1a211c3be521e187d69a94b30db77'/>
<id>urn:sha1:f36f3ebdf1e1a211c3be521e187d69a94b30db77</id>
<content type='text'>
[ Upstream commit 8407879c4e0d7731f6e7e905893cecf61a7762c7 ]

Currently we always repost the recv buffer before we send a response
capsule back to the host. Since ordering is not guaranteed for send
and recv completions, it is posible that we will receive a new request
from the host before we got a send completion for the response capsule.

Today, we pre-allocate 2x rsps the length of the queue, but in reality,
under heavy load there is nothing that is really preventing the gap to
expand until we exhaust all our rsps.

To fix this, if we don't have any pre-allocated rsps left, we dynamically
allocate a rsp and make sure to free it when we are done. If under memory
pressure we fail to allocate a rsp, we silently drop the command and
wait for the host to retry.

Reported-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Tested-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Signed-off-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
[hch: dropped a superflous assignment]
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>nvme-fcloop: Fix dropped LS's to removed target port</title>
<updated>2018-10-04T00:00:59+00:00</updated>
<author>
<name>James Smart</name>
<email>jsmart2021@gmail.com</email>
</author>
<published>2018-08-09T23:00:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d11237bdcf9570edc0c562d93df219a4a8a4c9f0'/>
<id>urn:sha1:d11237bdcf9570edc0c562d93df219a4a8a4c9f0</id>
<content type='text'>
[ Upstream commit afd299ca996929f4f98ac20da0044c0cdc124879 ]

When a targetport is removed from the config, fcloop will avoid calling
the LS done() routine thinking the targetport is gone. This leaves the
initiator reset/reconnect hanging as it waits for a status on the
Create_Association LS for the reconnect.

Change the filter in the LS callback path. If tport null (set when
failed validation before "sending to remote port"), be sure to call
done. This was the main bug. But, continue the logic that only calls
done if tport was set but there is no remoteport (e.g. case where
remoteport has been removed, thus host doesn't expect a completion).

Signed-off-by: James Smart &lt;james.smart@broadcom.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>nvme-rdma: unquiesce queues when deleting the controller</title>
<updated>2018-09-26T06:38:02+00:00</updated>
<author>
<name>Sagi Grimberg</name>
<email>sagi@grimberg.me</email>
</author>
<published>2018-07-09T09:49:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3cb3868f98f53468160eb732d30ef29aaf046ae3'/>
<id>urn:sha1:3cb3868f98f53468160eb732d30ef29aaf046ae3</id>
<content type='text'>
[ Upstream commit 90140624e8face94207003ac9a9d2a329b309d68 ]

If the controller is going away, we need to unquiesce the IO queues so
that all pending request can fail gracefully before moving forward with
controller deletion. Do that before we destroy the IO queues so
blk_cleanup_queue won't block in freeze.

Signed-off-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event</title>
<updated>2018-09-05T07:26:36+00:00</updated>
<author>
<name>Michal Wnukowski</name>
<email>wnukowski@google.com</email>
</author>
<published>2018-08-15T22:51:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0c9bed3698891c256b8a83d0c5001286bb87699b'/>
<id>urn:sha1:0c9bed3698891c256b8a83d0c5001286bb87699b</id>
<content type='text'>
commit f1ed3df20d2d223e0852cc4ac1f19bba869a7e3c upstream.

In many architectures loads may be reordered with older stores to
different locations.  In the nvme driver the following two operations
could be reordered:

 - Write shadow doorbell (dbbuf_db) into memory.
 - Read EventIdx (dbbuf_ei) from memory.

This can result in a potential race condition between driver and VM host
processing requests (if given virtual NVMe controller has a support for
shadow doorbell).  If that occurs, then the NVMe controller may decide to
wait for MMIO doorbell from guest operating system, and guest driver may
decide not to issue MMIO doorbell on any of subsequent commands.

This issue is purely timing-dependent one, so there is no easy way to
reproduce it. Currently the easiest known approach is to run "Oracle IO
Numbers" (orion) that is shipped with Oracle DB:

orion -run advanced -num_large 0 -size_small 8 -type rand -simulate \
	concat -write 40 -duration 120 -matrix row -testname nvme_test

Where nvme_test is a .lun file that contains a list of NVMe block
devices to run test against. Limiting number of vCPUs assigned to given
VM instance seems to increase chances for this bug to occur. On test
environment with VM that got 4 NVMe drives and 1 vCPU assigned the
virtual NVMe controller hang could be observed within 10-20 minutes.
That correspond to about 400-500k IO operations processed (or about
100GB of IO read/writes).

Orion tool was used as a validation and set to run in a loop for 36
hours (equivalent of pushing 550M IO operations). No issues were
observed. That suggest that the patch fixes the issue.

Fixes: f9f38e33389c ("nvme: improve performance for virtual NVMe devices")
Signed-off-by: Michal Wnukowski &lt;wnukowski@google.com&gt;
Reviewed-by: Keith Busch &lt;keith.busch@intel.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
[hch: updated changelog and comment a bit]
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>nvme: fix handling of metadata_len for NVME_IOCTL_IO_CMD</title>
<updated>2018-08-24T11:09:21+00:00</updated>
<author>
<name>Roland Dreier</name>
<email>roland@purestorage.com</email>
</author>
<published>2018-07-20T03:07:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7a12f4ed07a5f7b49c7d4f86ea67f65ec953cc40'/>
<id>urn:sha1:7a12f4ed07a5f7b49c7d4f86ea67f65ec953cc40</id>
<content type='text'>
[ Upstream commit 9b382768135ee3ff282f828c906574a8478e036b ]

The old code in nvme_user_cmd() passed the userspace virtual address
from nvme_passthru_cmd.metadata as the length of the metadata buffer
as well as the address to nvme_submit_user_cmd().

Fixes: 63263d60 ("nvme: Use metadata for passthrough commands")
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Reviewed-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>nvmet: reset keep alive timer in controller enable</title>
<updated>2018-08-24T11:09:02+00:00</updated>
<author>
<name>Max Gurtuvoy</name>
<email>maxg@mellanox.com</email>
</author>
<published>2018-06-19T12:45:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=955887c1fe90a2be056358001fd4920614a0f8b6'/>
<id>urn:sha1:955887c1fe90a2be056358001fd4920614a0f8b6</id>
<content type='text'>
[ Upstream commit d68a90e148f5a82aa67654c5012071e31c0e4baa ]

Controllers that are not yet enabled should not really enforce keep alive
timeouts, but we still want to track a timeout and cleanup in case a host
died before it enabled the controller.  Hence, simply reset the keep
alive timer when the controller is enabled.

Suggested-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Signed-off-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>nvmet-fc: fix target sgl list on large transfers</title>
<updated>2018-08-09T10:16:39+00:00</updated>
<author>
<name>James Smart</name>
<email>jsmart2021@gmail.com</email>
</author>
<published>2018-07-16T21:38:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d626ac9669f29a4cd4d16ccf27349688a6dda2fd'/>
<id>urn:sha1:d626ac9669f29a4cd4d16ccf27349688a6dda2fd</id>
<content type='text'>
commit d082dc1562a2ff0947b214796f12faaa87e816a9 upstream.

The existing code to carve up the sg list expected an sg element-per-page
which can be very incorrect with iommu's remapping multiple memory pages
to fewer bus addresses. To hit this error required a large io payload
(greater than 256k) and a system that maps on a per-page basis. It's
possible that large ios could get by fine if the system condensed the
sgl list into the first 64 elements.

This patch corrects the sg list handling by specifically walking the
sg list element by element and attempting to divide the transfer up
on a per-sg element boundary. While doing so, it still tries to keep
sequences under 256k, but will exceed that rule if a single sg element
is larger than 256k.

Fixes: 48fa362b6c3f ("nvmet-fc: simplify sg list handling")
Cc: &lt;stable@vger.kernel.org&gt; # 4.14
Signed-off-by: James Smart &lt;james.smart@broadcom.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;


</content>
</entry>
<entry>
<title>nvme-pci: Fix queue double allocations</title>
<updated>2018-08-09T10:16:39+00:00</updated>
<author>
<name>Keith Busch</name>
<email>keith.busch@intel.com</email>
</author>
<published>2018-01-23T16:16:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4af9c61ad953ccc7dbd059a45e77e84db563bd41'/>
<id>urn:sha1:4af9c61ad953ccc7dbd059a45e77e84db563bd41</id>
<content type='text'>
commit 62314e405fa101dbb82563394f9dfc225e3f1167 upstream.

The queue count says the highest queue that's been allocated, so don't
reallocate a queue lower than that.

Fixes: 147b27e4bd0 ("nvme-pci: allocate device queues storage space at probe")
Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jon Derrick &lt;jonathan.derrick@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
