<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/md/dm-thin.c, branch linux-4.20.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-4.20.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-4.20.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2019-02-20T09:29:15+00:00</updated>
<entry>
<title>dm thin: fix bug where bio that overwrites thin block ignores FUA</title>
<updated>2019-02-20T09:29:15+00:00</updated>
<author>
<name>Nikos Tsironis</name>
<email>ntsironis@arrikto.com</email>
</author>
<published>2019-02-14T18:38:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e161e0ef3bbb2ffacd38bdc12c17d4ee6d764cb4'/>
<id>urn:sha1:e161e0ef3bbb2ffacd38bdc12c17d4ee6d764cb4</id>
<content type='text'>
commit 4ae280b4ee3463fa57bbe6eede26b97daff8a0f1 upstream.

When provisioning a new data block for a virtual block, either because
the block was previously unallocated or because we are breaking sharing,
if the whole block of data is being overwritten the bio that triggered
the provisioning is issued immediately, skipping copying or zeroing of
the data block.

When this bio completes the new mapping is inserted in to the pool's
metadata by process_prepared_mapping(), where the bio completion is
signaled to the upper layers.

This completion is signaled without first committing the metadata.  If
the bio in question has the REQ_FUA flag set and the system crashes
right after its completion and before the next metadata commit, then the
write is lost despite the REQ_FUA flag requiring that I/O completion for
this request must only be signaled after the data has been committed to
non-volatile storage.

Fix this by deferring the completion of overwrite bios, with the REQ_FUA
flag set, until after the metadata has been committed.

Cc: stable@vger.kernel.org
Signed-off-by: Nikos Tsironis &lt;ntsironis@arrikto.com&gt;
Acked-by: Joe Thornber &lt;ejt@redhat.com&gt;
Acked-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm thin: fix passdown_double_checking_shared_status()</title>
<updated>2019-01-31T07:15:41+00:00</updated>
<author>
<name>Joe Thornber</name>
<email>ejt@redhat.com</email>
</author>
<published>2019-01-15T18:27:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e723ef4b04c3900b55df9e8d737449f5832edfd0'/>
<id>urn:sha1:e723ef4b04c3900b55df9e8d737449f5832edfd0</id>
<content type='text'>
commit d445bd9cec1a850c2100fcf53684c13b3fd934f2 upstream.

Commit 00a0ea33b495 ("dm thin: do not queue freed thin mapping for next
stage processing") changed process_prepared_discard_passdown_pt1() to
increment all the blocks being discarded until after the passdown had
completed to avoid them being prematurely reused.

IO issued to a thin device that breaks sharing with a snapshot, followed
by a discard issued to snapshot(s) that previously shared the block(s),
results in passdown_double_checking_shared_status() being called to
iterate through the blocks double checking their reference count is zero
and issuing the passdown if so.  So a side effect of commit 00a0ea33b495
is passdown_double_checking_shared_status() was broken.

Fix this by checking if the block reference count is greater than 1.
Also, rename dm_pool_block_is_used() to dm_pool_block_is_shared().

Fixes: 00a0ea33b495 ("dm thin: do not queue freed thin mapping for next stage processing")
Cc: stable@vger.kernel.org # 4.9+
Reported-by: ryan.p.norwood@gmail.com
Signed-off-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm thin: bump target version</title>
<updated>2018-12-12T14:39:54+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2018-12-12T14:39:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2af6c0703d75fc3ff2e6de19b4b3adab96acc12d'/>
<id>urn:sha1:2af6c0703d75fc3ff2e6de19b4b3adab96acc12d</id>
<content type='text'>
Decoupled version bump from commit f6c367585d0 ("dm thin: send event
about thin-pool state change _after_ making it") because version bumps
just create conflicts when backporting to the stable trees.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm thin: send event about thin-pool state change _after_ making it</title>
<updated>2018-12-11T20:19:26+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2018-12-11T18:31:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f6c367585d0d851349d3a9e607c43e5bea993fa1'/>
<id>urn:sha1:f6c367585d0d851349d3a9e607c43e5bea993fa1</id>
<content type='text'>
Sending a DM event before a thin-pool state change is about to happen is
a bug.  It wasn't realized until it became clear that userspace response
to the event raced with the actual state change that the event was
meant to notify about.

Fix this by first updating internal thin-pool state to reflect what the
DM event is being issued about.  This fixes a long-standing racey/buggy
userspace device-mapper-test-suite 'resize_io' test that would get an
event but not find the state it was looking for -- so it would just go
on to hang because no other events caused the test to reevaluate the
thin-pool's state.

Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm thin: use refcount_t for thin_c reference counting</title>
<updated>2018-10-16T18:27:03+00:00</updated>
<author>
<name>John Pittman</name>
<email>jpittman@redhat.com</email>
</author>
<published>2018-08-23T17:35:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=22d4c291f58752d14efb8a185b772a3fc3e947d2'/>
<id>urn:sha1:22d4c291f58752d14efb8a185b772a3fc3e947d2</id>
<content type='text'>
The API surrounding refcount_t should be used in place of atomic_t
when variables are being used as reference counters.  It can
potentially prevent reference counter overflows and use-after-free
conditions.  In the dm thin layer, one such example is tc-&gt;refcount.
Change this from the atomic_t API to the refcount_t API to prevent
mentioned conditions.

Signed-off-by: John Pittman &lt;jpittman@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm thin metadata: try to avoid ever aborting transactions</title>
<updated>2018-09-10T21:03:18+00:00</updated>
<author>
<name>Joe Thornber</name>
<email>ejt@redhat.com</email>
</author>
<published>2018-09-10T15:50:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3ab91828166895600efd9cdc3a0eb32001f7204a'/>
<id>urn:sha1:3ab91828166895600efd9cdc3a0eb32001f7204a</id>
<content type='text'>
Committing a transaction can consume some metadata of it's own, we now
reserve a small amount of metadata to cover this.  Free metadata
reported by the kernel will not include this reserve.

If any of the reserve has been used after a commit we enter a new
internal state PM_OUT_OF_METADATA_SPACE.  This is reported as
PM_READ_ONLY, so no userland changes are needed.  If the metadata
device is resized the pool will move back to PM_WRITE.

These changes mean we never need to abort and rollback a transaction due
to running out of metadata space.  This is particularly important
because there have been a handful of reports of data corruption against
DM thin-provisioning that can all be attributed to the thin-pool having
ran out of metadata space.

Signed-off-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm thin: stop no_space_timeout worker when switching to write-mode</title>
<updated>2018-08-07T18:30:29+00:00</updated>
<author>
<name>Hou Tao</name>
<email>houtao1@huawei.com</email>
</author>
<published>2018-08-02T08:18:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=75294442d896f2767be34f75aca7cc2b0d01301f'/>
<id>urn:sha1:75294442d896f2767be34f75aca7cc2b0d01301f</id>
<content type='text'>
Now both check_for_space() and do_no_space_timeout() will read &amp; write
pool-&gt;pf.error_if_no_space.  If these functions run concurrently, as
shown in the following case, the default setting of "queue_if_no_space"
can get lost.

precondition:
    * error_if_no_space = false (aka "queue_if_no_space")
    * pool is in Out-of-Data-Space (OODS) mode
    * no_space_timeout worker has been queued

CPU 0:                          CPU 1:
// delete a thin device
process_delete_mesg()
// check_for_space() invoked by commit()
set_pool_mode(pool, PM_WRITE)
    pool-&gt;pf.error_if_no_space = \
     pt-&gt;requested_pf.error_if_no_space

				// timeout, pool is still in OODS mode
				do_no_space_timeout
				    // "queue_if_no_space" config is lost
				    pool-&gt;pf.error_if_no_space = true
    pool-&gt;pf.mode = new_mode

Fix it by stopping no_space_timeout worker when switching to write mode.

Fixes: bcc696fac11f ("dm thin: stay in out-of-data-space mode once no_space_timeout expires")
Cc: stable@vger.kernel.org
Signed-off-by: Hou Tao &lt;houtao1@huawei.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm kcopyd: return void from dm_kcopyd_copy()</title>
<updated>2018-07-31T21:33:21+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2018-07-31T21:27:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7209049d40dc37791ce0f3738965296f30e26044'/>
<id>urn:sha1:7209049d40dc37791ce0f3738965296f30e26044</id>
<content type='text'>
dm_kcopyd_copy() only ever returns 0 so there is no need for callers to
account for possible failure.  Same goes for dm_kcopyd_zero().

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm thin: include metadata_low_watermark threshold in pool status</title>
<updated>2018-07-30T15:49:08+00:00</updated>
<author>
<name>Andy Grover</name>
<email>agrover@redhat.com</email>
</author>
<published>2018-07-27T22:51:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=63c8ecb6261abcb79191a264778e8dae222e67cf'/>
<id>urn:sha1:63c8ecb6261abcb79191a264778e8dae222e67cf</id>
<content type='text'>
The metadata low watermark threshold is set by the kernel.  But the
kernel depends on userspace to extend the thinpool metadata device when
the threshold is crossed.

Since the metadata low watermark threshold is not visible to userspace,
upon receiving an event, userspace cannot tell that the kernel wants the
metadata device extended, instead of some other eventing condition.
Making it visible (but not settable) enables userspace to affirmatively
know the kernel is asking for a metadata device extension, by comparing
metadata_low_watermark against nr_free_blocks_metadata, also reported in
status.

Current solutions like dmeventd have their own thresholds for extending
the data and metadata devices, and both devices are checked against
their thresholds on each event.  This lessens the value of the kernel-set
threshold, since userspace will either extend the metadata device sooner,
when receiving another event; or will receive the metadata lowater event
and do nothing, if dmeventd's threshold is less than the kernel's.
(This second case is dangerous. The metadata lowater event will not be
re-sent, so no further event will be generated before the metadata
device is out if space, unless some other event causes userspace to
recheck its thresholds.)

Signed-off-by: Andy Grover &lt;agrover@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm thin: handle running out of data space vs concurrent discard</title>
<updated>2018-06-27T12:49:46+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2018-06-26T16:04:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a685557fbbc3122ed11e8ad3fa63a11ebc5de8c3'/>
<id>urn:sha1:a685557fbbc3122ed11e8ad3fa63a11ebc5de8c3</id>
<content type='text'>
Discards issued to a DM thin device can complete to userspace (via
fstrim) _before_ the metadata changes associated with the discards is
reflected in the thinp superblock (e.g. free blocks).  As such, if a
user constructs a test that loops repeatedly over these steps, block
allocation can fail due to discards not having completed yet:
1) fill thin device via filesystem file
2) remove file
3) fstrim

From initial report, here:
https://www.redhat.com/archives/dm-devel/2018-April/msg00022.html

"The root cause of this issue is that dm-thin will first remove
mapping and increase corresponding blocks' reference count to prevent
them from being reused before DISCARD bios get processed by the
underlying layers. However. increasing blocks' reference count could
also increase the nr_allocated_this_transaction in struct sm_disk
which makes smd-&gt;old_ll.nr_allocated +
smd-&gt;nr_allocated_this_transaction bigger than smd-&gt;old_ll.nr_blocks.
In this case, alloc_data_block() will never commit metadata to reset
the begin pointer of struct sm_disk, because sm_disk_get_nr_free()
always return an underflow value."

While there is room for improvement to the space-map accounting that
thinp is making use of: the reality is this test is inherently racey and
will result in the previous iteration's fstrim's discard(s) completing
vs concurrent block allocation, via dd, in the next iteration of the
loop.

No amount of space map accounting improvements will be able to allow
user's to use a block before a discard of that block has completed.

So the best we can really do is allow DM thinp to gracefully handle such
aggressive use of all the pool's data by degrading the pool into
out-of-data-space (OODS) mode.  We _should_ get that behaviour already
(if space map accounting didn't falsely cause alloc_data_block() to
believe free space was available).. but short of that we handle the
current reality that dm_pool_alloc_data_block() can return -ENOSPC.

Reported-by: Dennis Yang &lt;dennisyang@qnap.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
</feed>
