<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/md/dm-thin.c, branch v4.4.171</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.4.171</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.4.171'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2018-10-10T06:52:12+00:00</updated>
<entry>
<title>dm thin metadata: try to avoid ever aborting transactions</title>
<updated>2018-10-10T06:52:12+00:00</updated>
<author>
<name>Joe Thornber</name>
<email>ejt@redhat.com</email>
</author>
<published>2018-09-10T15:50:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a438d505966601c324ebe3c231de9ed9d5ef8642'/>
<id>urn:sha1:a438d505966601c324ebe3c231de9ed9d5ef8642</id>
<content type='text'>
[ Upstream commit 3ab91828166895600efd9cdc3a0eb32001f7204a ]

Committing a transaction can consume some metadata of it's own, we now
reserve a small amount of metadata to cover this.  Free metadata
reported by the kernel will not include this reserve.

If any of the reserve has been used after a commit we enter a new
internal state PM_OUT_OF_METADATA_SPACE.  This is reported as
PM_READ_ONLY, so no userland changes are needed.  If the metadata
device is resized the pool will move back to PM_WRITE.

These changes mean we never need to abort and rollback a transaction due
to running out of metadata space.  This is particularly important
because there have been a handful of reports of data corruption against
DM thin-provisioning that can all be attributed to the thin-pool having
ran out of metadata space.

Signed-off-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>dm thin: handle running out of data space vs concurrent discard</title>
<updated>2018-07-03T09:21:35+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2018-06-26T16:04:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=052ef26b088d0ad32875472725d2fc9cea80858b'/>
<id>urn:sha1:052ef26b088d0ad32875472725d2fc9cea80858b</id>
<content type='text'>
commit a685557fbbc3122ed11e8ad3fa63a11ebc5de8c3 upstream.

Discards issued to a DM thin device can complete to userspace (via
fstrim) _before_ the metadata changes associated with the discards is
reflected in the thinp superblock (e.g. free blocks).  As such, if a
user constructs a test that loops repeatedly over these steps, block
allocation can fail due to discards not having completed yet:
1) fill thin device via filesystem file
2) remove file
3) fstrim

From initial report, here:
https://www.redhat.com/archives/dm-devel/2018-April/msg00022.html

"The root cause of this issue is that dm-thin will first remove
mapping and increase corresponding blocks' reference count to prevent
them from being reused before DISCARD bios get processed by the
underlying layers. However. increasing blocks' reference count could
also increase the nr_allocated_this_transaction in struct sm_disk
which makes smd-&gt;old_ll.nr_allocated +
smd-&gt;nr_allocated_this_transaction bigger than smd-&gt;old_ll.nr_blocks.
In this case, alloc_data_block() will never commit metadata to reset
the begin pointer of struct sm_disk, because sm_disk_get_nr_free()
always return an underflow value."

While there is room for improvement to the space-map accounting that
thinp is making use of: the reality is this test is inherently racey and
will result in the previous iteration's fstrim's discard(s) completing
vs concurrent block allocation, via dd, in the next iteration of the
loop.

No amount of space map accounting improvements will be able to allow
user's to use a block before a discard of that block has completed.

So the best we can really do is allow DM thinp to gracefully handle such
aggressive use of all the pool's data by degrading the pool into
out-of-data-space (OODS) mode.  We _should_ get that behaviour already
(if space map accounting didn't falsely cause alloc_data_block() to
believe free space was available).. but short of that we handle the
current reality that dm_pool_alloc_data_block() can return -ENOSPC.

Reported-by: Dennis Yang &lt;dennisyang@qnap.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm thin: fix race condition when destroying thin pool workqueue</title>
<updated>2016-03-03T23:07:10+00:00</updated>
<author>
<name>Nikolay Borisov</name>
<email>kernel@kyup.com</email>
</author>
<published>2015-12-17T16:03:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5c8a03a351257ae175d57c9c06f57ab95fc99db3'/>
<id>urn:sha1:5c8a03a351257ae175d57c9c06f57ab95fc99db3</id>
<content type='text'>
commit 18d03e8c25f173f4107a40d0b8c24defb6ed69f3 upstream.

When a thin pool is being destroyed delayed work items are
cancelled using cancel_delayed_work(), which doesn't guarantee that on
return the delayed item isn't running.  This can cause the work item to
requeue itself on an already destroyed workqueue.  Fix this by using
cancel_delayed_work_sync() which guarantees that on return the work item
is not running anymore.

Fixes: 905e51b39a555 ("dm thin: commit outstanding data every second")
Fixes: 85ad643b7e7e5 ("dm thin: add timeout to stop out-of-data-space mode holding IO forever")
Signed-off-by: Nikolay Borisov &lt;kernel@kyup.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm thin: fix regression in advertised discard limits</title>
<updated>2015-11-23T19:54:46+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2015-11-23T18:44:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0fcb04d59351f790efb8da18edefd6ab4d9bbf3b'/>
<id>urn:sha1:0fcb04d59351f790efb8da18edefd6ab4d9bbf3b</id>
<content type='text'>
When establishing a thin device's discard limits we cannot rely on the
underlying thin-pool device's discard capabilities (which are inherited
from the thin-pool's underlying data device) given that DM thin devices
must provide discard support even when the thin-pool's underlying data
device doesn't support discards.

Users were exposed to this thin device discard limits regression if
their thin-pool's underlying data device does _not_ support discards.
This regression caused all upper-layers that called the
blkdev_issue_discard() interface to not be able to issue discards to
thin devices (because discard_granularity was 0).  This regression
wasn't caught earlier because the device-mapper-test-suite's extensive
'thin-provisioning' discard tests are only ever performed against
thin-pool's with data devices that support discards.

Fix is to have thin_io_hints() test the pool's 'discard_enabled' feature
rather than inferring whether or not a thin device's discard support
should be enabled by looking at the thin-pool's discard_granularity.

Fixes: 216076705 ("dm thin: disable discard support for thin devices if pool's is disabled")
Reported-by: Mike Gerber &lt;mike@sprachgewalt.de&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Cc: stable@vger.kernel.org # 4.1+
</content>
</entry>
<entry>
<title>dm thin: restore requested 'error_if_no_space' setting on OODS to WRITE transition</title>
<updated>2015-11-16T14:36:08+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2015-11-06T15:53:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=172c238612ebf81cabccc86b788c9209af591f61'/>
<id>urn:sha1:172c238612ebf81cabccc86b788c9209af591f61</id>
<content type='text'>
A thin-pool that is in out-of-data-space (OODS) mode may transition back
to write mode -- without the admin adding more space to the thin-pool --
if/when blocks are released (either by deleting thin devices or
discarding provisioned blocks).

But as part of the thin-pool's earlier transition to out-of-data-space
mode the thin-pool may have set the 'error_if_no_space' flag to true if
the no_space_timeout expires without more space having been made
available.  That implementation detail, of changing the pool's
error_if_no_space setting, needs to be reset back to the default that
the user specified when the thin-pool's table was loaded.

Otherwise we'll drop the user requested behaviour on the floor when this
out-of-data-space to write mode transition occurs.

Reported-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Acked-by: Joe Thornber &lt;ejt@redhat.com&gt;
Fixes: 2c43fd26e4 ("dm thin: fix missing out-of-data-space to write mode transition if blocks are released")
Cc: stable@vger.kernel.org
</content>
</entry>
<entry>
<title>dm thin: fix missing pool reference count decrement in pool_ctr error path</title>
<updated>2015-10-13T16:20:55+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2015-10-13T16:04:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ba30670f4d5292c4e7f7980bbd5071f7c4794cdd'/>
<id>urn:sha1:ba30670f4d5292c4e7f7980bbd5071f7c4794cdd</id>
<content type='text'>
Fixes: ac8c3f3df ("dm thin: generate event when metadata threshold passed")
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Cc: stable@vger.kernel.org # 3.10+
</content>
</entry>
<entry>
<title>dm thin: disable discard support for thin devices if pool's is disabled</title>
<updated>2015-09-14T01:32:10+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2015-09-08T12:56:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=216076705d6ac291d42e0f8dd85e6a0da98c0fa3'/>
<id>urn:sha1:216076705d6ac291d42e0f8dd85e6a0da98c0fa3</id>
<content type='text'>
If the pool is configured with 'ignore_discard' its discard support is
disabled.  The pool's thin devices should also have queue_limits that
reflect discards are disabled.

Fixes: 34fbcf62 ("dm thin: range discard support")
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Cc: stable@vger.kernel.org # 4.1+
</content>
</entry>
<entry>
<title>Merge tag 'dm-4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm</title>
<updated>2015-09-02T23:35:26+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-09-02T23:35:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1e1a4e8f439113b7820bc7150569f685e1cc2b43'/>
<id>urn:sha1:1e1a4e8f439113b7820bc7150569f685e1cc2b43</id>
<content type='text'>
Pull device mapper update from Mike Snitzer:

 - a couple small cleanups in dm-cache, dm-verity, persistent-data's
   dm-btree, and DM core.

 - a 4.1-stable fix for dm-cache that fixes the leaking of deferred bio
   prison cells

 - a 4.2-stable fix that adds feature reporting for the dm-stats
   features added in 4.2

 - improve DM-snapshot to not invalidate the on-disk snapshot if
   snapshot device write overflow occurs; but a write overflow triggered
   through the origin device will still invalidate the snapshot.

 - optimize DM-thinp's async discard submission a bit now that late bio
   splitting has been included in block core.

 - switch DM-cache's SMQ policy lock from using a mutex to a spinlock;
   improves performance on very low latency devices (eg. NVMe SSD).

 - document DM RAID 4/5/6's discard support

[ I did not pull the slab changes, which weren't appropriate for this
  tree, and weren't obviously the right thing to do anyway.  At the very
  least they need some discussion and explanation before getting merged.

  Because not pulling the actual tagged commit but doing a partial pull
  instead, this merge commit thus also obviously is missing the git
  signature from the original tag ]

* tag 'dm-4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm cache: fix use after freeing migrations
  dm cache: small cleanups related to deferred prison cell cleanup
  dm cache: fix leaking of deferred bio prison cells
  dm raid: document RAID 4/5/6 discard support
  dm stats: report precise_timestamps and histogram in @stats_list output
  dm thin: optimize async discard submission
  dm snapshot: don't invalidate on-disk image on snapshot write overflow
  dm: remove unlikely() before IS_ERR()
  dm: do not override error code returned from dm_get_device()
  dm: test return value for DM_MAPIO_SUBMITTED
  dm verity: remove unused mempool
  dm cache: move wake_waker() from free_migrations() to where it is needed
  dm btree remove: remove unused function get_nr_entries()
  dm btree: remove unused "dm_block_t root" parameter in btree_split_sibling()
  dm cache policy smq: change the mutex to a spinlock
</content>
</entry>
<entry>
<title>Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-block</title>
<updated>2015-09-02T20:10:25+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-09-02T20:10:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1081230b748de8f03f37f80c53dfa89feda9b8de'/>
<id>urn:sha1:1081230b748de8f03f37f80c53dfa89feda9b8de</id>
<content type='text'>
Pull core block updates from Jens Axboe:
 "This first core part of the block IO changes contains:

   - Cleanup of the bio IO error signaling from Christoph.  We used to
     rely on the uptodate bit and passing around of an error, now we
     store the error in the bio itself.

   - Improvement of the above from myself, by shrinking the bio size
     down again to fit in two cachelines on x86-64.

   - Revert of the max_hw_sectors cap removal from a revision again,
     from Jeff Moyer.  This caused performance regressions in various
     tests.  Reinstate the limit, bump it to a more reasonable size
     instead.

   - Make /sys/block/&lt;dev&gt;/queue/discard_max_bytes writeable, by me.
     Most devices have huge trim limits, which can cause nasty latencies
     when deleting files.  Enable the admin to configure the size down.
     We will look into having a more sane default instead of UINT_MAX
     sectors.

   - Improvement of the SGP gaps logic from Keith Busch.

   - Enable the block core to handle arbitrarily sized bios, which
     enables a nice simplification of bio_add_page() (which is an IO hot
     path).  From Kent.

   - Improvements to the partition io stats accounting, making it
     faster.  From Ming Lei.

   - Also from Ming Lei, a basic fixup for overflow of the sysfs pending
     file in blk-mq, as well as a fix for a blk-mq timeout race
     condition.

   - Ming Lin has been carrying Kents above mentioned patches forward
     for a while, and testing them.  Ming also did a few fixes around
     that.

   - Sasha Levin found and fixed a use-after-free problem introduced by
     the bio-&gt;bi_error changes from Christoph.

   - Small blk cgroup cleanup from Viresh Kumar"

* 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits)
  blk: Fix bio_io_vec index when checking bvec gaps
  block: Replace SG_GAPS with new queue limits mask
  block: bump BLK_DEF_MAX_SECTORS to 2560
  Revert "block: remove artifical max_hw_sectors cap"
  blk-mq: fix race between timeout and freeing request
  blk-mq: fix buffer overflow when reading sysfs file of 'pending'
  Documentation: update notes in biovecs about arbitrarily sized bios
  block: remove bio_get_nr_vecs()
  fs: use helper bio_add_page() instead of open coding on bi_io_vec
  block: kill merge_bvec_fn() completely
  md/raid5: get rid of bio_fits_rdev()
  md/raid5: split bio for chunk_aligned_read
  block: remove split code in blkdev_issue_{discard,write_same}
  btrfs: remove bio splitting and merge_bvec_fn() calls
  bcache: remove driver private bio splitting code
  block: simplify bio_add_page()
  block: make generic_make_request handle arbitrarily sized bios
  blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL)
  block: don't access bio-&gt;bi_error after bio_put()
  block: shrink struct bio down to 2 cache lines again
  ...
</content>
</entry>
<entry>
<title>dm thin: optimize async discard submission</title>
<updated>2015-08-18T15:36:11+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2015-08-18T14:31:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=84f8bd86cc8977c344df572169f7ec10b8188cfa'/>
<id>urn:sha1:84f8bd86cc8977c344df572169f7ec10b8188cfa</id>
<content type='text'>
__blkdev_issue_discard_async() doesn't need to worry about further
splitting because the upper layer blkdev_issue_discard() will have
already handled splitting bios such that the bi_size isn't
overflowed.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Acked-by: Joe Thornber &lt;ejt@redhat.com&gt;
</content>
</entry>
</feed>
