<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/md/dm-thin.c, branch v4.17.1</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.17.1</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.17.1'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2018-04-03T19:04:10+00:00</updated>
<entry>
<title>dm: allow targets to return output from messages they are sent</title>
<updated>2018-04-03T19:04:10+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2018-02-28T20:59:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1eb5fa849f2bf9186a618e85bea23f02e527540a'/>
<id>urn:sha1:1eb5fa849f2bf9186a618e85bea23f02e527540a</id>
<content type='text'>
Could be useful for a target to return stats or other information.
If a target does DMEMIT() anything to @result from its .message method
then it must return 1 to the caller.

Signed-off-By: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm thin: fix trailing semicolon in __remap_and_issue_shared_cell</title>
<updated>2018-01-29T18:44:57+00:00</updated>
<author>
<name>Luis de Bethencourt</name>
<email>luisbg@kernel.org</email>
</author>
<published>2018-01-17T15:09:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bd6d1e0a5fafd47acb4ca5ca51d5dc8c7563112c'/>
<id>urn:sha1:bd6d1e0a5fafd47acb4ca5ca51d5dc8c7563112c</id>
<content type='text'>
The trailing semicolon is an empty statement that does no operation.
Removing it since it doesn't do anything.

Signed-off-by: Luis de Bethencourt &lt;luisbg@kernel.org&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm: backfill missing calls to mutex_destroy()</title>
<updated>2018-01-17T14:16:15+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2018-01-06T02:17:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d5ffebdd797a7c1c89576267640f671db2a668fc'/>
<id>urn:sha1:d5ffebdd797a7c1c89576267640f671db2a668fc</id>
<content type='text'>
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>dm: fix various targets to dm_register_target after module __init resources created</title>
<updated>2017-12-04T15:23:10+00:00</updated>
<author>
<name>monty_pavel@sina.com</name>
<email>monty_pavel@sina.com</email>
</author>
<published>2017-11-24T17:43:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7e6358d244e4706fe612a77b9c36519a33600ac0'/>
<id>urn:sha1:7e6358d244e4706fe612a77b9c36519a33600ac0</id>
<content type='text'>
A NULL pointer is seen if two concurrent "vgchange -ay -K &lt;vg name&gt;"
processes race to load the dm-thin-pool module:

 PID: 25992 TASK: ffff883cd7d23500 CPU: 4 COMMAND: "vgchange"
  #0 [ffff883cd743d600] machine_kexec at ffffffff81038fa9
  0000001 [ffff883cd743d660] crash_kexec at ffffffff810c5992
  0000002 [ffff883cd743d730] oops_end at ffffffff81515c90
  0000003 [ffff883cd743d760] no_context at ffffffff81049f1b
  0000004 [ffff883cd743d7b0] __bad_area_nosemaphore at ffffffff8104a1a5
  0000005 [ffff883cd743d800] bad_area at ffffffff8104a2ce
  0000006 [ffff883cd743d830] __do_page_fault at ffffffff8104aa6f
  0000007 [ffff883cd743d950] do_page_fault at ffffffff81517bae
  0000008 [ffff883cd743d980] page_fault at ffffffff81514f95
     [exception RIP: kmem_cache_alloc+108]
     RIP: ffffffff8116ef3c RSP: ffff883cd743da38 RFLAGS: 00010046
     RAX: 0000000000000004 RBX: ffffffff81121b90 RCX: ffff881bf1e78cc0
     RDX: 0000000000000000 RSI: 00000000000000d0 RDI: 0000000000000000
     RBP: ffff883cd743da68 R8: ffff881bf1a4eb00 R9: 0000000080042000
     R10: 0000000000002000 R11: 0000000000000000 R12: 00000000000000d0
     R13: 0000000000000000 R14: 00000000000000d0 R15: 0000000000000246
     ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
  0000009 [ffff883cd743da70] mempool_alloc_slab at ffffffff81121ba5
 0000010 [ffff883cd743da80] mempool_create_node at ffffffff81122083
 0000011 [ffff883cd743dad0] mempool_create at ffffffff811220f4
 0000012 [ffff883cd743dae0] pool_ctr at ffffffffa08de049 [dm_thin_pool]
 0000013 [ffff883cd743dbd0] dm_table_add_target at ffffffffa0005f2f [dm_mod]
 0000014 [ffff883cd743dc30] table_load at ffffffffa0008ba9 [dm_mod]
 0000015 [ffff883cd743dc90] ctl_ioctl at ffffffffa0009dc4 [dm_mod]

The race results in a NULL pointer because:

Process A (vgchange -ay -K):
 	a. send DM_LIST_VERSIONS_CMD ioctl;
 	b. pool_target not registered;
 	c. modprobe dm_thin_pool and wait until end.

Process B (vgchange -ay -K):
 	a. send DM_LIST_VERSIONS_CMD ioctl;
 	b. pool_target registered;
 	c. table_load-&gt;dm_table_add_target-&gt;pool_ctr;
 	d. _new_mapping_cache is NULL and panic.
Note:
 	1. process A and process B are two concurrent processes.
 	2. pool_target can be detected by process B but
 	_new_mapping_cache initialization has not ended.

To fix dm-thin-pool, and other targets (cache, multipath, and snapshot)
with the same problem, simply dm_register_target() after all resources
created during module init (as labelled with __init) are finished.

Cc: stable@vger.kernel.org
Signed-off-by: monty &lt;monty_pavel@sina.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()</title>
<updated>2017-10-25T09:01:08+00:00</updated>
<author>
<name>Mark Rutland</name>
<email>mark.rutland@arm.com</email>
</author>
<published>2017-10-23T21:07:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6aa7de059173a986114ac43b8f50b297a86f09a8'/>
<id>urn:sha1:6aa7de059173a986114ac43b8f50b297a86f09a8</id>
<content type='text'>
Please do not apply this to mainline directly, instead please re-run the
coccinelle script shown below and apply its output.

For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't harmful, and changing them results in
churn.

However, for some features, the read/write distinction is critical to
correct operation. To distinguish these cases, separate read/write
accessors must be used. This patch migrates (most) remaining
ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
coccinelle script:

----
// Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
// WRITE_ONCE()

// $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch

virtual patch

@ depends on patch @
expression E1, E2;
@@

- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)

@ depends on patch @
expression E;
@@

- ACCESS_ONCE(E)
+ READ_ONCE(E)
----

Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: mpe@ellerman.id.au
Cc: shuah@kernel.org
Cc: snitzer@redhat.com
Cc: thor.thayer@linux.intel.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-4.14/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm</title>
<updated>2017-09-14T20:43:16+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-09-14T20:43:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dff4d1f6fe85627b7ce8e4c5291d8621a1995605'/>
<id>urn:sha1:dff4d1f6fe85627b7ce8e4c5291d8621a1995605</id>
<content type='text'>
Pull device mapper updates from Mike Snitzer:

 - Some request-based DM core and DM multipath fixes and cleanups

 - Constify a few variables in DM core and DM integrity

 - Add bufio optimization and checksum failure accounting to DM
   integrity

 - Fix DM integrity to avoid checking integrity of failed reads

 - Fix DM integrity to use init_completion

 - A couple DM log-writes target fixes

 - Simplify DAX flushing by eliminating the unnecessary flush
   abstraction that was stood up for DM's use.

* tag 'for-4.14/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dax: remove the pmem_dax_ops-&gt;flush abstraction
  dm integrity: use init_completion instead of COMPLETION_INITIALIZER_ONSTACK
  dm integrity: make blk_integrity_profile structure const
  dm integrity: do not check integrity for failed read operations
  dm log writes: fix &gt;512b sectorsize support
  dm log writes: don't use all the cpu while waiting to log blocks
  dm ioctl: constify ioctl lookup table
  dm: constify argument arrays
  dm integrity: count and display checksum failures
  dm integrity: optimize writing dm-bufio buffers that are partially changed
  dm rq: do not update rq partially in each ending bio
  dm rq: make dm-sq requeuing behavior consistent with dm-mq behavior
  dm mpath: complain about unsupported __multipath_map_bio() return values
  dm mpath: avoid that building with W=1 causes gcc 7 to complain about fall-through
</content>
</entry>
<entry>
<title>dm: constify argument arrays</title>
<updated>2017-08-28T15:47:18+00:00</updated>
<author>
<name>Eric Biggers</name>
<email>ebiggers@google.com</email>
</author>
<published>2017-06-22T18:32:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5916a22b83041b07d63191fe06206ae0fff6ec7a'/>
<id>urn:sha1:5916a22b83041b07d63191fe06206ae0fff6ec7a</id>
<content type='text'>
The arrays of 'struct dm_arg' are never modified by the device-mapper
core, so constify them so that they are placed in .rodata.

(Exception: the args array in dm-raid cannot be constified because it is
allocated on the stack and modified.)

Signed-off-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
<entry>
<title>block: replace bi_bdev with a gendisk pointer and partitions index</title>
<updated>2017-08-23T18:49:55+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2017-08-23T17:10:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=74d46992e0d9dee7f1f376de0d56d31614c8a17a'/>
<id>urn:sha1:74d46992e0d9dee7f1f376de0d56d31614c8a17a</id>
<content type='text'>
This way we don't need a block_device structure to submit I/O.  The
block_device has different life time rules from the gendisk and
request_queue and is usually only available when the block device node
is open.  Other callers need to explicitly create one (e.g. the lightnvm
passthrough code, or the new nvme multipathing code).

For the actual I/O path all that we need is the gendisk, which exists
once per block device.  But given that the block layer also does
partition remapping we additionally need a partition index, which is
used for said remapping in generic_make_request.

Note that all the block drivers generally want request_queue or
sometimes the gendisk, so this removes a layer of indirection all
over the stack.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-4.13/block' of git://git.kernel.dk/linux-block</title>
<updated>2017-07-03T17:34:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-07-03T17:34:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c6b1e36c8fa04a6680c44fe0321d0370400e90b6'/>
<id>urn:sha1:c6b1e36c8fa04a6680c44fe0321d0370400e90b6</id>
<content type='text'>
Pull core block/IO updates from Jens Axboe:
 "This is the main pull request for the block layer for 4.13. Not a huge
  round in terms of features, but there's a lot of churn related to some
  core cleanups.

  Note this depends on the UUID tree pull request, that Christoph
  already sent out.

  This pull request contains:

   - A series from Christoph, unifying the error/stats codes in the
     block layer. We now use blk_status_t everywhere, instead of using
     different schemes for different places.

   - Also from Christoph, some cleanups around request allocation and IO
     scheduler interactions in blk-mq.

   - And yet another series from Christoph, cleaning up how we handle
     and do bounce buffering in the block layer.

   - A blk-mq debugfs series from Bart, further improving on the support
     we have for exporting internal information to aid debugging IO
     hangs or stalls.

   - Also from Bart, a series that cleans up the request initialization
     differences across types of devices.

   - A series from Goldwyn Rodrigues, allowing the block layer to return
     failure if we will block and the user asked for non-blocking.

   - Patch from Hannes for supporting setting loop devices block size to
     that of the underlying device.

   - Two series of patches from Javier, fixing various issues with
     lightnvm, particular around pblk.

   - A series from me, adding support for write hints. This comes with
     NVMe support as well, so applications can help guide data placement
     on flash to improve performance, latencies, and write
     amplification.

   - A series from Ming, improving and hardening blk-mq support for
     stopping/starting and quiescing hardware queues.

   - Two pull requests for NVMe updates. Nothing major on the feature
     side, but lots of cleanups and bug fixes. From the usual crew.

   - A series from Neil Brown, greatly improving the bio rescue set
     support. Most notably, this kills the bio rescue work queues, if we
     don't really need them.

   - Lots of other little bug fixes that are all over the place"

* 'for-4.13/block' of git://git.kernel.dk/linux-block: (217 commits)
  lightnvm: pblk: set line bitmap check under debug
  lightnvm: pblk: verify that cache read is still valid
  lightnvm: pblk: add initialization check
  lightnvm: pblk: remove target using async. I/Os
  lightnvm: pblk: use vmalloc for GC data buffer
  lightnvm: pblk: use right metadata buffer for recovery
  lightnvm: pblk: schedule if data is not ready
  lightnvm: pblk: remove unused return variable
  lightnvm: pblk: fix double-free on pblk init
  lightnvm: pblk: fix bad le64 assignations
  nvme: Makefile: remove dead build rule
  blk-mq: map all HWQ also in hyperthreaded system
  nvmet-rdma: register ib_client to not deadlock in device removal
  nvme_fc: fix error recovery on link down.
  nvmet_fc: fix crashes on bad opcodes
  nvme_fc: Fix crash when nvme controller connection fails.
  nvme_fc: replace ioabort msleep loop with completion
  nvme_fc: fix double calls to nvme_cleanup_cmd()
  nvme-fabrics: verify that a controller returns the correct NQN
  nvme: simplify nvme_dev_attrs_are_visible
  ...
</content>
</entry>
<entry>
<title>dm thin: do not queue freed thin mapping for next stage processing</title>
<updated>2017-06-27T19:14:34+00:00</updated>
<author>
<name>Vallish Vaidyeshwara</name>
<email>vallish@amazon.com</email>
</author>
<published>2017-06-23T18:53:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=00a0ea33b495ee6149bf5a77ac5807ce87323abb'/>
<id>urn:sha1:00a0ea33b495ee6149bf5a77ac5807ce87323abb</id>
<content type='text'>
process_prepared_discard_passdown_pt1() should cleanup
dm_thin_new_mapping in cases of error.

dm_pool_inc_data_range() can fail trying to get a block reference:

metadata operation 'dm_pool_inc_data_range' failed: error = -61

When dm_pool_inc_data_range() fails, dm thin aborts current metadata
transaction and marks pool as PM_READ_ONLY. Memory for thin mapping
is released as well. However, current thin mapping will be queued
onto next stage as part of queue_passdown_pt2() or passdown_endio().
This dangling thin mapping memory when processed and accessed in
next stage will lead to device mapper crashing.

Code flow without fix:
-&gt; process_prepared_discard_passdown_pt1(m)
   -&gt; dm_thin_remove_range()
   -&gt; discard passdown
      --&gt; passdown_endio(m) queues m onto next stage
   -&gt; dm_pool_inc_data_range() fails, frees memory m
            but does not remove it from next stage queue

-&gt; process_prepared_discard_passdown_pt2(m)
   -&gt; processes freed memory m and crashes

One such stack:

Call Trace:
[&lt;ffffffffa037a46f&gt;] dm_cell_release_no_holder+0x2f/0x70 [dm_bio_prison]
[&lt;ffffffffa039b6dc&gt;] cell_defer_no_holder+0x3c/0x80 [dm_thin_pool]
[&lt;ffffffffa039b88b&gt;] process_prepared_discard_passdown_pt2+0x4b/0x90 [dm_thin_pool]
[&lt;ffffffffa0399611&gt;] process_prepared+0x81/0xa0 [dm_thin_pool]
[&lt;ffffffffa039e735&gt;] do_worker+0xc5/0x820 [dm_thin_pool]
[&lt;ffffffff8152bf54&gt;] ? __schedule+0x244/0x680
[&lt;ffffffff81087e72&gt;] ? pwq_activate_delayed_work+0x42/0xb0
[&lt;ffffffff81089f53&gt;] process_one_work+0x153/0x3f0
[&lt;ffffffff8108a71b&gt;] worker_thread+0x12b/0x4b0
[&lt;ffffffff8108a5f0&gt;] ? rescuer_thread+0x350/0x350
[&lt;ffffffff8108fd6a&gt;] kthread+0xca/0xe0
[&lt;ffffffff8108fca0&gt;] ? kthread_park+0x60/0x60
[&lt;ffffffff81530b45&gt;] ret_from_fork+0x25/0x30

The fix is to first take the block ref count for discarded block and
then do a passdown discard of this block. If block ref count fails,
then bail out aborting current metadata transaction, mark pool as
PM_READ_ONLY and also free current thin mapping memory (existing error
handling code) without queueing this thin mapping onto next stage of
processing. If block ref count succeeds, then passdown discard of this
block. Discard callback of passdown_endio() will queue this thin mapping
onto next stage of processing.

Code flow with fix:
-&gt; process_prepared_discard_passdown_pt1(m)
   -&gt; dm_thin_remove_range()
   -&gt; dm_pool_inc_data_range()
      --&gt; if fails, free memory m and bail out
   -&gt; discard passdown
      --&gt; passdown_endio(m) queues m onto next stage

Cc: stable &lt;stable@vger.kernel.org&gt; # v4.9+
Reviewed-by: Eduardo Valentin &lt;eduval@amazon.com&gt;
Reviewed-by: Cristian Gafton &lt;gafton@amazon.com&gt;
Reviewed-by: Anchal Agarwal &lt;anchalag@amazon.com&gt;
Signed-off-by: Vallish Vaidyeshwara &lt;vallish@amazon.com&gt;
Reviewed-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
</entry>
</feed>
