summaryrefslogtreecommitdiff
path: root/drivers/md/dm-crypt.c
AgeCommit message (Collapse)AuthorFilesLines
2023-03-25Merge tag 'for-6.3/dm-fixes' of ↵Linus Torvalds1-6/+10
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix DM thin to work as a swap device by using 'limit_swap_bios' DM target flag (initially added to allow swap to dm-crypt) to throttle the amount of outstanding swap bios. - Fix DM crypt soft lockup warnings by calling cond_resched() from the cpu intensive loop in dmcrypt_write(). - Fix DM crypt to not access an uninitialized tasklet. This fix allows for consistent handling of IO completion, by _not_ needlessly punting to a workqueue when tasklets are not needed. - Fix DM core's alloc_dev() initialization for DM stats to check for and propagate alloc_percpu() failure. * tag 'for-6.3/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm stats: check for and propagate alloc_percpu failure dm crypt: avoid accessing uninitialized tasklet dm crypt: add cond_resched() to dmcrypt_write() dm thin: fix deadlock when swapping to thin device
2023-03-09dm crypt: avoid accessing uninitialized taskletMike Snitzer1-6/+9
When neither "no_read_workqueue" nor "no_write_workqueue" are enabled, tasklet_trylock() in crypt_dec_pending() may still return false due to an uninitialized state, and dm-crypt will unnecessarily do io completion in io_queue workqueue instead of current context. Fix this by adding an 'in_tasklet' flag to dm_crypt_io struct and initialize it to false in crypt_io_init(). Set this flag to true in kcryptd_queue_crypt() before calling tasklet_schedule(). If set crypt_dec_pending() will punt io completion to a workqueue. This also nicely avoids the tasklet_trylock/unlock hack when tasklets aren't in use. Fixes: 8e14f610159d ("dm crypt: do not call bio_endio() from the dm-crypt tasklet") Cc: stable@vger.kernel.org Reported-by: Hou Tao <houtao1@huawei.com> Suggested-by: Ignat Korchagin <ignat@cloudflare.com> Reviewed-by: Ignat Korchagin <ignat@cloudflare.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-03-07dm crypt: add cond_resched() to dmcrypt_write()Mikulas Patocka1-0/+1
The loop in dmcrypt_write may be running for unbounded amount of time, thus we need cond_resched() in it. This commit fixes the following warning: [ 3391.153255][ C12] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [dmcrypt_write/2:2897] ... [ 3391.387210][ C12] Call trace: [ 3391.390338][ C12] blk_attempt_bio_merge.part.6+0x38/0x158 [ 3391.395970][ C12] blk_attempt_plug_merge+0xc0/0x1b0 [ 3391.401085][ C12] blk_mq_submit_bio+0x398/0x550 [ 3391.405856][ C12] submit_bio_noacct+0x308/0x380 [ 3391.410630][ C12] dmcrypt_write+0x1e4/0x208 [dm_crypt] [ 3391.416005][ C12] kthread+0x130/0x138 [ 3391.419911][ C12] ret_from_fork+0x10/0x18 Reported-by: yangerkun <yangerkun@huawei.com> Fixes: dc2676210c42 ("dm crypt: offload writes to thread") Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-23Merge tag 'for-6.3/dm-changes' of ↵Linus Torvalds1-55/+62
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - Fix DM cache target to free background tracker work items, otherwise slab BUG will occur when kmem_cache_destroy() is called. - Improve 2 of DM's shrinker names to reflect their use. - Fix the DM flakey target to not corrupt the zero page. Fix dm-flakey on 32-bit hughmem systems by using bvec_kmap_local instead of page_address. Also, fix logic used when imposing the "corrupt_bio_byte" feature. - Stop using WQ_UNBOUND for DM verity target's verify_wq because it causes significant Android latencies on ARM64 (and doesn't show real benefit on other architectures). - Add negative check to catch simple case of a DM table referencing itself. More complex scenarios that use intermediate devices to self-reference still need to be avoided/handled in userspace. - Fix DM core's resize to only send one uevent instead of two. This fixes a race with udev, that if udev wins, will cause udev to miss uevents (which caused premature unmount attempts by systemd). - Add cond_resched() to workqueue functions in DM core, dn-thin and dm-cache so that their loops aren't the cause of unintended cpu scheduling fairness issues. - Fix all of DM's checkpatch errors and warnings (famous last words). Various other small cleanups. * tag 'for-6.3/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (62 commits) dm: remove unnecessary (void*) conversion in event_callback() dm ioctl: remove unnecessary check when using dm_get_mdptr() dm ioctl: assert _hash_lock is held in __hash_remove dm cache: add cond_resched() to various workqueue loops dm thin: add cond_resched() to various workqueue loops dm: add cond_resched() to dm_wq_requeue_work() dm: add cond_resched() to dm_wq_work() dm sysfs: make kobj_type structure constant dm: update targets using system workqueues to use a local workqueue dm: remove flush_scheduled_work() during local_exit() dm clone: prefer kvmalloc_array() dm: declare variables static when sensible dm: fix suspect indent whitespace dm ioctl: prefer strscpy() instead of strlcpy() dm: avoid void function return statements dm integrity: change macros min/max() -> min_t/max_t where appropriate dm: fix use of sizeof() macro dm: avoid 'do {} while(0)' loop in single statement macros dm log: avoid multiple line dereference dm log: avoid trailing semicolon in macro ...
2023-02-14dm: avoid split of quoted strings where possibleHeinz Mauelshagen1-2/+1
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: add missing empty linesHeinz Mauelshagen1-0/+3
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm crypt: correct 'foo*' to 'foo *'Heinz Mauelshagen1-9/+9
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: correct block comments format.Heinz Mauelshagen1-3/+4
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: address indent/space issuesHeinz Mauelshagen1-1/+1
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: avoid initializing static variablesHeinz Mauelshagen1-1/+1
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: avoid assignment in if conditionsHeinz Mauelshagen1-3/+6
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: change "unsigned" to "unsigned int"Heinz Mauelshagen1-24/+24
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: prefer kmap_local_page() instead of deprecated kmap_atomic()Heinz Mauelshagen1-12/+12
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14dm: add missing SPDX-License-IndentifiersHeinz Mauelshagen1-0/+1
'GPL-2.0-only' is used instead of 'GPL-2.0' because SPDX has deprecated its use. Suggested-by: John Wiele <jwiele@redhat.com> Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-13dm: Remove completion function scaffoldingHerbert Xu1-3/+3
This patch removes the temporary scaffolding now that the comletion function signature has been converted. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-02-13dm: Add scaffolding to change completion function signatureHerbert Xu1-5/+3
This patch adds temporary scaffolding so that the Crypto API completion function can take a void * instead of crypto_async_request. Once affected users have been converted this can be removed. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-02-02dm crypt: Slightly simplify crypt_set_keyring_key()Christophe JAILLET1-1/+1
Use strchr() instead of strpbrk() when there is only 1 element in the set of characters to look for. This potentially saves a few cycles, but gcc does already account for optimizing this pattern thanks to it's fold_builtin_strpbrk(). Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-11-17dm-crypt: provide dma_alignment limit in io_hintsKeith Busch1-0/+1
This device mapper needs bio vectors to be sized and memory aligned to the logical block size. Set the minimum required queue limit accordingly. Link: https://lore.kernel.org/linux-block/20221101001558.648ee024@xps.demsh.org/ Fixes: b1a000d3b8ec5 ("block: relax direct io memory alignment") Reportred-by: Eric Biggers <ebiggers@kernel.org> Reported-by: Dmitrii Tcvetkov <me@demsh.org> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Mike Snitzer <snitzer@kernel.org> Link: https://lore.kernel.org/r/20221110184501.2451620-3-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-09dm crypt: make printing of the key constant-timeMikulas Patocka1-3/+11
The device mapper dm-crypt target is using scnprintf("%02x", cc->key[i]) to report the current key to userspace. However, this is not a constant-time operation and it may leak information about the key via timing, via cache access patterns or via the branch predictor. Change dm-crypt's key printing to use "%c" instead of "%02x". Also introduce hex2asc() that carefully avoids any branching or memory accesses when converting a number in the range 0 ... 15 to an ascii character. Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Tested-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-03-25Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-1/+0
Pull SCSI updates from James Bottomley: "This series consists of the usual driver updates (qla2xxx, pm8001, libsas, smartpqi, scsi_debug, lpfc, iscsi, mpi3mr) plus minor updates and bug fixes. The high blast radius core update is the removal of write same, which affects block and several non-SCSI devices. The other big change, which is more local, is the removal of the SCSI pointer" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (281 commits) scsi: scsi_ioctl: Drop needless assignment in sg_io() scsi: bsg: Drop needless assignment in scsi_bsg_sg_io_fn() scsi: lpfc: Copyright updates for 14.2.0.0 patches scsi: lpfc: Update lpfc version to 14.2.0.0 scsi: lpfc: SLI path split: Refactor BSG paths scsi: lpfc: SLI path split: Refactor Abort paths scsi: lpfc: SLI path split: Refactor SCSI paths scsi: lpfc: SLI path split: Refactor CT paths scsi: lpfc: SLI path split: Refactor misc ELS paths scsi: lpfc: SLI path split: Refactor VMID paths scsi: lpfc: SLI path split: Refactor FDISC paths scsi: lpfc: SLI path split: Refactor LS_RJT paths scsi: lpfc: SLI path split: Refactor LS_ACC paths scsi: lpfc: SLI path split: Refactor the RSCN/SCR/RDF/EDC/FARPR paths scsi: lpfc: SLI path split: Refactor PLOGI/PRLI/ADISC/LOGO paths scsi: lpfc: SLI path split: Refactor base ELS paths and the FLOGI path scsi: lpfc: SLI path split: Introduce lpfc_prep_wqe scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4 scsi: lpfc: SLI path split: Refactor lpfc_iocbq scsi: lpfc: Use kcalloc() ...
2022-03-25Merge tag 'for-5.18/dm-changes' of ↵Linus Torvalds1-6/+9
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - Significant refactoring and fixing of how DM core does bio-based IO accounting with focus on fixing wildly inaccurate IO stats for dm-crypt (and other DM targets that defer bio submission in their own workqueues). End result is proper IO accounting, made possible by targets being updated to use the new dm_submit_bio_remap() interface. - Add hipri bio polling support (REQ_POLLED) to bio-based DM. - Reduce dm_io and dm_target_io structs so that a single dm_io (which contains dm_target_io and first clone bio) weighs in at 256 bytes. For reference the bio struct is 128 bytes. - Various other small cleanups, fixes or improvements in DM core and targets. - Update MAINTAINERS with my kernel.org email address to allow distinction between my "upstream" and "Red" Hats. * tag 'for-5.18/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (46 commits) dm: consolidate spinlocks in dm_io struct dm: reduce size of dm_io and dm_target_io structs dm: switch dm_target_io booleans over to proper flags dm: switch dm_io booleans over to proper flags dm: update email address in MAINTAINERS dm: return void from __send_empty_flush dm: factor out dm_io_complete dm cache: use dm_submit_bio_remap dm: simplify dm_sumbit_bio_remap interface dm thin: use dm_submit_bio_remap dm: add WARN_ON_ONCE to dm_submit_bio_remap dm: support bio polling block: add ->poll_bio to block_device_operations dm mpath: use DMINFO instead of printk with KERN_INFO dm: stop using bdevname dm-zoned: remove the ->name field in struct dmz_dev dm: remove unnecessary local variables in __bind dm: requeue IO if mapping table not yet available dm io: remove stale comment block for dm_io() dm thin metadata: remove unused dm_thin_remove_block and __remove ...
2022-03-10dm: simplify dm_sumbit_bio_remap interfaceMike Snitzer1-3/+3
Remove the from_wq argument from dm_sumbit_bio_remap(). Eliminates the need for dm_sumbit_bio_remap() callers to know whether they are calling for a workqueue or from the original dm_submit_bio(). Add map_task to dm_io struct, record the map_task in alloc_io and clear it after all target ->map() calls have completed. Update dm_sumbit_bio_remap to check if 'current' matches io->map_task rather than rely on passed 'from_rq' argument. This change really simplifies the chore of porting each DM target to using dm_sumbit_bio_remap() because there is no longer the risk of programming error by not completely knowing all the different contexts a particular method that calls dm_sumbit_bio_remap() might be used in. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2022-03-07dm-crypt: stop using bio_devnameChristoph Hellwig1-6/+4
Use the %pg format specifier to save on stack consuption and code size. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20220304180105.409765-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-23scsi: dm: Remove WRITE_SAME supportChristoph Hellwig1-1/+0
There are no more end-users of REQ_OP_WRITE_SAME left, so we can start deleting it. Link: https://lore.kernel.org/r/20220209082828.2629273-7-hch@lst.de Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-22dm crypt: fix get_key_size compiler warning if !CONFIG_KEYSAashish Sharma1-1/+1
Explicitly convert unsigned int in the right of the conditional expression to int to match the left side operand and the return type, fixing the following compiler warning: drivers/md/dm-crypt.c:2593:43: warning: signed and unsigned type in conditional expression [-Wsign-compare] Fixes: c538f6ec9f56 ("dm crypt: add ability to use keys from the kernel key retention service") Signed-off-by: Aashish Sharma <shraash@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2022-02-21dm crypt: use dm_submit_bio_remapMike Snitzer1-5/+8
Care was taken to support kcryptd_io_read being called from crypt_map or workqueue. Use of an intermediate CRYPT_MAP_READ_GFP gfp_t (defined as GFP_NOWAIT) should protect from maintenance burden if that flag were to change for some reason. Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2022-02-04block: pass a block_device to bio_clone_fastChristoph Hellwig1-6/+5
Pass a block_device to bio_clone_fast and __bio_clone_fast and give the functions more suitable names. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220202160109.108149-14-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-02block: pass a block_device and opf to bio_alloc_biosetChristoph Hellwig1-3/+2
Pass the block_device and operation that we plan to use this bio for to bio_alloc_bioset to optimize the assigment. NULL/0 can be passed, both for the passthrough case on a raw request_queue and to temporarily avoid refactoring some nasty code. Also move the gfp_mask argument after the nr_vecs argument for a much more logical calling convention matching what most of the kernel does. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220124091107.642561-16-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-02dm-crypt: remove clone_initChristoph Hellwig1-13/+8
Just open code it next to the bio allocations, which saves a few lines of code, prepares for future changes and allows to remove the duplicate bi_opf assignment for the bio_clone_fast case in kcryptd_io_read. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220124091107.642561-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-02dm: bio_alloc can't fail if it is allowed to sleepChristoph Hellwig1-4/+1
Remove handling of NULL returns from sleeping bio_alloc calls given that those can't fail. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220124091107.642561-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-09Merge tag 'for-5.16/dm-changes' of ↵Linus Torvalds1-6/+19
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - Add DM core support for emitting audit events through the audit subsystem. Also enhance both the integrity and crypt targets to emit events to via dm-audit. - Various other simple code improvements and cleanups. * tag 'for-5.16/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm table: log table creation error code dm: make workqueue names device-specific dm writecache: Make use of the helper macro kthread_run() dm crypt: Make use of the helper macro kthread_run() dm verity: use bvec_kmap_local in verity_for_bv_block dm log writes: use memcpy_from_bvec in log_writes_map dm integrity: use bvec_kmap_local in __journal_read_write dm integrity: use bvec_kmap_local in integrity_metadata dm: add add_disk() error handling dm: Remove redundant flush_workqueue() calls dm crypt: log aead integrity violations to audit subsystem dm integrity: log audit events for dm-integrity target dm: introduce audit event module for device mapper
2021-11-01dm crypt: Make use of the helper macro kthread_run()Cai Huoqing1-2/+1
Replace kthread_create/wake_up_process() with kthread_run() to simplify the code. Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-10-27dm crypt: log aead integrity violations to audit subsystemMichael Weiß1-4/+18
Since dm-crypt target can be stacked on dm-integrity targets to provide authenticated encryption, integrity violations are recognized here during aead computation. We use the dm-audit submodule to signal those events to user space, too. The construction and destruction of crypt device mappings are also logged as audit events. Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-10-18block: move integrity handling out of <linux/blkdev.h>Christoph Hellwig1-0/+1
Split the integrity/metadata handling definitions out into a new header. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-17-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-20dm crypt: use in_hardirq() instead of deprecated in_irq()Changbin Du1-2/+2
Signed-off-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-08-18dm crypt: Avoid percpu_counter spinlock contention in crypt_page_alloc()Arne Welzel1-1/+6
On systems with many cores using dm-crypt, heavy spinlock contention in percpu_counter_compare() can be observed when the page allocation limit for a given device is reached or close to be reached. This is due to percpu_counter_compare() taking a spinlock to compute an exact result on potentially many CPUs at the same time. Switch to non-exact comparison of allocated and allowed pages by using the value returned by percpu_counter_read_positive() to avoid taking the percpu_counter spinlock. This may over/under estimate the actual number of allocated pages by at most (batch-1) * num_online_cpus(). Currently, batch is bounded by 32. The system on which this issue was first observed has 256 CPUs and 512GB of RAM. With a 4k page size, this change may over/under estimate by 31MB. With ~10G (2%) allowed dm-crypt allocations, this seems an acceptable error. Certainly preferred over running into the spinlock contention. This behavior was reproduced on an EC2 c5.24xlarge instance with 96 CPUs and 192GB RAM as follows, but can be provoked on systems with less CPUs as well. * Disable swap * Tune vm settings to promote regular writeback $ echo 50 > /proc/sys/vm/dirty_expire_centisecs $ echo 25 > /proc/sys/vm/dirty_writeback_centisecs $ echo $((128 * 1024 * 1024)) > /proc/sys/vm/dirty_background_bytes * Create 8 dmcrypt devices based on files on a tmpfs * Create and mount an ext4 filesystem on each crypt devices * Run stress-ng --hdd 8 within one of above filesystems Total %system usage collected from sysstat goes to ~35%. Write throughput on the underlying loop device is ~2GB/s. perf profiling an individual kworker kcryptd thread shows the following profile, indicating spinlock contention in percpu_counter_compare(): 99.98% 0.00% kworker/u193:46 [kernel.kallsyms] [k] ret_from_fork | --ret_from_fork kthread worker_thread | --99.92%--process_one_work | |--80.52%--kcryptd_crypt | | | |--62.58%--mempool_alloc | | | | | --62.24%--crypt_page_alloc | | | | | --61.51%--__percpu_counter_compare | | | | | --61.34%--__percpu_counter_sum | | | | | |--58.68%--_raw_spin_lock_irqsave | | | | | | | --58.30%--native_queued_spin_lock_slowpath | | | | | --0.69%--cpumask_next | | | | | --0.51%--_find_next_bit | | | |--10.61%--crypt_convert | | | | | |--6.05%--xts_crypt ... After applying this patch and running the same test, %system usage is lowered to ~7% and write throughput on the loop device increases to ~2.7GB/s. perf report shows mempool_alloc() as ~8% rather than ~62% in the profile and not hitting the percpu_counter() spinlock anymore. |--8.15%--mempool_alloc | | | |--3.93%--crypt_page_alloc | | | | | --3.75%--__alloc_pages | | | | | --3.62%--get_page_from_freelist | | | | | --3.22%--rmqueue_bulk | | | | | --2.59%--_raw_spin_lock | | | | | --2.57%--native_queued_spin_lock_slowpath | | | --3.05%--_raw_spin_lock_irqsave | | | --2.49%--native_queued_spin_lock_slowpath Suggested-by: DJ Gregor <dj@corelight.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Arne Welzel <arne.welzel@corelight.com> Fixes: 5059353df86e ("dm crypt: limit the number of allocated pages") Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-08-10dm: update target status functions to support IMA measurementTushar Sugandhi1-0/+27
For device mapper targets to take advantage of IMA's measurement capabilities, the status functions for the individual targets need to be updated to handle the status_type_t case for value STATUSTYPE_IMA. Update status functions for the following target types, to log their respective attributes to be measured using IMA. 01. cache 02. crypt 03. integrity 04. linear 05. mirror 06. multipath 07. raid 08. snapshot 09. striped 10. verity For rest of the targets, handle the STATUSTYPE_IMA case by setting the measurement buffer to NULL. For IMA to measure the data on a given system, the IMA policy on the system needs to be updated to have the following line, and the system needs to be restarted for the measurements to take effect. /etc/ima/ima-policy measure func=CRITICAL_DATA label=device-mapper template=ima-buf The measurements will be reflected in the IMA logs, which are located at: /sys/kernel/security/integrity/ima/ascii_runtime_measurements /sys/kernel/security/integrity/ima/binary_runtime_measurements These IMA logs can later be consumed by various attestation clients running on the system, and send them to external services for attesting the system. The DM target data measured by IMA subsystem can alternatively be queried from userspace by setting DM_IMA_MEASUREMENT_FLAG with DM_TABLE_STATUS_CMD. Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-06-04dm crypt: Fix zoned block device supportDamien Le Moal1-5/+19
Zone append BIOs (REQ_OP_ZONE_APPEND) always specify the start sector of the zone to be written instead of the actual sector location to write. The write location is determined by the device and returned to the host upon completion of the operation. This interface, while simple and efficient for writing into sequential zones of a zoned block device, is incompatible with the use of sector values to calculate a cypher block IV. All data written in a zone end up using the same IV values corresponding to the first sectors of the zone, but read operation will specify any sector within the zone resulting in an IV mismatch between encryption and decryption. To solve this problem, report to DM core that zone append operations are not supported. This result in the zone append operations being emulated using regular write operations. Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-06-04dm: Introduce dm_report_zones()Damien Le Moal1-4/+3
To simplify the implementation of the report_zones operation of a zoned target, introduce the function dm_report_zones() to set a target mapping start sector in struct dm_report_zones_args and call blkdev_report_zones(). This new function is exported and the report zones callback function dm_report_zones_cb() is not. dm-linear, dm-flakey and dm-crypt are modified to use dm_report_zones(). Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-03-11block: rename BIO_MAX_PAGES to BIO_MAX_VECSChristoph Hellwig1-4/+4
Ever since the addition of multipage bio_vecs BIO_MAX_PAGES has been horribly confusingly misnamed. Rename it to BIO_MAX_VECS to stop confusing users of the bio API. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20210311110137.1132391-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-11dm: fix deadlock when swapping to encrypted deviceMikulas Patocka1-0/+1
The system would deadlock when swapping to a dm-crypt device. The reason is that for each incoming write bio, dm-crypt allocates memory that holds encrypted data. These excessive allocations exhaust all the memory and the result is either deadlock or OOM trigger. This patch limits the number of in-flight swap bios, so that the memory consumed by dm-crypt is limited. The limit is enforced if the target set the "limit_swap_bios" variable and if the bio has REQ_SWAP set. Non-swap bios are not affected becuase taking the semaphore would cause performance degradation. This is similar to request-based drivers - they will also block when the number of requests is over the limit. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-02-11dm: simplify target code conditional on CONFIG_BLK_DEV_ZONEDMike Snitzer1-4/+2
Allow removal of CONFIG_BLK_DEV_ZONED conditionals in target_type definition of various targets. Suggested-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-02-03dm crypt: support using trusted keysAhmad Fatoum1-1/+22
Commit 27f5411a718c ("dm crypt: support using encrypted keys") extended dm-crypt to allow use of "encrypted" keys along with "user" and "logon". Along the same lines, teach dm-crypt to support "trusted" keys as well. Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-02-03dm crypt: replaced #if defined with IS_ENABLEDAhmad Fatoum1-5/+2
IS_ENABLED(CONFIG_ENCRYPTED_KEYS) is true whether the option is built-in or a module, so use it instead of #if defined checking for each separately. The other #if was to avoid a static function defined, but unused warning. As we now always build the callsite when the function is defined, we can remove that first #if guard. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-02-03dm crypt: Spelling s/cihper/cipher/Geert Uytterhoeven1-1/+1
Fix a misspelling of "cipher". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-01-21dm crypt: fix copy and paste bug in crypt_alloc_req_aeadIgnat Korchagin1-3/+3
In commit d68b29584c25 ("dm crypt: use GFP_ATOMIC when allocating crypto requests from softirq") code was incorrectly copy and pasted from crypt_alloc_req_skcipher()'s crypto request allocation code to crypt_alloc_req_aead(). It is OK from runtime perspective as both simple encryption request pointer and AEAD request pointer are part of a union, but may confuse code reviewers. Fixes: d68b29584c25 ("dm crypt: use GFP_ATOMIC when allocating crypto requests from softirq") Cc: stable@vger.kernel.org # v5.9+ Reported-by: Pavel Machek <pavel@denx.de> Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-01-14dm crypt: defer decryption to a tasklet if interrupts disabledIgnat Korchagin1-2/+6
On some specific hardware on early boot we occasionally get: [ 1193.920255][ T0] BUG: sleeping function called from invalid context at mm/mempool.c:381 [ 1193.936616][ T0] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/69 [ 1193.953233][ T0] no locks held by swapper/69/0. [ 1193.965871][ T0] irq event stamp: 575062 [ 1193.977724][ T0] hardirqs last enabled at (575061): [<ffffffffab73f662>] tick_nohz_idle_exit+0xe2/0x3e0 [ 1194.002762][ T0] hardirqs last disabled at (575062): [<ffffffffab74e8af>] flush_smp_call_function_from_idle+0x4f/0x80 [ 1194.029035][ T0] softirqs last enabled at (575050): [<ffffffffad600fd2>] asm_call_irq_on_stack+0x12/0x20 [ 1194.054227][ T0] softirqs last disabled at (575043): [<ffffffffad600fd2>] asm_call_irq_on_stack+0x12/0x20 [ 1194.079389][ T0] CPU: 69 PID: 0 Comm: swapper/69 Not tainted 5.10.6-cloudflare-kasan-2021.1.4-dev #1 [ 1194.104103][ T0] Hardware name: NULL R162-Z12-CD/MZ12-HD4-CD, BIOS R10 06/04/2020 [ 1194.119591][ T0] Call Trace: [ 1194.130233][ T0] dump_stack+0x9a/0xcc [ 1194.141617][ T0] ___might_sleep.cold+0x180/0x1b0 [ 1194.153825][ T0] mempool_alloc+0x16b/0x300 [ 1194.165313][ T0] ? remove_element+0x160/0x160 [ 1194.176961][ T0] ? blk_mq_end_request+0x4b/0x490 [ 1194.188778][ T0] crypt_convert+0x27f6/0x45f0 [dm_crypt] [ 1194.201024][ T0] ? rcu_read_lock_sched_held+0x3f/0x70 [ 1194.212906][ T0] ? module_assert_mutex_or_preempt+0x3e/0x70 [ 1194.225318][ T0] ? __module_address.part.0+0x1b/0x3a0 [ 1194.237212][ T0] ? is_kernel_percpu_address+0x5b/0x190 [ 1194.249238][ T0] ? crypt_iv_tcw_ctr+0x4a0/0x4a0 [dm_crypt] [ 1194.261593][ T0] ? is_module_address+0x25/0x40 [ 1194.272905][ T0] ? static_obj+0x8a/0xc0 [ 1194.283582][ T0] ? lockdep_init_map_waits+0x26a/0x700 [ 1194.295570][ T0] ? __raw_spin_lock_init+0x39/0x110 [ 1194.307330][ T0] kcryptd_crypt_read_convert+0x31c/0x560 [dm_crypt] [ 1194.320496][ T0] ? kcryptd_queue_crypt+0x1be/0x380 [dm_crypt] [ 1194.333203][ T0] blk_update_request+0x6d7/0x1500 [ 1194.344841][ T0] ? blk_mq_trigger_softirq+0x190/0x190 [ 1194.356831][ T0] blk_mq_end_request+0x4b/0x490 [ 1194.367994][ T0] ? blk_mq_trigger_softirq+0x190/0x190 [ 1194.379693][ T0] flush_smp_call_function_queue+0x24b/0x560 [ 1194.391847][ T0] flush_smp_call_function_from_idle+0x59/0x80 [ 1194.403969][ T0] do_idle+0x287/0x450 [ 1194.413891][ T0] ? arch_cpu_idle_exit+0x40/0x40 [ 1194.424716][ T0] ? lockdep_hardirqs_on_prepare+0x286/0x3f0 [ 1194.436399][ T0] ? _raw_spin_unlock_irqrestore+0x39/0x40 [ 1194.447759][ T0] cpu_startup_entry+0x19/0x20 [ 1194.458038][ T0] secondary_startup_64_no_verify+0xb0/0xbb IO completion can be queued to a different CPU by the block subsystem as a "call single function/data". The CPU may run these routines from the idle task, but it does so with interrupts disabled. It is not a good idea to do decryption with irqs disabled even in an idle task context, so just defer it to a tasklet (as is done with requests from hard irqs). Fixes: 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues") Cc: stable@vger.kernel.org # v5.9+ Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-01-12dm crypt: do not call bio_endio() from the dm-crypt taskletIgnat Korchagin1-1/+23
Sometimes, when dm-crypt executes decryption in a tasklet, we may get "BUG: KASAN: use-after-free in tasklet_action_common.constprop..." with a kasan-enabled kernel. When the decryption fully completes in the tasklet, dm-crypt will call bio_endio(), which in turn will call clone_endio() from dm.c core code. That function frees the resources associated with the bio, including per bio private structures. For dm-crypt it will free the current struct dm_crypt_io, which contains our tasklet object, causing use-after-free, when the tasklet is being dequeued by the kernel. To avoid this, do not call bio_endio() from the current tasklet context, but delay its execution to the dm-crypt IO workqueue. Fixes: 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues") Cc: <stable@vger.kernel.org> # v5.9+ Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-01-04dm crypt: use GFP_ATOMIC when allocating crypto requests from softirqIgnat Korchagin1-10/+25
Commit 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues") made it possible for some code paths in dm-crypt to be executed in softirq context, when the underlying driver processes IO requests in interrupt/softirq context. In this case sometimes when allocating a new crypto request we may get a stacktrace like below: [ 210.103008][ C0] BUG: sleeping function called from invalid context at mm/mempool.c:381 [ 210.104746][ C0] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2602, name: fio [ 210.106599][ C0] CPU: 0 PID: 2602 Comm: fio Tainted: G W 5.10.0+ #50 [ 210.108331][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015 [ 210.110212][ C0] Call Trace: [ 210.110921][ C0] <IRQ> [ 210.111527][ C0] dump_stack+0x7d/0xa3 [ 210.112411][ C0] ___might_sleep.cold+0x122/0x151 [ 210.113527][ C0] mempool_alloc+0x16b/0x2f0 [ 210.114524][ C0] ? __queue_work+0x515/0xde0 [ 210.115553][ C0] ? mempool_resize+0x700/0x700 [ 210.116586][ C0] ? crypt_endio+0x91/0x180 [ 210.117479][ C0] ? blk_update_request+0x757/0x1150 [ 210.118513][ C0] ? blk_mq_end_request+0x4b/0x480 [ 210.119572][ C0] ? blk_done_softirq+0x21d/0x340 [ 210.120628][ C0] ? __do_softirq+0x190/0x611 [ 210.121626][ C0] crypt_convert+0x29f9/0x4c00 [ 210.122668][ C0] ? _raw_spin_lock_irqsave+0x87/0xe0 [ 210.123824][ C0] ? kasan_set_track+0x1c/0x30 [ 210.124858][ C0] ? crypt_iv_tcw_ctr+0x4a0/0x4a0 [ 210.125930][ C0] ? kmem_cache_free+0x104/0x470 [ 210.126973][ C0] ? crypt_endio+0x91/0x180 [ 210.127947][ C0] kcryptd_crypt_read_convert+0x30e/0x420 [ 210.129165][ C0] blk_update_request+0x757/0x1150 [ 210.130231][ C0] blk_mq_end_request+0x4b/0x480 [ 210.131294][ C0] blk_done_softirq+0x21d/0x340 [ 210.132332][ C0] ? _raw_spin_lock+0x81/0xd0 [ 210.133289][ C0] ? blk_mq_stop_hw_queue+0x30/0x30 [ 210.134399][ C0] ? _raw_read_lock_irq+0x40/0x40 [ 210.135458][ C0] __do_softirq+0x190/0x611 [ 210.136409][ C0] ? handle_edge_irq+0x221/0xb60 [ 210.137447][ C0] asm_call_irq_on_stack+0x12/0x20 [ 210.138507][ C0] </IRQ> [ 210.139118][ C0] do_softirq_own_stack+0x37/0x40 [ 210.140191][ C0] irq_exit_rcu+0x110/0x1b0 [ 210.141151][ C0] common_interrupt+0x74/0x120 [ 210.142171][ C0] asm_common_interrupt+0x1e/0x40 Fix this by allocating crypto requests with GFP_ATOMIC mask in interrupt context. Fixes: 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues") Cc: stable@vger.kernel.org # v5.9+ Reported-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Acked-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-01-04dm crypt: do not wait for backlogged crypto request completion in softirqIgnat Korchagin1-5/+98
Commit 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues") made it possible for some code paths in dm-crypt to be executed in softirq context, when the underlying driver processes IO requests in interrupt/softirq context. When Crypto API backlogs a crypto request, dm-crypt uses wait_for_completion to avoid sending further requests to an already overloaded crypto driver. However, if the code is executing in softirq context, we might get the following stacktrace: [ 210.235213][ C0] BUG: scheduling while atomic: fio/2602/0x00000102 [ 210.236701][ C0] Modules linked in: [ 210.237566][ C0] CPU: 0 PID: 2602 Comm: fio Tainted: G W 5.10.0+ #50 [ 210.239292][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015 [ 210.241233][ C0] Call Trace: [ 210.241946][ C0] <IRQ> [ 210.242561][ C0] dump_stack+0x7d/0xa3 [ 210.243466][ C0] __schedule_bug.cold+0xb3/0xc2 [ 210.244539][ C0] __schedule+0x156f/0x20d0 [ 210.245518][ C0] ? io_schedule_timeout+0x140/0x140 [ 210.246660][ C0] schedule+0xd0/0x270 [ 210.247541][ C0] schedule_timeout+0x1fb/0x280 [ 210.248586][ C0] ? usleep_range+0x150/0x150 [ 210.249624][ C0] ? unpoison_range+0x3a/0x60 [ 210.250632][ C0] ? ____kasan_kmalloc.constprop.0+0x82/0xa0 [ 210.251949][ C0] ? unpoison_range+0x3a/0x60 [ 210.252958][ C0] ? __prepare_to_swait+0xa7/0x190 [ 210.254067][ C0] do_wait_for_common+0x2ab/0x370 [ 210.255158][ C0] ? usleep_range+0x150/0x150 [ 210.256192][ C0] ? bit_wait_io_timeout+0x160/0x160 [ 210.257358][ C0] ? blk_update_request+0x757/0x1150 [ 210.258582][ C0] ? _raw_spin_lock_irq+0x82/0xd0 [ 210.259674][ C0] ? _raw_read_unlock_irqrestore+0x30/0x30 [ 210.260917][ C0] wait_for_completion+0x4c/0x90 [ 210.261971][ C0] crypt_convert+0x19a6/0x4c00 [ 210.263033][ C0] ? _raw_spin_lock_irqsave+0x87/0xe0 [ 210.264193][ C0] ? kasan_set_track+0x1c/0x30 [ 210.265191][ C0] ? crypt_iv_tcw_ctr+0x4a0/0x4a0 [ 210.266283][ C0] ? kmem_cache_free+0x104/0x470 [ 210.267363][ C0] ? crypt_endio+0x91/0x180 [ 210.268327][ C0] kcryptd_crypt_read_convert+0x30e/0x420 [ 210.269565][ C0] blk_update_request+0x757/0x1150 [ 210.270563][ C0] blk_mq_end_request+0x4b/0x480 [ 210.271680][ C0] blk_done_softirq+0x21d/0x340 [ 210.272775][ C0] ? _raw_spin_lock+0x81/0xd0 [ 210.273847][ C0] ? blk_mq_stop_hw_queue+0x30/0x30 [ 210.275031][ C0] ? _raw_read_lock_irq+0x40/0x40 [ 210.276182][ C0] __do_softirq+0x190/0x611 [ 210.277203][ C0] ? handle_edge_irq+0x221/0xb60 [ 210.278340][ C0] asm_call_irq_on_stack+0x12/0x20 [ 210.279514][ C0] </IRQ> [ 210.280164][ C0] do_softirq_own_stack+0x37/0x40 [ 210.281281][ C0] irq_exit_rcu+0x110/0x1b0 [ 210.282286][ C0] common_interrupt+0x74/0x120 [ 210.283376][ C0] asm_common_interrupt+0x1e/0x40 [ 210.284496][ C0] RIP: 0010:_aesni_enc1+0x65/0xb0 Fix this by making crypt_convert function reentrant from the point of a single bio and make dm-crypt defer further bio processing to a workqueue, if Crypto API backlogs a request in interrupt context. Fixes: 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues") Cc: stable@vger.kernel.org # v5.9+ Signed-off-by: Ignat Korchagin <ignat@cloudflare.com> Acked-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>