starfive-tech/linux.git - StarFive Tech Linux Kernel for VisionFive (JH7110) boards (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2024-02-01	iomap: map multiple blocks at a time	Christoph Hellwig	1	-35/+81
	The ->map_blocks interface returns a valid range for writeback, but we still call back into it for every block, which is a bit inefficient. Change iomap_writepage_map to use the valid range in the map until the end of the folio or the dirty range inside the folio instead of calling back into every block. Note that the range is not used over folio boundaries as we need to be able to check the mapping sequence count under the folio lock. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231207072710.176093-14-hch@lst.de Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-01	iomap: submit ioends immediately	Christoph Hellwig	1	-86/+76
	Currently the writeback code delays submitting fill ioends until we reach the end of the folio. The reason for that is that otherwise the end I/O handler could clear the writeback bit before we've even finished submitting all I/O for the folio. Add a bias to ifs->write_bytes_pending while we are submitting I/O for a folio so that it never reaches zero until all I/O is completed to prevent the early writeback bit clearing, and remove the now superfluous submit_list. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231207072710.176093-13-hch@lst.de Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-01	iomap: factor out a iomap_writepage_map_block helper	Christoph Hellwig	1	-27/+43
	Split the loop body that calls into the file system to map a block and add it to the ioend into a separate helper to prefer for refactoring of the surrounding code. Note that this was the only place in iomap_writepage_map that could return an error, so include the call to ->discard_folio into the new helper as that will help to avoid code duplication in the future. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231207072710.176093-12-hch@lst.de Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-01	iomap: only call mapping_set_error once for each failed bio	Christoph Hellwig	1	-13/+14
	Instead of clling mapping_set_error once per folio, only do that once per bio, and consolidate all the writeback error handling code in iomap_finish_ioend. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231207072710.176093-11-hch@lst.de Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-01	iomap: don't chain bios	Christoph Hellwig	2	-70/+26
	Back in the days when a single bio could only be filled to the hardware limits, and we scheduled a work item for each bio completion, chaining multiple bios for a single ioend made a lot of sense to reduce the number of completions. But these days bios can be filled until we reach the number of vectors or total size limit, which means we can always fit at least 1 megabyte worth of data in the worst case, but usually a lot more due to large folios. The only thing bio chaining is buying us now is to reduce the size of the allocation from an ioend with an embedded bio into a plain bio, which is a 52 bytes differences on 64-bit systems. This is not worth the added complexity, so remove the bio chaining and only use the bio embedded into the ioend. This will help to simplify further changes to the iomap writeback code. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231207072710.176093-10-hch@lst.de Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-01	iomap: move the iomap_sector sector calculation out of iomap_add_to_ioend	Christoph Hellwig	1	-14/+11
	The calculation in iomap_sector is pretty trivial and most of the time iomap_add_to_ioend only callers either iomap_can_add_to_ioend or iomap_alloc_ioend from a single invocation. Calculate the sector in the two lower level functions and stop passing it from iomap_add_to_ioend and update the iomap_alloc_ioend argument passing order to match that of iomap_add_to_ioend. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231207072710.176093-9-hch@lst.de Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-01	iomap: clean up the iomap_alloc_ioend calling convention	Christoph Hellwig	1	-6/+5
	Switch to the same argument order as iomap_writepage_map and remove the ifs argument that can be trivially recalculated. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231207072710.176093-8-hch@lst.de Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-01	iomap: move all remaining per-folio logic into iomap_writepage_map	Christoph Hellwig	1	-23/+11
	Move the tracepoint and the iomap check from iomap_do_writepage into iomap_writepage_map. This keeps all logic in one places, and leaves iomap_do_writepage just as the wrapper for the callback conventions of write_cache_pages, which will go away when that is converted to an iterator. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231207072710.176093-7-hch@lst.de Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-01	iomap: factor out a iomap_writepage_handle_eof helper	Christoph Hellwig	1	-66/+62
	Most of iomap_do_writepage is dedidcated to handling a folio crossing or beyond i_size. Split this is into a separate helper and update the commens to deal with folios instead of pages and make them more readable. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231207072710.176093-6-hch@lst.de Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-01	iomap: move the PF_MEMALLOC check to iomap_writepages	Christoph Hellwig	1	-16/+8
	The iomap writepage implementation has been removed in commit 478af190cb6c ("iomap: remove iomap_writepage") and this code is now only called through ->writepages which never happens from memory reclaim. Nove the check from iomap_do_writepage to iomap_writepages so that is only called once per ->writepage invocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231207072710.176093-5-hch@lst.de Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-01	iomap: move the io_folios field out of struct iomap_ioend	Christoph Hellwig	1	-3/+4
	The io_folios member in struct iomap_ioend counts the number of folios added to an ioend. It is only used at submission time and can thus be moved to iomap_writepage_ctx instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231207072710.176093-4-hch@lst.de Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-01	iomap: treat inline data in iomap_writepage_map as an I/O error	Christoph Hellwig	1	-2/+4
	iomap_writepage_map aready warns about inline data, but then just ignores it. Treat it as an error and return -EIO. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231207072710.176093-3-hch@lst.de Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-01	iomap: clear the per-folio dirty bits on all writeback failures	Christoph Hellwig	1	-7/+11
	write_cache_pages always clear the page dirty bit before calling into the file systems, and leaves folios with a writeback failure without the dirty bit after return. We also clear the per-block writeback bits for writeback failures unless no I/O has submitted, which will leave the folio in an inconsistent state where it doesn't have the folio dirty, but one or more per-block dirty bits. This seems to be due the place where the iomap_clear_range_dirty call was inserted into the existing not very clearly structured code when adding per-block dirty bit support and not actually intentional. Switch to always clearing the dirty on writeback failure. Fixes: 4ce02c679722 ("iomap: Add per-block dirty state tracking to improve performance") Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231207072710.176093-2-hch@lst.de Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-01-22	Merge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs	Linus Torvalds	78	-1426/+1629
	Pull more bcachefs updates from Kent Overstreet: "Some fixes, Some refactoring, some minor features: - Assorted prep work for disk space accounting rewrite - BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this makes our trigger context more explicit - A few fixes to avoid excessive transaction restarts on multithreaded workloads: fstests (in addition to ktest tests) are now checking slowpath counters, and that's shaking out a few bugs - Assorted tracepoint improvements - Starting to break up bcachefs_format.h and move on disk types so they're with the code they belong to; this will make room to start documenting the on disk format better. - A few minor fixes" * tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs: (46 commits) bcachefs: Improve inode_to_text() bcachefs: logged_ops_format.h bcachefs: reflink_format.h bcachefs; extents_format.h bcachefs: ec_format.h bcachefs: subvolume_format.h bcachefs: snapshot_format.h bcachefs: alloc_background_format.h bcachefs: xattr_format.h bcachefs: dirent_format.h bcachefs: inode_format.h bcachefs; quota_format.h bcachefs: sb-counters_format.h bcachefs: counters.c -> sb-counters.c bcachefs: comment bch_subvolume bcachefs: bch_snapshot::btime bcachefs: add missing __GFP_NOWARN bcachefs: opts->compression can now also be applied in the background bcachefs: Prep work for variable size btree node buffers bcachefs: grab s_umount only if snapshotting ...
2024-01-21	bcachefs: Improve inode_to_text()	Kent Overstreet	1	-7/+18
	Add line breaks - inode_to_text() is now much easier to read. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: logged_ops_format.h	Kent Overstreet	2	-27/+31
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: reflink_format.h	Kent Overstreet	3	-47/+48
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs; extents_format.h	Kent Overstreet	2	-279/+284
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: ec_format.h	Kent Overstreet	2	-16/+20
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: subvolume_format.h	Kent Overstreet	2	-32/+36
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: snapshot_format.h	Kent Overstreet	2	-33/+37
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: alloc_background_format.h	Kent Overstreet	2	-93/+94
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: xattr_format.h	Kent Overstreet	2	-15/+20
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: dirent_format.h	Kent Overstreet	2	-39/+43
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: inode_format.h	Kent Overstreet	2	-164/+167
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs; quota_format.h	Kent Overstreet	2	-42/+48
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: sb-counters_format.h	Kent Overstreet	2	-95/+100
	bcachefs_format.h has gotten too big; let's do some organizing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: counters.c -> sb-counters.c	Kent Overstreet	5	-8/+7
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: comment bch_subvolume	Kent Overstreet	1	-0/+3
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: bch_snapshot::btime	Kent Overstreet	2	-0/+3
	Add a field to bch_snapshot for creation time; this will be important when we start exposing the snapshot tree to userspace. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: add missing __GFP_NOWARN	Kent Overstreet	1	-1/+1
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: opts->compression can now also be applied in the background	Kent Overstreet	11	-23/+24
	The "apply this compression method in the background" paths now use the compression option if background_compression is not set; this means that setting or changing the compression option will cause existing data to be compressed accordingly in the background. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: Prep work for variable size btree node buffers	Kent Overstreet	18	-97/+87
	bcachefs btree nodes are big - typically 256k - and btree roots are pinned in memory. As we're now up to 18 btrees, we now have significant memory overhead in mostly empty btree roots. And in the future we're going to start enforcing that certain btree node boundaries exist, to solve lock contention issues - analagous to XFS's AGIs. Thus, we need to start allocating smaller btree node buffers when we can. This patch changes code that refers to the filesystem constant c->opts.btree_node_size to refer to the btree node buffer size - btree_buf_bytes() - where appropriate. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: grab s_umount only if snapshotting	Su Yue	1	-6/+5
	When I was testing mongodb over bcachefs with compression, there is a lockdep warning when snapshotting mongodb data volume. $ cat test.sh prog=bcachefs $prog subvolume create /mnt/data $prog subvolume create /mnt/data/snapshots while true;do $prog subvolume snapshot /mnt/data /mnt/data/snapshots/$(date +%s) sleep 1s done $ cat /etc/mongodb.conf systemLog: destination: file logAppend: true path: /mnt/data/mongod.log storage: dbPath: /mnt/data/ lockdep reports: [ 3437.452330] ====================================================== [ 3437.452750] WARNING: possible circular locking dependency detected [ 3437.453168] 6.7.0-rc7-custom+ #85 Tainted: G E [ 3437.453562] ------------------------------------------------------ [ 3437.453981] bcachefs/35533 is trying to acquire lock: [ 3437.454325] ffffa0a02b2b1418 (sb_writers#10){.+.+}-{0:0}, at: filename_create+0x62/0x190 [ 3437.454875] but task is already holding lock: [ 3437.455268] ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs] [ 3437.456009] which lock already depends on the new lock. [ 3437.456553] the existing dependency chain (in reverse order) is: [ 3437.457054] -> #3 (&type->s_umount_key#48){.+.+}-{3:3}: [ 3437.457507] down_read+0x3e/0x170 [ 3437.457772] bch2_fs_file_ioctl+0x232/0xc90 [bcachefs] [ 3437.458206] __x64_sys_ioctl+0x93/0xd0 [ 3437.458498] do_syscall_64+0x42/0xf0 [ 3437.458779] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.459155] -> #2 (&c->snapshot_create_lock){++++}-{3:3}: [ 3437.459615] down_read+0x3e/0x170 [ 3437.459878] bch2_truncate+0x82/0x110 [bcachefs] [ 3437.460276] bchfs_truncate+0x254/0x3c0 [bcachefs] [ 3437.460686] notify_change+0x1f1/0x4a0 [ 3437.461283] do_truncate+0x7f/0xd0 [ 3437.461555] path_openat+0xa57/0xce0 [ 3437.461836] do_filp_open+0xb4/0x160 [ 3437.462116] do_sys_openat2+0x91/0xc0 [ 3437.462402] __x64_sys_openat+0x53/0xa0 [ 3437.462701] do_syscall_64+0x42/0xf0 [ 3437.462982] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.463359] -> #1 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}: [ 3437.463843] down_write+0x3b/0xc0 [ 3437.464223] bch2_write_iter+0x5b/0xcc0 [bcachefs] [ 3437.464493] vfs_write+0x21b/0x4c0 [ 3437.464653] ksys_write+0x69/0xf0 [ 3437.464839] do_syscall_64+0x42/0xf0 [ 3437.465009] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.465231] -> #0 (sb_writers#10){.+.+}-{0:0}: [ 3437.465471] __lock_acquire+0x1455/0x21b0 [ 3437.465656] lock_acquire+0xc6/0x2b0 [ 3437.465822] mnt_want_write+0x46/0x1a0 [ 3437.465996] filename_create+0x62/0x190 [ 3437.466175] user_path_create+0x2d/0x50 [ 3437.466352] bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs] [ 3437.466617] __x64_sys_ioctl+0x93/0xd0 [ 3437.466791] do_syscall_64+0x42/0xf0 [ 3437.466957] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.467180] other info that might help us debug this: [ 3437.469670] 2 locks held by bcachefs/35533: other info that might help us debug this: [ 3437.467507] Chain exists of: sb_writers#10 --> &c->snapshot_create_lock --> &type->s_umount_key#48 [ 3437.467979] Possible unsafe locking scenario: [ 3437.468223] CPU0 CPU1 [ 3437.468405] ---- ---- [ 3437.468585] rlock(&type->s_umount_key#48); [ 3437.468758] lock(&c->snapshot_create_lock); [ 3437.469030] lock(&type->s_umount_key#48); [ 3437.469291] rlock(sb_writers#10); [ 3437.469434] * DEADLOCK * [ 3437.469670] 2 locks held by bcachefs/35533: [ 3437.469838] #0: ffffa0a02ce00a88 (&c->snapshot_create_lock){++++}-{3:3}, at: bch2_fs_file_ioctl+0x1e3/0xc90 [bcachefs] [ 3437.470294] #1: ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs] [ 3437.470744] stack backtrace: [ 3437.470922] CPU: 7 PID: 35533 Comm: bcachefs Kdump: loaded Tainted: G E 6.7.0-rc7-custom+ #85 [ 3437.471313] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 [ 3437.471694] Call Trace: [ 3437.471795] <TASK> [ 3437.471884] dump_stack_lvl+0x57/0x90 [ 3437.472035] check_noncircular+0x132/0x150 [ 3437.472202] __lock_acquire+0x1455/0x21b0 [ 3437.472369] lock_acquire+0xc6/0x2b0 [ 3437.472518] ? filename_create+0x62/0x190 [ 3437.472683] ? lock_is_held_type+0x97/0x110 [ 3437.472856] mnt_want_write+0x46/0x1a0 [ 3437.473025] ? filename_create+0x62/0x190 [ 3437.473204] filename_create+0x62/0x190 [ 3437.473380] user_path_create+0x2d/0x50 [ 3437.473555] bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs] [ 3437.473819] ? lock_acquire+0xc6/0x2b0 [ 3437.474002] ? __fget_files+0x2a/0x190 [ 3437.474195] ? __fget_files+0xbc/0x190 [ 3437.474380] ? lock_release+0xc5/0x270 [ 3437.474567] ? __x64_sys_ioctl+0x93/0xd0 [ 3437.474764] ? __pfx_bch2_fs_file_ioctl+0x10/0x10 [bcachefs] [ 3437.475090] __x64_sys_ioctl+0x93/0xd0 [ 3437.475277] do_syscall_64+0x42/0xf0 [ 3437.475454] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 3437.475691] RIP: 0033:0x7f2743c313af ====================================================== In __bch2_ioctl_subvolume_create(), we grab s_umount unconditionally and unlock it at the end of the function. There is a comment "why do we need this lock?" about the lock coming from commit 42d237320e98 ("bcachefs: Snapshot creation, deletion") The reason is that __bch2_ioctl_subvolume_create() calls sync_inodes_sb() which enforce locked s_umount to writeback all dirty nodes before doing snapshot works. Fix it by read locking s_umount for snapshotting only and unlocking s_umount after sync_inodes_sb(). Signed-off-by: Su Yue <glass.su@suse.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: kvfree bch_fs::snapshots in bch2_fs_snapshots_exit	Su Yue	1	-1/+1
	bch_fs::snapshots is allocated by kvzalloc in __snapshot_t_mut. It should be freed by kvfree not kfree. Or umount will triger: [ 406.829178 ] BUG: unable to handle page fault for address: ffffe7b487148008 [ 406.830676 ] #PF: supervisor read access in kernel mode [ 406.831643 ] #PF: error_code(0x0000) - not-present page [ 406.832487 ] PGD 0 P4D 0 [ 406.832898 ] Oops: 0000 [#1] PREEMPT SMP PTI [ 406.833512 ] CPU: 2 PID: 1754 Comm: umount Kdump: loaded Tainted: G OE 6.7.0-rc7-custom+ #90 [ 406.834746 ] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 [ 406.835796 ] RIP: 0010:kfree+0x62/0x140 [ 406.836197 ] Code: 80 48 01 d8 0f 82 e9 00 00 00 48 c7 c2 00 00 00 80 48 2b 15 78 9f 1f 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 56 9f 1f 01 <48> 8b 50 08 48 89 c7 f6 c2 01 0f 85 b0 00 00 00 66 90 48 8b 07 f6 [ 406.837810 ] RSP: 0018:ffffb9d641607e48 EFLAGS: 00010286 [ 406.838213 ] RAX: ffffe7b487148000 RBX: ffffb9d645200000 RCX: ffffb9d641607dc4 [ 406.838738 ] RDX: 000065bb00000000 RSI: ffffffffc0d88b84 RDI: ffffb9d645200000 [ 406.839217 ] RBP: ffff9a4625d00068 R08: 0000000000000001 R09: 0000000000000001 [ 406.839650 ] R10: 0000000000000001 R11: 000000000000001f R12: ffff9a4625d4da80 [ 406.840055 ] R13: ffff9a4625d00000 R14: ffffffffc0e2eb20 R15: 0000000000000000 [ 406.840451 ] FS: 00007f0a264ffb80(0000) GS:ffff9a4e2d500000(0000) knlGS:0000000000000000 [ 406.840851 ] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 406.841125 ] CR2: ffffe7b487148008 CR3: 000000018c4d2000 CR4: 00000000000006f0 [ 406.841464 ] Call Trace: [ 406.841583 ] <TASK> [ 406.841682 ] ? __die+0x1f/0x70 [ 406.841828 ] ? page_fault_oops+0x159/0x470 [ 406.842014 ] ? fixup_exception+0x22/0x310 [ 406.842198 ] ? exc_page_fault+0x1ed/0x200 [ 406.842382 ] ? asm_exc_page_fault+0x22/0x30 [ 406.842574 ] ? bch2_fs_release+0x54/0x280 [bcachefs] [ 406.842842 ] ? kfree+0x62/0x140 [ 406.842988 ] ? kfree+0x104/0x140 [ 406.843138 ] bch2_fs_release+0x54/0x280 [bcachefs] [ 406.843390 ] kobject_put+0xb7/0x170 [ 406.843552 ] deactivate_locked_super+0x2f/0xa0 [ 406.843756 ] cleanup_mnt+0xba/0x150 [ 406.843917 ] task_work_run+0x59/0xa0 [ 406.844083 ] exit_to_user_mode_prepare+0x197/0x1a0 [ 406.844302 ] syscall_exit_to_user_mode+0x16/0x40 [ 406.844510 ] do_syscall_64+0x4e/0xf0 [ 406.844675 ] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 406.844907 ] RIP: 0033:0x7f0a2664e4fb Signed-off-by: Su Yue <glass.su@suse.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: bios must be 512 byte algined	Kent Overstreet	1	-0/+4
	Fixes: 023f9ac9f70f bcachefs: Delete dio read alignment check Reported-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: remove redundant variable tmp	Colin Ian King	1	-3/+1
	The variable tmp is being assigned a value but it isn't being read afterwards. The assignment is redundant and so tmp can be removed. Cleans up clang scan build warning: warning: Although the value stored to 'ret' is used in the enclosing expression, the value is never actually read from 'ret' [deadcode.DeadStores] Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: Improve trace_trans_restart_relock	Kent Overstreet	5	-24/+44
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: Fix excess transaction restarts in __bchfs_fallocate()	Kent Overstreet	4	-16/+35
	drop_locks_do() should not be used in a fastpath without first trying the do in nonblocking mode - the unlock and relock will cause excessive transaction restarts and potentially livelocking with other threads that are contending for the same locks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: extents_to_bp_state	Kent Overstreet	1	-48/+41
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: bkey_and_val_eq()	Kent Overstreet	1	-3/+8
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: Better journal tracepoints	Kent Overstreet	2	-60/+79
	Factor out bch2_journal_bufs_to_text(), and use it in the journal_entry_full() tracepoint; when we can't get a journal reservation we need to know the outstanding journal entry sizes to know if the problem is due to excessive flushing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: Print size of superblock with space allocated	Kent Overstreet	1	-1/+3
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: Avoid flushing the journal in the discard path	Kent Overstreet	1	-19/+41
	When issuing discards, we may need to flush the journal if there's too many buckets that can't be discarded until a journal flush. But the heuristic was bad; we should be comparing the number of buckets that need to flushes against the number of free buckets, not the number of buckets we saw. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: Improve move_extent tracepoint	Kent Overstreet	5	-7/+48
	Also print out the data_opts, so that we can see what specifically is being done to an extent. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: Add missing bch2_moving_ctxt_flush_all()	Kent Overstreet	1	-0/+1
	This fixes a bug with rebalance IOs getting stuck with reads completed, but writes never being issued. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: Re-add move_extent_write tracepoint	Kent Overstreet	2	-23/+20
	It appears this was accidentally deleted at some point - also, do a bit of cleanup. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: bch2_kthread_io_clock_wait() no longer sleeps until full amount	Kent Overstreet	1	-2/+2
	Drop t he loop in bch2_kthread_io_clock_wait(): this allows the code that uses it to be woken up for other reasons, and fixes a bug where rebalance wouldn't wake up when a scan was requested. This raises the possibility of spurious wakeups, but callers should always be able to handle that reasonably well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: Add .val_to_text() for KEY_TYPE_cookie	Kent Overstreet	1	-0/+9
	Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21	bcachefs: Don't pass memcmp() as a pointer	Kent Overstreet	1	-2/+8
	Some (buggy!) compilers have issues with this. Fixes: https://github.com/koverstreet/bcachefs/issues/625 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>