summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-08-21block: call into the file system for bdev_mark_deadChristoph Hellwig4-44/+40
Combine the newly merged bdev_mark_dead helper with the existing mark_dead holder operation so that all operations that invalidate a device that is dead or being removed now go through the holder ops. This allows file systems to explicitly shutdown either ASAP (for a surprise removal) or after writing back data (for an orderly removal), and do so not only for the main device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-15-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-21block: consolidate __invalidate_device and fsync_bdevChristoph Hellwig7-20/+37
We currently have two interfaces that take a block_devices and the find a mounted file systems to flush or invaldidate data on it. Both are a bit problematic because they only work for the "main" block devices that is used as s_dev for the super_block, and because they don't call into the file system at all. Merge the two into a new bdev_mark_dead helper that does both the syncing and invalidation and which is properly documented. This is in preparation of merging the functionality into the ->mark_dead holder operation so that it will work on additional block devices used by a file systems and give us a single entry point for invalidation of dead devices or media. Note that a single standalone fsync_bdev call for an obscure ioctl remains for now, but that one will also be deal with in a bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-14-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-21block: drop the "busy inodes on changed media" log messageChristoph Hellwig1-6/+2
This message isn't exactly helpful, and file systems already print way more useful messages when shut down while active. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-13-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-21dasd: also call __invalidate_device when setting the device offlineChristoph Hellwig1-3/+2
Don't just write out the data, but also invalidate all caches when setting the device offline. Stop canceling the offlining when writeback fails as there is no way to recover from that anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-12-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-21amiflop: don't call fsync_bdev in FDFMTBEGChristoph Hellwig1-1/+0
FDFMTBEG is used by fdformat to calibrate before formatting a disk. Neither the atari nor PC floppy driver sync data, which also seems a bit pointless for a disk hat is about to get formatted. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-11-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-21floppy: call disk_force_media_change when changing the formatChristoph Hellwig1-1/+1
While changing the format of a floppy isn't strictly speaking a media change, the effects are the same in that the content of the media changes and the diskseq should be increased and uevent should be sent. Switch from calling __invalidate_device to disk_force_media_change to do so. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-10-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-21block: simplify the disk_force_media_change interfaceChristoph Hellwig3-15/+8
Hard code the events to DISK_EVENT_MEDIA_CHANGE as that is the only useful use case, and drop the superfluous return value. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-9-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-21nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctlChristoph Hellwig1-5/+3
nbd_clear_sock_ioctl kills the socket and with that the block device. Instead of just invalidating file system buffers, mark the device as dead, which will also invalidate the buffers as part of the proper shutdown sequence. This also includes invalidating partitions if there are any. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-8-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11xfs use fs_holder_ops for the log and RT devicesChristoph Hellwig1-15/+4
Use the generic fs_holder_ops to shut down the file system when the log or RT device goes away instead of duplicating the logic. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230802154131.2221419-13-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11xfs: drop s_umount over opening the log and RT devicesChristoph Hellwig1-4/+14
Just like get_tree_bdev needs to drop s_umount when opening the main device, we need to do the same for the xfs log and RT devices to avoid a potential lock order reversal with s_unmount for the mark_dead path. It might be preferable to just drop s_umount over ->fill_super entirely, but that will require a fairly massive audit first, so we'll do the easy version here first. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230802154131.2221419-12-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11ext4: use fs_holder_ops for the log deviceChristoph Hellwig1-10/+1
Use the generic fs_holder_ops to shut down the file system when the log device goes away instead of duplicating the logic. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230802154131.2221419-11-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11ext4: drop s_umount over opening the log deviceChristoph Hellwig1-0/+3
Just like get_tree_bdev needs to drop s_umount when opening the main device, we need to do the same for the ext4 log device to avoid a potential lock order reversal with s_unmount for the mark_dead path. It might be preferable to just drop s_umount over ->fill_super entirely, but that will require a fairly massive audit first, so we'll do the easy version here first. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-10-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11fs: export fs_holder_opsChristoph Hellwig2-1/+4
Export fs_holder_ops so that file systems that open additional block devices can use it as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-9-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11fs: stop using get_super in fs_mark_deadChristoph Hellwig1-4/+26
fs_mark_dead currently uses get_super to find the superblock for the block device that is going away. This means it is limited to the main device stored in sb->s_dev, leading to a lot of code duplication for file systems that can use multiple block devices. Now that the holder for all block devices used by file systems is set to the super_block, we can instead look at that holder and then check if the file system is born and active, so do that instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-8-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11fs: use the super_block as holder when mounting file systemsChristoph Hellwig5-11/+10
The file system type is not a very useful holder as it doesn't allow us to go back to the actual file system instance. Pass the super_block instead which is useful when passed back to the file system driver. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-7-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10ext4: make the IS_EXT2_SB/IS_EXT3_SB checks more robustChristoph Hellwig1-2/+2
Check for sb->s_type which is the right place to look at the file system type, not the holder, which is just an implementation detail in the VFS helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-6-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10nilfs2: use setup_bdev_super to de-duplicate the mount codeChristoph Hellwig1-51/+30
Use the generic setup_bdev_super helper to open the main block device and do various bits of superblock setup instead of duplicating the logic. This includes moving to the new scheme implemented in common code that only opens the block device after the superblock has allocated. It does not yet convert nilfs2 to the new mount API, but doing so will become a bit simpler after this first step. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Message-Id: <20230802154131.2221419-3-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10fs: export setup_bdev_superChristoph Hellwig2-1/+4
We'll want to use setup_bdev_super instead of duplicating it in nilfs2. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-2-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10fs: open block device after superblock creationJan Kara3-99/+107
Currently get_tree_bdev and mount_bdev open the block device before committing to allocating a super block. That creates problems for restricting the number of writers to a device, and also leads to a unusual and not very helpful holder (the fs_type). Reorganize the super block code to first look whether the superblock for a particular device does already exist and open the block device only if it doesn't. [hch: port to before the bdev_handle changes, duplicate the bdev read-only check from blkdev_get_by_path, extend the fsfree_mutex coverage to protect against freezes, fix an open bdev leak when the bdev is frozen, use the bdev local variable more, rename the s variable to sb to be more descriptive] [brauner: remove references to mounts as they're mostly irrelevant] [brauner & hch: fold fixes for romfs and cramfs for syzbot+2faac0423fdc9692822b@syzkaller.appspotmail.com] Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Message-Id: <20230724175145.201318-1-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10ntfs3: free the sbi in ->kill_sbChristoph Hellwig1-13/+12
As a rule of thumb everything allocated to the fs_context and moved into the super_block should be freed by ->kill_sb so that the teardown handling doesn't need to be duplicated between the fill_super error path and put_super. Implement an ntfs3-specific kill_sb method to do that. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230809220545.1308228-14-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10ntfs3: don't call sync_blockdev in ntfs_put_superChristoph Hellwig1-2/+0
kill_block_super will call sync_blockdev just a tad later already. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230809220545.1308228-13-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10ntfs3: rename put_ntfs ntfs3_free_sbiChristoph Hellwig1-5/+5
put_ntfs is a rather unconventional name for a function that frees the sbi and associated resources. Give it a more descriptive name and drop the duplicate name in the top of the function comment. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230809220545.1308228-12-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10exfat: free the sbi and iocharset in ->kill_sbChristoph Hellwig1-10/+18
As a rule of thumb everything allocated to the fs_context and moved into the super_block should be freed by ->kill_sb so that the teardown handling doesn't need to be duplicated between the fill_super error path and put_super. Implement an exfat-specific kill_sb method to do that and share the code with the mount contex free helper for the mount error handling case. Signed-off-by: Christoph Hellwig <hch@lst.de> Message-Id: <20230809220545.1308228-11-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10exfat: don't RCU-free the sbiChristoph Hellwig2-13/+4
There are no RCU critical sections for accessing any information in the sbi, so drop the call_rcu indirection for freeing the sbi. Signed-off-by: Christoph Hellwig <hch@lst.de> Message-Id: <20230809220545.1308228-10-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10ext4: close the external journal device in ->kill_sbChristoph Hellwig1-25/+25
blkdev_put must not be called under sb->s_umount to avoid a lock order reversal with disk->open_mutex. Move closing the external journal device into ->kill_sb to archive that. Signed-off-by: Christoph Hellwig <hch@lst.de> Message-Id: <20230809220545.1308228-9-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10xfs: document the invalidate_bdev call in invalidate_bdevChristoph Hellwig1-0/+26
Copy and paste the commit message from Darrick into a comment to explain the seemingly odd invalidate_bdev in xfs_shutdown_devices. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230809220545.1308228-8-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10xfs: close the external block devices in xfs_mount_freeChristoph Hellwig2-16/+22
blkdev_put must not be called under sb->s_umount to avoid a lock order reversal with disk->open_mutex. Move closing the buftargs into ->kill_sb to archive that. Note that the flushing of the disk caches and block device mapping invalidated needs to stay in ->put_super as the main block device is closed in kill_block_super already. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230809220545.1308228-7-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10xfs: close the RT and log block devices in xfs_free_buftargChristoph Hellwig1-0/+5
Closing the block devices logically belongs into xfs_free_buftarg, So instead of open coding it in the caller move it there and add a check for the s_bdev so that the main device isn't close as that's done by the VFS helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230809220545.1308228-6-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10xfs: remove xfs_blkdev_putChristoph Hellwig1-13/+5
There isn't much use for this trivial wrapper, especially as the NULL check is only needed in a single call site. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230809220545.1308228-5-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10xfs: free the xfs_mount in ->kill_sbChristoph Hellwig1-9/+11
As a rule of thumb everything allocated to the fs_context and moved into the super_block should be freed by ->kill_sb so that the teardown handling doesn't need to be duplicated between the fill_super error path and put_super. Implement a XFS-specific kill_sb method to do that. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230809220545.1308228-4-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10xfs: remove a superfluous s_fs_info NULL check in xfs_fs_put_superChristoph Hellwig1-4/+0
->put_super is only called when sb->s_root is set, and thus when fill_super succeeds. Thus drop the NULL check that can't happen in xfs_fs_put_super. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230809220545.1308228-3-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-10xfs: reformat the xfs_fs_free prototypeChristoph Hellwig1-1/+2
The xfs_fs_free prototype formatting is a weird mix of the classic XFS style and the Linux style. Fix it up to be consistent. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230809220545.1308228-2-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-09fs, block: remove bdev->bd_superChristoph Hellwig3-5/+0
bdev->bd_super is unused now, remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230807112625.652089-5-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-09ocfs2: stop using bdev->bd_super for journal error loggingChristoph Hellwig1-3/+3
All ocfs2 journal error handling and logging is based on buffer_heads, and the owning inode and thus super_block can be retrieved through bh->b_assoc_map->host. Switch to using that to remove the last users of bdev->bd_super. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Message-Id: <20230807112625.652089-4-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-09ext4: don't use bdev->bd_super in __ext4_journal_get_write_accessChristoph Hellwig1-2/+1
__ext4_journal_get_write_access already has a super_block available, and there is no need to go from that to the bdev to go back to the owning super_block. Signed-off-by: Christoph Hellwig <hch@lst.de> Message-Id: <20230807112625.652089-3-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-09fs: stop using bdev->bd_super in mark_buffer_write_io_errorChristoph Hellwig1-8/+3
bdev->bd_super is a somewhat awkward backpointer from a block device to an owning file system with unclear rules. For the buffer_head code we already have a good backpointer for the inode that the buffer_head is associated with, even if it lives on the block device mapping: b_assoc_map. It is used track dirty buffers associated with an inode but living on the block device mapping like directory buffers in ext4. mark_buffer_write_io_error already uses it for the call to mapping_set_error, and should be doing the same for the per-sb error sequence. Signed-off-by: Christoph Hellwig <hch@lst.de> Message-Id: <20230807112625.652089-2-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-07-09Linux 6.5-rc1v6.5-rc1Linus Torvalds1-2/+2
2023-07-09MAINTAINERS 2: Electric BoogalooLinus Torvalds1-46/+46
We just sorted the entries and fields last release, so just out of a perverse sense of curiosity, I decided to see if we can keep things ordered for even just one release. The answer is "No. No we cannot". I suggest that all kernel developers will need weekly training sessions, involving a lot of Big Bird and Sesame Street. And at the yearly maintainer summit, we will all sing the alphabet song together. I doubt I will keep doing this. At some point "perverse sense of curiosity" turns into just a cold dark place filled with sadness and despair. Repeats: 80e62bc8487b ("MAINTAINERS: re-sort all entries and fields") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-09Merge tag 'dma-mapping-6.5-2023-07-09' of ↵Linus Torvalds1-11/+35
git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fixes from Christoph Hellwig: - swiotlb area sizing fixes (Petr Tesarik) * tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: reduce the number of areas to match actual memory pool size swiotlb: always set the number of areas before allocating the pool
2023-07-09Merge tag 'irq_urgent_for_v6.5_rc1' of ↵Linus Torvalds1-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq update from Borislav Petkov: - Optimize IRQ domain's name assignment * tag 'irq_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqdomain: Use return value of strreplace()
2023-07-09Merge tag 'x86_urgent_for_v6.5_rc1' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fpu fix from Borislav Petkov: - Do FPU AP initialization on Xen PV too which got missed by the recent boot reordering work * tag 'x86_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/xen: Fix secondary processors' FPU initialization
2023-07-09Merge tag 'x86-core-2023-07-09' of ↵Linus Torvalds1-0/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Thomas Gleixner: "A single fix for the mechanism to park CPUs with an INIT IPI. On shutdown or kexec, the kernel tries to park the non-boot CPUs with an INIT IPI. But the same code path is also used by the crash utility. If the CPU which panics is not the boot CPU then it sends an INIT IPI to the boot CPU which resets the machine. Prevent this by validating that the CPU which runs the stop mechanism is the boot CPU. If not, leave the other CPUs in HLT" * tag 'x86-core-2023-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/smp: Don't send INIT to boot CPU
2023-07-09Merge tag 'mips_6.5_1' of ↵Linus Torvalds10-53/+46
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: - fixes for KVM - fix for loongson build and cpu probing - DT fixes * tag 'mips_6.5_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: kvm: Fix build error with KVM_MIPS_DEBUG_COP0_COUNTERS enabled MIPS: dts: add missing space before { MIPS: Loongson: Fix build error when make modules_install MIPS: KVM: Fix NULL pointer dereference MIPS: Loongson: Fix cpu_probe_loongson() again
2023-07-09Merge tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds1-1/+1
Pull xfs fix from Darrick Wong: "Nothing exciting here, just getting rid of a gcc warning that I got tired of seeing when I turn on gcov" * tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix uninit warning in xfs_growfs_data
2023-07-09Merge tag '6.5-rc-smb3-client-fixes-part2' of ↵Linus Torvalds4-5/+72
git://git.samba.org/sfrench/cifs-2.6 Pull more smb client updates from Steve French: - fix potential use after free in unmount - minor cleanup - add worker to cleanup stale directory leases * tag '6.5-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: Add a laundromat thread for cached directories smb: client: remove redundant pointer 'server' cifs: fix session state transition to avoid use-after-free issue
2023-07-09Merge tag 'ntb-6.5' of https://github.com/jonmason/ntbLinus Torvalds10-33/+36
Pull NTB updates from Jon Mason: "Fixes for pci_clean_master, error handling in driver inits, and various other issues/bugs" * tag 'ntb-6.5' of https://github.com/jonmason/ntb: ntb: hw: amd: Fix debugfs_create_dir error checking ntb.rst: Fix copy and paste error ntb_netdev: Fix module_init problem ntb: intel: Remove redundant pci_clear_master ntb: epf: Remove redundant pci_clear_master ntb_hw_amd: Remove redundant pci_clear_master ntb: idt: drop redundant pci_enable_pcie_error_reporting() MAINTAINERS: git://github -> https://github.com for jonmason NTB: EPF: fix possible memory leak in pci_vntb_probe() NTB: ntb_tool: Add check for devm_kcalloc NTB: ntb_transport: fix possible memory leak while device_register() fails ntb: intel: Fix error handling in intel_ntb_pci_driver_init() NTB: amd: Fix error handling in amd_ntb_pci_driver_init() ntb: idt: Fix error handling in idt_pci_driver_init()
2023-07-09mm: lock newly mapped VMA with corrected orderingHugh Dickins1-2/+2
Lockdep is certainly right to complain about (&vma->vm_lock->lock){++++}-{3:3}, at: vma_start_write+0x2d/0x3f but task is already holding lock: (&mapping->i_mmap_rwsem){+.+.}-{3:3}, at: mmap_region+0x4dc/0x6db Invert those to the usual ordering. Fixes: 33313a747e81 ("mm: lock newly mapped VMA which can be modified after it becomes visible") Cc: stable@vger.kernel.org Signed-off-by: Hugh Dickins <hughd@google.com> Tested-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-09Merge tag 'mm-hotfixes-stable-2023-07-08-10-43' of ↵Linus Torvalds18-44/+99
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "16 hotfixes. Six are cc:stable and the remainder address post-6.4 issues" The merge undoes the disabling of the CONFIG_PER_VMA_LOCK feature, since it was all hopefully fixed in mainline. * tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: lib: dhry: fix sleeping allocations inside non-preemptable section kasan, slub: fix HW_TAGS zeroing with slub_debug kasan: fix type cast in memory_is_poisoned_n mailmap: add entries for Heiko Stuebner mailmap: update manpage link bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page MAINTAINERS: add linux-next info mailmap: add Markus Schneider-Pargmann writeback: account the number of pages written back mm: call arch_swap_restore() from do_swap_page() squashfs: fix cache race with migration mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparison docs: update ocfs2-devel mailing list address MAINTAINERS: update ocfs2-devel mailing list address mm: disable CONFIG_PER_VMA_LOCK until its fixed fork: lock VMAs of the parent process when forking
2023-07-09fork: lock VMAs of the parent process when forkingSuren Baghdasaryan1-0/+1
When forking a child process, the parent write-protects anonymous pages and COW-shares them with the child being forked using copy_present_pte(). We must not take any concurrent page faults on the source vma's as they are being processed, as we expect both the vma and the pte's behind it to be stable. For example, the anon_vma_fork() expects the parents vma->anon_vma to not change during the vma copy. A concurrent page fault on a page newly marked read-only by the page copy might trigger wp_page_copy() and a anon_vma_prepare(vma) on the source vma, defeating the anon_vma_clone() that wasn't done because the parent vma originally didn't have an anon_vma, but we now might end up copying a pte entry for a page that has one. Before the per-vma lock based changes, the mmap_lock guaranteed exclusion with concurrent page faults. But now we need to do a vma_start_write() to make sure no concurrent faults happen on this vma while it is being processed. This fix can potentially regress some fork-heavy workloads. Kernel build time did not show noticeable regression on a 56-core machine while a stress test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~5% regression. If such fork time regression is unacceptable, disabling CONFIG_PER_VMA_LOCK should restore its performance. Further optimizations are possible if this regression proves to be problematic. Suggested-by: David Hildenbrand <david@redhat.com> Reported-by: Jiri Slaby <jirislaby@kernel.org> Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/ Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com> Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/ Reported-by: Jacob Young <jacobly.alt@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624 Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first") Cc: stable@vger.kernel.org Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-09mm: lock newly mapped VMA which can be modified after it becomes visibleSuren Baghdasaryan1-0/+2
mmap_region adds a newly created VMA into VMA tree and might modify it afterwards before dropping the mmap_lock. This poses a problem for page faults handled under per-VMA locks because they don't take the mmap_lock and can stumble on this VMA while it's still being modified. Currently this does not pose a problem since post-addition modifications are done only for file-backed VMAs, which are not handled under per-VMA lock. However, once support for handling file-backed page faults with per-VMA locks is added, this will become a race. Fix this by write-locking the VMA before inserting it into the VMA tree. Other places where a new VMA is added into VMA tree do not modify it after the insertion, so do not need the same locking. Cc: stable@vger.kernel.org Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>