summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-09-03xfs: don't bother returning errors from xfs_file_releaseChristoph Hellwig1-9/+9
While ->release returns int, the only caller ignores the return value. As we're only doing cleanup work there isn't much of a point in return a value to start with, so just document the situation instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2024-09-03xfs: refactor f_op->release handlingChristoph Hellwig3-83/+68
Currently f_op->release is split in not very obvious ways. Fix that by folding xfs_release into xfs_file_release. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2024-09-03xfs: remove the i_mode check in xfs_releaseChristoph Hellwig1-3/+0
xfs_release is only called from xfs_file_release, which is wired up as the f_op->release handler for regular files only. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2024-09-03Merge tag 'btree-cleanups-6.12_2024-09-02' of ↵Chandan Babu R19-154/+237
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.12-mergeA xfs: cleanups for inode rooted btree code [v4.2 8/8] This series prepares the btree code to support realtime reverse mapping btrees by refactoring xfs_ifork_realloc to be fed a per-btree ops structure so that it can handle multiple types of inode-rooted btrees. It moves on to refactoring the btree code to use the new realloc routines. With a bit of luck, this should all go splendidly. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> * tag 'btree-cleanups-6.12_2024-09-02' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: standardize the btree maxrecs function parameters xfs: replace shouty XFS_BM{BT,DR} macros
2024-09-03Merge tag 'xfs-fixes-6.12_2024-09-02' of ↵Chandan Babu R3-8/+9
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.12-mergeA xfs: various bug fixes for 6.12 [7/8] Various bug fixes for 6.12. With a bit of luck, this should all go splendidly. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> * tag 'xfs-fixes-6.12_2024-09-02' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: fix a sloppy memory handling bug in xfs_iroot_realloc xfs: fix FITRIM reporting again xfs: fix C++ compilation errors in xfs_fs.h
2024-09-03Merge tag 'quota-cleanups-6.12_2024-09-02' of ↵Chandan Babu R4-36/+81
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.12-mergeA xfs: cleanups for quota mount [v4.2 6/8] Refactor the quota file loading code in preparation for adding metadata directory trees. Did you know that quotarm works even when quota isn't active? With a bit of luck, this should all go splendidly. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> * tag 'quota-cleanups-6.12_2024-09-02' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: refactor loading quota inodes in the regular case
2024-09-03Merge tag 'rtalloc-cleanups-6.12_2024-09-02' of ↵Chandan Babu R14-490/+478
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.12-mergeA xfs: cleanups for the realtime allocator [v4.2 5/8] This third series cleans up the realtime allocator code so that it'll be somewhat less difficult to figure out what on earth it's doing. We also rearrange the fsmap code a bit. With a bit of luck, this should all go splendidly. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> * tag 'rtalloc-cleanups-6.12_2024-09-02' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: move xfs_ioc_getfsmap out of xfs_ioctl.c xfs: rearrange xfs_fsmap.c a little bit xfs: replace m_rsumsize with m_rsumblocks xfs: remove xfs_{rtbitmap,rtsummary}_wordcount xfs: add xchk_setup_nothing and xchk_nothing helpers xfs: make the rtalloc start hint a xfs_rtblock_t xfs: factor out a xfs_rtallocate_align helper xfs: rework the rtalloc fallback handling xfs: factor out a xfs_rtallocate helper xfs: clean up the ISVALID macro in xfs_bmap_adjacent
2024-09-03Merge tag 'rtalloc-fixes-6.12_2024-09-02' of ↵Chandan Babu R7-133/+124
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.12-mergeA xfs: fixes for the realtime allocator [v4.2 4/8] While I was reviewing how to integrate realtime allocation groups with the rt allocator, I noticed several bugs in the existing allocation code with regards to calculating the maximum range of rtx to scan for free space. This series fixes those range bugs and cleans up a few things too. I also added a few cleanups from Christoph. With a bit of luck, this should all go splendidly. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> * tag 'rtalloc-fixes-6.12_2024-09-02' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: simplify xfs_rtalloc_query_range xfs: remove xfs_rtb_to_rtxrem xfs: fix broken variable-sized allocation detection in xfs_rtallocate_extent_block xfs: reduce excessive clamping of maxlen in xfs_rtallocate_extent_near xfs: clean up xfs_rtallocate_extent_exact a bit xfs: refactor aligning bestlen to prod xfs: don't scan off the end of the rt volume in xfs_rtallocate_extent_block xfs: don't return too-short extents from xfs_rtallocate_extent_block xfs: ensure rtx mask/shift are correct after growfs xfs: use the recalculated transaction reservation in xfs_growfs_rt_bmblock
2024-09-03Merge tag 'rtbitmap-cleanups-6.12_2024-09-02' of ↵Chandan Babu R7-402/+438
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.12-mergeA xfs: clean up the rtbitmap code [v4.2 3/8] Here are some cleanups and reorganization of the realtime bitmap code to share more of that code between userspace and the kernel. With a bit of luck, this should all go splendidly. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> * tag 'rtbitmap-cleanups-6.12_2024-09-02' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: push transaction join out of xfs_rtbitmap_lock and xfs_rtgroup_lock xfs: factor out rtbitmap/summary initialization helpers xfs: factor out a xfs_last_rt_bmblock helper xfs: factor out a xfs_growfs_rt_bmblock helper xfs: push the calls to xfs_rtallocate_range out to xfs_bmap_rtalloc xfs: cleanup the calling convention for xfs_rtpick_extent xfs: add bounds checking to xfs_rt{bitmap,summary}_read_buf xfs: assert a valid limit in xfs_rtfind_forw xfs: remove the limit argument to xfs_rtfind_back xfs: make the RT rsum_cache mandatory xfs: factor out a xfs_validate_rt_geometry helper xfs: remove xfs_validate_rtextents
2024-09-03Merge tag 'metadir-cleanups-6.12_2024-09-02' of ↵Chandan Babu R8-12/+16
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.12-mergeA xfs: cleanups before adding metadata directories [v4.2 2/8] Before we start adding code for metadata directory trees, let's clean up some warts in the realtime bitmap code and the inode allocator code. With a bit of luck, this should all go splendidly. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> * tag 'metadir-cleanups-6.12_2024-09-02' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: pass the icreate args object to xfs_dialloc xfs: match on the global RT inode numbers in xfs_is_metadata_inode xfs: validate inumber in xfs_iget
2024-09-03Merge tag 'atomic-file-commits-6.12_2024-09-02' of ↵Chandan Babu R5-3/+243
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.12-mergeA xfs: atomic file content commits [v31.1 1/8] This series creates XFS_IOC_START_COMMIT and XFS_IOC_COMMIT_RANGE ioctls to perform the exchange only if the target file has not been changed since a given sampling point. This new functionality uses the mechanism underlying EXCHANGE_RANGE to stage and commit file updates such that reader programs will see either the old contents or the new contents in their entirety, with no chance of torn writes. A successful call completion guarantees that the new contents will be seen even if the system fails. The pair of ioctls allows userspace to perform what amounts to a compare and exchange operation on entire file contents. Note that there are ongoing arguments in the community about how best to implement some sort of file data write counter that nfsd could also use to signal invalidations to clients. Until such a thing is implemented, this patch will rely on ctime/mtime updates. Here are the proposed manual pages: IOCTL-XFS-COMMIT-RANGE(2) System Calls ManualIOCTL-XFS-COMMIT-RANGE(2) NAME ioctl_xfs_start_commit - prepare to exchange the contents of two files ioctl_xfs_commit_range - conditionally exchange the contents of parts of two files SYNOPSIS #include <sys/ioctl.h> #include <xfs/xfs_fs.h> int ioctl(int file2_fd, XFS_IOC_START_COMMIT, struct xfs_com‐ mit_range *arg); int ioctl(int file2_fd, XFS_IOC_COMMIT_RANGE, struct xfs_com‐ mit_range *arg); DESCRIPTION Given a range of bytes in a first file file1_fd and a second range of bytes in a second file file2_fd, this ioctl(2) ex‐ changes the contents of the two ranges if file2_fd passes cer‐ tain freshness criteria. Before exchanging the contents, the program must call the XFS_IOC_START_COMMIT ioctl to sample freshness data for file2_fd. If the sampled metadata does not match the file metadata at commit time, XFS_IOC_COMMIT_RANGE will return EBUSY. Exchanges are atomic with regards to concurrent file opera‐ tions. Implementations must guarantee that readers see either the old contents or the new contents in their entirety, even if the system fails. The system call parameters are conveyed in structures of the following form: struct xfs_commit_range { __s32 file1_fd; __u32 pad; __u64 file1_offset; __u64 file2_offset; __u64 length; __u64 flags; __u64 file2_freshness[5]; }; The field pad must be zero. The fields file1_fd, file1_offset, and length define the first range of bytes to be exchanged. The fields file2_fd, file2_offset, and length define the second range of bytes to be exchanged. The field file2_freshness is an opaque field whose contents are determined by the kernel. These file attributes are used to confirm that file2_fd has not changed by another thread since the current thread began staging its own update. Both files must be from the same filesystem mount. If the two file descriptors represent the same file, the byte ranges must not overlap. Most disk-based filesystems require that the starts of both ranges must be aligned to the file block size. If this is the case, the ends of the ranges must also be so aligned unless the XFS_EXCHANGE_RANGE_TO_EOF flag is set. The field flags control the behavior of the exchange operation. XFS_EXCHANGE_RANGE_TO_EOF Ignore the length parameter. All bytes in file1_fd from file1_offset to EOF are moved to file2_fd, and file2's size is set to (file2_offset+(file1_length- file1_offset)). Meanwhile, all bytes in file2 from file2_offset to EOF are moved to file1 and file1's size is set to (file1_offset+(file2_length- file2_offset)). XFS_EXCHANGE_RANGE_DSYNC Ensure that all modified in-core data in both file ranges and all metadata updates pertaining to the exchange operation are flushed to persistent storage before the call returns. Opening either file de‐ scriptor with O_SYNC or O_DSYNC will have the same effect. XFS_EXCHANGE_RANGE_FILE1_WRITTEN Only exchange sub-ranges of file1_fd that are known to contain data written by application software. Each sub-range may be expanded (both upwards and downwards) to align with the file allocation unit. For files on the data device, this is one filesystem block. For files on the realtime device, this is the realtime extent size. This facility can be used to implement fast atomic scatter-gather writes of any complexity for software-defined storage targets if all writes are aligned to the file allocation unit. XFS_EXCHANGE_RANGE_DRY_RUN Check the parameters and the feasibility of the op‐ eration, but do not change anything. RETURN VALUE On error, -1 is returned, and errno is set to indicate the er‐ ror. ERRORS Error codes can be one of, but are not limited to, the follow‐ ing: EBADF file1_fd is not open for reading and writing or is open for append-only writes; or file2_fd is not open for reading and writing or is open for append-only writes. EBUSY The file2 inode number and timestamps supplied do not match file2_fd. EINVAL The parameters are not correct for these files. This error can also appear if either file descriptor repre‐ sents a device, FIFO, or socket. Disk filesystems gen‐ erally require the offset and length arguments to be aligned to the fundamental block sizes of both files. EIO An I/O error occurred. EISDIR One of the files is a directory. ENOMEM The kernel was unable to allocate sufficient memory to perform the operation. ENOSPC There is not enough free space in the filesystem ex‐ change the contents safely. EOPNOTSUPP The filesystem does not support exchanging bytes between the two files. EPERM file1_fd or file2_fd are immutable. ETXTBSY One of the files is a swap file. EUCLEAN The filesystem is corrupt. EXDEV file1_fd and file2_fd are not on the same mounted filesystem. CONFORMING TO This API is XFS-specific. USE CASES Several use cases are imagined for this system call. Coordina‐ tion between multiple threads is performed by the kernel. The first is a filesystem defragmenter, which copies the con‐ tents of a file into another file and wishes to exchange the space mappings of the two files, provided that the original file has not changed. An example program might look like this: int fd = open("/some/file", O_RDWR); int temp_fd = open("/some", O_TMPFILE | O_RDWR); struct stat sb; struct xfs_commit_range args = { .flags = XFS_EXCHANGE_RANGE_TO_EOF, }; /* gather file2's freshness information */ ioctl(fd, XFS_IOC_START_COMMIT, &args); fstat(fd, &sb); /* make a fresh copy of the file with terrible alignment to avoid reflink */ clone_file_range(fd, NULL, temp_fd, NULL, 1, 0); clone_file_range(fd, NULL, temp_fd, NULL, sb.st_size - 1, 0); /* commit the entire update */ args.file1_fd = temp_fd; ret = ioctl(fd, XFS_IOC_COMMIT_RANGE, &args); if (ret && errno == EBUSY) printf("file changed while defrag was underway "); The second is a data storage program that wants to commit non- contiguous updates to a file atomically. This program cannot coordinate updates to the file and therefore relies on the ker‐ nel to reject the COMMIT_RANGE command if the file has been up‐ dated by someone else. This can be done by creating a tempo‐ rary file, calling FICLONE(2) to share the contents, and stag‐ ing the updates into the temporary file. The FULL_FILES flag is recommended for this purpose. The temporary file can be deleted or punched out afterwards. An example program might look like this: int fd = open("/some/file", O_RDWR); int temp_fd = open("/some", O_TMPFILE | O_RDWR); struct xfs_commit_range args = { .flags = XFS_EXCHANGE_RANGE_TO_EOF, }; /* gather file2's freshness information */ ioctl(fd, XFS_IOC_START_COMMIT, &args); ioctl(temp_fd, FICLONE, fd); /* append 1MB of records */ lseek(temp_fd, 0, SEEK_END); write(temp_fd, data1, 1000000); /* update record index */ pwrite(temp_fd, data1, 600, 98765); pwrite(temp_fd, data2, 320, 54321); pwrite(temp_fd, data2, 15, 0); /* commit the entire update */ args.file1_fd = temp_fd; ret = ioctl(fd, XFS_IOC_COMMIT_RANGE, &args); if (ret && errno == EBUSY) printf("file changed before commit; will roll back "); NOTES Some filesystems may limit the amount of data or the number of extents that can be exchanged in a single call. SEE ALSO ioctl(2) XFS 2024-02-18 IOCTL-XFS-COMMIT-RANGE(2) With a bit of luck, this should all go splendidly. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> * tag 'atomic-file-commits-6.12_2024-09-02' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: introduce new file range commit ioctls
2024-09-01xfs: standardize the btree maxrecs function parametersDarrick J. Wong14-33/+40
Standardize the parameters in xfs_{alloc,bm,ino,rmap,refcount}bt_maxrecs so that we have consistent calling conventions. This doesn't affect the kernel that much, but enables us to clean up userspace a bit. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: fix a sloppy memory handling bug in xfs_iroot_reallocDarrick J. Wong1-5/+5
While refactoring code, I noticed that when xfs_iroot_realloc tries to shrink a bmbt root block, it allocates a smaller new block and then copies "records" and pointers to the new block. However, bmbt root blocks cannot ever be leaves, which means that it's not technically correct to copy records. We /should/ be copying keys. Note that this has never resulted in actual memory corruption because sizeof(bmbt_rec) == (sizeof(bmbt_key) + sizeof(bmbt_ptr)). However, this will no longer be true when we start adding realtime rmap stuff, so fix this now. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: refactor loading quota inodes in the regular caseDarrick J. Wong4-36/+81
Create a helper function to load quota inodes in the case where the dqtype and the sb quota inode fields correspond. This is true for nearly all the iget callsites in the quota code, except for when we're switching the group and project quota inodes. We'll need this in subsequent patches to make the metadir handling less convoluted. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: replace shouty XFS_BM{BT,DR} macrosDarrick J. Wong9-122/+198
Replace all the shouty bmap btree and bmap disk root macros with actual functions. sed \ -e 's/XFS_BMBT_BLOCK_LEN/xfs_bmbt_block_len/g' \ -e 's/XFS_BMBT_REC_ADDR/xfs_bmbt_rec_addr/g' \ -e 's/XFS_BMBT_KEY_ADDR/xfs_bmbt_key_addr/g' \ -e 's/XFS_BMBT_PTR_ADDR/xfs_bmbt_ptr_addr/g' \ -e 's/XFS_BMDR_REC_ADDR/xfs_bmdr_rec_addr/g' \ -e 's/XFS_BMDR_KEY_ADDR/xfs_bmdr_key_addr/g' \ -e 's/XFS_BMDR_PTR_ADDR/xfs_bmdr_ptr_addr/g' \ -e 's/XFS_BMAP_BROOT_PTR_ADDR/xfs_bmap_broot_ptr_addr/g' \ -e 's/XFS_BMAP_BROOT_SPACE_CALC/xfs_bmap_broot_space_calc/g' \ -e 's/XFS_BMAP_BROOT_SPACE/xfs_bmap_broot_space/g' \ -e 's/XFS_BMDR_SPACE_CALC/xfs_bmdr_space_calc/g' \ -e 's/XFS_BMAP_BMDR_SPACE/xfs_bmap_bmdr_space/g' \ -i $(git ls-files fs/xfs/*.[ch] fs/xfs/libxfs/*.[ch] fs/xfs/scrub/*.[ch]) Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: fix FITRIM reporting againDarrick J. Wong1-1/+1
Don't report FITRIMming more bytes than possibly exist in the filesystem. Fixes: 410e8a18f8e93 ("xfs: don't bother reporting blocks trimmed via FITRIM") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: fix C++ compilation errors in xfs_fs.hDarrick J. Wong1-2/+3
Several people reported C++ compilation errors due to things that C compilers allow but C++ compilers do not. Fix both of these problems, and hope there aren't more of these brown paper bags in 2 months when we finally get these fixes through the process into a released xfsprogs. NOTE: I am submitting this bugfix over the objections of a former maintainer, who insists that we should remove this function from the published userspace ABI instead of fixing the C++ compilation errors. No deprecation period, no discussion, just a hard drop of an already provided and correct C function, which would be in contravention of Linus' rules. IOWs, removing ABI that have already shipped in a released kernel requires a careful deprecation period, so I will let that maintainer run that process. Reported-by: kernel@mattwhitlock.name Reported-by: sam@gentoo.org Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219203 Fixes: 233f4e12bbb2c ("xfs: add parent pointer ioctls") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: move xfs_ioc_getfsmap out of xfs_ioctl.cDarrick J. Wong3-136/+134
Move this function out of xfs_ioctl.c to reduce the clutter in there, and make the entire getfsmap implementation self-contained in a single file. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: simplify xfs_rtalloc_query_rangeChristoph Hellwig4-41/+30
There isn't much of a good reason to pass the xfs_rtalloc_rec structures that describe extents to xfs_rtalloc_query_range as we really just want a lower and upper bound xfs_rtxnum_t. Pass the rtxnum directly and simply the interface. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: push transaction join out of xfs_rtbitmap_lock and xfs_rtgroup_lockChristoph Hellwig4-16/+23
To prepare for being able to join an already locked rtbitmap inode to a transaction split out separate helpers for joining the transaction from the locking helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: pass the icreate args object to xfs_diallocDarrick J. Wong6-8/+11
Pass the xfs_icreate_args object to xfs_dialloc since we can extract the relevant mode (really just the file type) and parent inumber from there. This simplifies the calling convention in preparation for the next patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: introduce new file range commit ioctlsDarrick J. Wong5-3/+243
This patch introduces two more new ioctls to manage atomic updates to file contents -- XFS_IOC_START_COMMIT and XFS_IOC_COMMIT_RANGE. The commit mechanism here is exactly the same as what XFS_IOC_EXCHANGE_RANGE does, but with the additional requirement that file2 cannot have changed since some sampling point. The start-commit ioctl performs the sampling of file attributes. Note: This patch currently samples i_ctime during START_COMMIT and checks that it hasn't changed during COMMIT_RANGE. This isn't entirely safe in kernels prior to 6.12 because ctime only had coarse grained granularity and very fast updates could collide with a COMMIT_RANGE. With the multi-granularity ctime introduced by Jeff Layton, it's now possible to update ctime such that this does not happen. It is critical, then, that this patch must not be backported to any kernel that does not support fine-grained file change timestamps. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: rearrange xfs_fsmap.c a little bitDarrick J. Wong1-134/+134
The order of the functions in this file has gotten a little confusing over the years. Specifically, the two data device implementations (bnobt and rmapbt) could be adjacent in the source code instead of split in two by the logdev and rtdev fsmap implementations. We're about to add more functionality to this file, so rearrange things now. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: remove xfs_rtb_to_rtxremChristoph Hellwig2-23/+4
Simplify the number of block number conversion helpers by removing xfs_rtb_to_rtxrem. Any recent compiler is smart enough to eliminate the double divisions if using separate xfs_rtb_to_rtx and xfs_rtb_to_rtxoff calls. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: factor out rtbitmap/summary initialization helpersChristoph Hellwig3-117/+133
Add helpers to libxfs that can be shared by growfs and mkfs for initializing the rtbitmap and summary, and by passing the optional data pointer also by repair for rebuilding them. This will become even more useful when the rtgroups feature adds a metadata header to each block, which means even more shared code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> [djwong: minor documentation and data advance tweaks] Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: match on the global RT inode numbers in xfs_is_metadata_inodeChristoph Hellwig1-3/+4
Match the inode number instead of the inode pointers, as the inode pointers in the superblock will go away soon. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> [djwong: port to my tree, make the parameter a const pointer] Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: replace m_rsumsize with m_rsumblocksChristoph Hellwig7-25/+19
Track the RT summary file size in blocks, just like the RT bitmap file. While we have users of both units, blocks are used slightly more often and this matches the bitmap file for consistency. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: fix broken variable-sized allocation detection in ↵Darrick J. Wong1-6/+9
xfs_rtallocate_extent_block This function tries to find a suitable free space extent starting from a particular rtbitmap block. Some time ago, I added a clamping function to prevent the free space scans from running off the end of the bitmap, but I didn't quite get the logic right. Let's say there's an allocation request with a minlen of 5 and a maxlen of 32 and we're scanning the last rtbitmap block. If we come within 4 rtx of the end of the rt volume, maxlen will get clamped to 4. If the next 3 rtx are free, we could have satisfied the allocation, but the code setting partial besti/bestlen for "minlen < maxlen" will think that we're doing a non-variable allocation and ignore it. The root of this problem is overwriting maxlen; I should have stuffed the results in a different variable, which would not have introduced this bug. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: factor out a xfs_last_rt_bmblock helperChristoph Hellwig1-10/+19
Add helper to calculate the last currently used rt bitmap block to better structure the growfs code and prepare for future changes to it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: validate inumber in xfs_igetDarrick J. Wong1-1/+1
Actually use the inumber validator to check the argument passed in here. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2024-09-01xfs: remove xfs_{rtbitmap,rtsummary}_wordcountChristoph Hellwig2-38/+0
xfs_rtbitmap_wordcount and xfs_rtsummary_wordcount are currently unused, so remove them to simplify refactoring other rtbitmap helpers. They can be added back or simply open coded when actually needed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: reduce excessive clamping of maxlen in xfs_rtallocate_extent_nearDarrick J. Wong1-11/+12
The near rt allocator employs two allocation strategies -- first it tries to allocate at exactly @start. If that fails, it will pivot back and forth around that starting point looking for an appropriately sized free space. However, I clamped maxlen ages ago to prevent the exact allocation scan from running off the end of the rt volume. This, I realize, was excessive. If the allocation request is (say) for 32 rtx but the start position is 5 rtx from the end of the volume, we clamp maxlen to 5. If the exact allocation fails, we then pivot back and forth looking for 5 rtx, even though the original intent was to try to get 32 rtx. If we then find 5 rtx when we could have gotten 32 rtx, we've not done as well as we could have. This may be moot if the caller immediately comes back for more space, but it might not be. Either way, we can do better here. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: factor out a xfs_growfs_rt_bmblock helperChristoph Hellwig1-159/+158
Add a helper to contain the per-rtbitmap block logic in xfs_growfs_rt. Note that this helper now allocates a new fake mount structure for each rtbitmap block iteration instead of reusing the memory for an entire growfs call. Compared to all the other work done when freeing the blocks the overhead for this is in the noise and it keeps the code nicely modular. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: add xchk_setup_nothing and xchk_nothing helpersDarrick J. Wong2-40/+18
Add common helpers for no-op scrubbing methods. Signed-off-by: Darrick J. Wong <djwong@kernel.org> [hch: split from a larger patch] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: clean up xfs_rtallocate_extent_exact a bitDarrick J. Wong1-20/+21
Before we start doing more surgery on the rt allocator, let's clean up the exact allocator so that it doesn't change its arguments and uses the helper introduced in the previous patch. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: push the calls to xfs_rtallocate_range out to xfs_bmap_rtallocChristoph Hellwig1-20/+18
Currently the various low-level RT allocator functions call into xfs_rtallocate_range directly, which ties them into the locking protocol for the RT bitmap. As these helpers already return the allocated range, lift the call to xfs_rtallocate_range into xfs_bmap_rtalloc so that it happens as high as possible in the stack, which will simplify future changes to the locking protocol. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: make the rtalloc start hint a xfs_rtblock_tChristoph Hellwig1-7/+11
0 is a valid start RT extent, and with pending changes it will become both more common and non-unique. Switch to pass a xfs_rtblock_t instead so that we can use NULLRTBLOCK to determine if a hint was set or not. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: refactor aligning bestlen to prodDarrick J. Wong1-11/+15
There are two places in xfs_rtalloc.c where we want to make sure that a count of rt extents is aligned with a particular prod(uct) factor. In one spot, we actually use rounddown(), albeit unnecessarily if prod < 2. In the other case, we open-code this rounding inefficiently by promoting the 32-bit length value to a 64-bit value and then performing a 64-bit division to figure out the subtraction. Refactor this into a single helper that uses the correct types and division method for the type, and skips the division entirely unless prod is large enough to make a difference. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: cleanup the calling convention for xfs_rtpick_extentChristoph Hellwig1-8/+4
xfs_rtpick_extent never returns an error. Do away with the error return and directly return the picked extent instead of doing that through a call by reference argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: factor out a xfs_rtallocate_align helperChristoph Hellwig1-34/+59
Split the code to calculate the aligned allocation request from xfs_bmap_rtalloc into a separate self-contained helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: don't scan off the end of the rt volume in xfs_rtallocate_extent_blockDarrick J. Wong1-9/+7
The loop conditional here is not quite correct because an rtbitmap block can represent rtextents beyond the end of the rt volume. There's no way that it makes sense to scan for free space beyond EOFS, so don't do it. This overrun has been present since v2.6.0. Also fix the type of bestlen, which was incorrectly converted. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: add bounds checking to xfs_rt{bitmap,summary}_read_bufChristoph Hellwig2-21/+32
Add a corruption check for passing an invalid block number, which is a lot easier to understand than the xfs_bmapi_read failure later on. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: rework the rtalloc fallback handlingChristoph Hellwig1-35/+34
xfs_rtallocate currently has two fallbacks, when an allocation fails: 1) drop the requested extent size alignment, if any, and retry 2) ignore the locality hint Oddly enough it does those in order, as trying a different location is more in line with what the user asked for, and does it in a very unstructured way. Lift the fallback to try to allocate without the locality hint into xfs_rtallocate to both perform them in a more sensible order and to clean up the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: don't return too-short extents from xfs_rtallocate_extent_blockDarrick J. Wong1-10/+11
If xfs_rtallocate_extent_block is asked for a variable-sized allocation, it will try to return the best-sized free extent, which is apparently the largest one that it finds starting in this rtbitmap block. It will then trim the size of the extent as needed to align it with prod. However, it misses one thing -- rounding down the best-fit candidate to the required alignment could make the extent shorter than minlen. In the case where minlen > 1, we'd rather the caller relaxed its alignment requirements and tried again, as the allocator already supports that. Returning a too-short extent that causes xfs_bmapi_write to return ENOSR if there aren't enough nmaps to handle multiple new allocations, which can then cause filesystem shutdowns. I haven't seen this happen on any production systems, but then I don't think it's very common to set a per-file extent size hint on realtime files. I tripped it while working on the rtgroups feature and pounding on the realtime allocator enthusiastically. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-09-01xfs: assert a valid limit in xfs_rtfind_forwChristoph Hellwig1-0/+2
Protect against developers passing stupid limits when refactoring the RT code once again. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: factor out a xfs_rtallocate helperChristoph Hellwig1-31/+50
Split out a helper from xfs_rtallocate that performs the actual allocation. This keeps the scope of the xfs_rtalloc_args structure contained, and prepares for rtgroups support. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: ensure rtx mask/shift are correct after growfsChristoph Hellwig3-4/+15
When growfs sets an extent size, it doesn't updated the m_rtxblklog and m_rtxblkmask values, which could lead to incorrect usage of them if they were set before and can't be used for the new extent size. Add a xfs_mount_sb_set_rextsize helper that updates the two fields, and also use it when calculating the new RT geometry instead of disabling the optimization there. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: remove the limit argument to xfs_rtfind_backChristoph Hellwig3-7/+6
All callers pass a 0 limit to xfs_rtfind_back, so remove the argument and hard code it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: clean up the ISVALID macro in xfs_bmap_adjacentChristoph Hellwig1-24/+33
Turn the ISVALID macro defined and used inside in xfs_bmap_adjacent that relies on implict context into a proper inline function. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-09-01xfs: use the recalculated transaction reservation in xfs_growfs_rt_bmblockChristoph Hellwig1-3/+5
After going great length to calculate the transaction reservation for the new geometry, we should also use it to allocate the transaction it was calculated for. Fixes: 578bd4ce7100 ("xfs: recompute growfsrtfree transaction reservation while growing rt volume") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>