summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2021-08-21Merge tag 'locks-v5.14' of ↵Linus Torvalds1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull mandatory file locking deprecation warning from Jeff Layton: "As discussed on the list, this patch just adds a new warning for folks who still have mandatory locking enabled and actually mount with '-o mand'. I'd like to get this in for v5.14 so we can push this out into stable kernels and hopefully reach folks who have mounts with -o mand. For now, I'm operating under the assumption that we'll fully remove this support in v5.15, but we can move that out if any legitimate users of this facility speak up between now and then" * tag 'locks-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: fs: warn about impending deprecation of mandatory locks
2021-08-21lockd: lockd server-side shouldn't set fl_opsJ. Bruce Fields1-18/+12
Locks have two sets of op arrays, fl_lmops for the lock manager (lockd or nfsd), fl_ops for the filesystem. The server-side lockd code has been setting its own fl_ops, which leads to confusion (and crashes) in the reexport case, where the filesystem expects to be the only one setting fl_ops. And there's no reason for it that I can see-the lm_get/put_owner ops do the same job. Reported-by: Daire Byrne <daire@dneg.com> Tested-by: Daire Byrne <daire@dneg.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-21Merge tag 'io_uring-5.14-2021-08-20' of git://git.kernel.dk/linux-blockLinus Torvalds1-6/+10
Pull io_uring fixes from Jens Axboe: "A few small fixes that should go into this release: - Fix never re-assigning an initial error value for io_uring_enter() for SQPOLL, if asked to do nothing - Fix xa_alloc_cycle() return value checking, for cases where we have wrapped around - Fix for a ctx pin issue introduced in this cycle (Pavel)" * tag 'io_uring-5.14-2021-08-20' of git://git.kernel.dk/linux-block: io_uring: fix xa_alloc_cycle() error return value check io_uring: pin ctx on fallback execution io_uring: only assign io_uring_enter() SQPOLL error in actual error case
2021-08-21ksmbd: fix permission check issue on chown and chmodNamjae Jeon2-6/+23
When commanding chmod and chown on cifs&ksmbd, ksmbd allows it without file permissions check. There is code to check it in settattr_prepare. Instead of setting the inode directly, update the mode and uid/gid through notify_change. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-08-21fs: warn about impending deprecation of mandatory locksJeff Layton1-1/+5
We've had CONFIG_MANDATORY_FILE_LOCKING since 2015 and a lot of distros have disabled it. Warn the stragglers that still use "-o mand" that we'll be dropping support for that mount option. Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton <jlayton@kernel.org>
2021-08-20io_uring: fix xa_alloc_cycle() error return value checkJens Axboe1-4/+5
We currently check for ret != 0 to indicate error, but '1' is a valid return and just indicates that the allocation succeeded with a wrap. Correct the check to be for < 0, like it was before the xarray conversion. Cc: stable@vger.kernel.org Fixes: 61cf93700fe6 ("io_uring: Convert personality_idr to XArray") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-20xfs: fix perag structure refcounting error when scrub failsDarrick J. Wong4-4/+7
The kernel test robot found the following bug when running xfs/355 to scrub a bmap btree: XFS: Assertion failed: !sa->pag, file: fs/xfs/scrub/common.c, line: 412 ------------[ cut here ]------------ kernel BUG at fs/xfs/xfs_message.c:110! invalid opcode: 0000 [#1] SMP PTI CPU: 2 PID: 1415 Comm: xfs_scrub Not tainted 5.14.0-rc4-00021-g48c6615cc557 #1 Hardware name: Hewlett-Packard p6-1451cx/2ADA, BIOS 8.15 02/05/2013 RIP: 0010:assfail+0x23/0x28 [xfs] RSP: 0018:ffffc9000aacb890 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffffc9000aacbcc8 RCX: 0000000000000000 RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffc09e7dcd RBP: ffffc9000aacbc80 R08: ffff8881fdf17d50 R09: 0000000000000000 R10: 000000000000000a R11: f000000000000000 R12: 0000000000000000 R13: ffff88820c7ed000 R14: 0000000000000001 R15: ffffc9000aacb980 FS: 00007f185b955700(0000) GS:ffff8881fdf00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7f6ef43000 CR3: 000000020de38002 CR4: 00000000001706e0 Call Trace: xchk_ag_read_headers+0xda/0x100 [xfs] xchk_ag_init+0x15/0x40 [xfs] xchk_btree_check_block_owner+0x76/0x180 [xfs] xchk_btree_get_block+0xd0/0x140 [xfs] xchk_btree+0x32e/0x440 [xfs] xchk_bmap_btree+0xd4/0x140 [xfs] xchk_bmap+0x1eb/0x3c0 [xfs] xfs_scrub_metadata+0x227/0x4c0 [xfs] xfs_ioc_scrub_metadata+0x50/0xc0 [xfs] xfs_file_ioctl+0x90c/0xc40 [xfs] __x64_sys_ioctl+0x83/0xc0 do_syscall_64+0x3b/0xc0 The unusual handling of errors while initializing struct xchk_ag is the root cause here. Since the beginning of xfs_scrub, the goal of xchk_ag_read_headers has been to read all three AG header buffers and attach them both to the xchk_ag structure and the scrub transaction. Corruption errors on any of the three headers doesn't necessarily trigger an immediate return to userspace, because xfs_scrub can also tell us to /fix/ the problem. In other words, it's possible for the xchk_ag init functions to return an error code and a partially filled out structure so that scrub can use however much information it managed to pull. Before 5.15, it was sufficient to cancel (or commit) the scrub transaction on the way out of the scrub code to release the buffers. Ccommit 48c6615cc557 added a reference to the perag structure to struct xchk_ag. Since perag structures are not attached to transactions like buffers are, this adds the requirement that the perag ref be released explicitly. The scrub teardown function xchk_teardown was amended to do this for the xchk_ag embedded in struct xfs_scrub. Unfortunately, I forgot that certain parts of the scrub code probe multiple AGs and therefore handle the initialization and cleanup on their own. Specifically, the bmbt scrubber will initialize it long enough to cross-reference AG metadata for btree blocks and for the extent mappings in the bmbt. If one of the AG headers is corrupt, the init function returns with a live perag structure reference and some of the AG header buffers. If an error occurs, the cross referencing will be noted as XCORRUPTion and skipped, but the main scrub process will move on to the next record. It is now necessary to release the perag reference before we try to analyze something from a different AG, or else we'll trip over the assertion noted above. Fixes: 48c6615cc557 ("xfs: grab active perag ref when reading AG headers") Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
2021-08-20erofs: support reading chunk-based uncompressed filesGao Xiang3-11/+102
Add runtime support for chunk-based uncompressed files described in the previous patch. Link: https://lore.kernel.org/r/20210820100019.208490-2-hsiangkao@linux.alibaba.com Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2021-08-20erofs: introduce chunk-based file on-disk formatGao Xiang1-2/+45
Currently, uncompressed data except for tail-packing inline is consecutive on disk. In order to support chunk-based data deduplication, add a new corresponding inode data layout. In the future, the data source of chunks can be either (un)compressed. Link: https://lore.kernel.org/r/20210820100019.208490-1-hsiangkao@linux.alibaba.com Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2021-08-20gfs2: Remove redundant check from gfs2_glock_dqBob Peterson1-6/+5
In function gfs2_glock_dq, it checks to see if this is the fast path. Before this patch, it checked both "find_first_holder(gl) == NULL" and list_empty(&gl->gl_holders), which is redundant. If gl_holders is empty then find_first_holder must return NULL. This patch removes the redundancy. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: Delay withdraw from atomic contextBob Peterson1-1/+1
Before this patch, if function __gfs2_ail_flush detected an error syncing the ail list, it call gfs2_ail_error which called gfs2_withdraw. Since __gfs2_ail_flush deals with a specific glock, we shouldn't withdraw immediately because the withdraw code (signal_our_withdraw) uses glocks in its processing. This patch changes the call from gfs2_withdraw to gfs2_withdraw_delayed which defers the withdraw until a more appropriate context, such as the logd daemon, discovers the intent to withdraw. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: Don't call dlm after protocol is unmountedBob Peterson1-0/+5
In the gfs2 withdraw sequence, the dlm protocol is unmounted with a call to lm_unmount. After a withdraw, users are allowed to unmount the withdrawn file system. But at that point we may still have glocks left over that we need to free via unmount's call to gfs2_gl_hash_clear. These glocks may have never been completed because of whatever problem caused the withdraw (IO errors or whatever). Before this patch, function gdlm_put_lock would still try to call into dlm to unlock these leftover glocks, which resulted in dlm returning -EINVAL because the lock space was abandoned. These glocks were never freed because there was no mechanism after that to free them. This patch adds a check to gdlm_put_lock to see if the locking protocol was inactive (DFL_UNMOUNT flag) and if so, free the glock and not make the invalid call into dlm. I could have combined this "if" with the one that follows, related to leftover glock LVBs, but I felt the code was more readable with its own if clause. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: don't stop reads while withdraw in progressBob Peterson2-4/+8
When gfs2 withdraws a file system, it calls signal_our_withdraw which triggers another node to replay the withdrawing node's journal. Then it waits until it knows the journal has been replayed. Part of this wait is to repeatedly call check_journal_clean which calls gfs2_jdesc_check, which checks to see if the journal is sane. As part of its sanity checks it needs to re-read its journal's metadata. But with today's code, any attempt to re-read the metadata results in -EIO because of a check for the file system withdraw in function gfs2_meta_wait. This patch adds an additional check for SDF_WITHDRAW_IN_PROG, to tell if the read is done while the withdraw is in progress. In that case we allow the metadata read to not be rejected. Therefore the metadata check is done properly, so the withdraw sequence can finish normally. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: Mark journal inodes as "don't cache"Bob Peterson2-0/+2
Before this patch, journal inodes were considered regular inodes, which meant that instead of evicting them, function iput_final would just put them on the lru for later processing. If the file system withdrew for whatever reason, the withdraw would never be seen until the inode was evicted, which could be indefinitely. This patch marks all journal inodes as "don't cache" which means function iput_final will evict them immediately, allowing us to properly recover the journal on other cluster nodes. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: nit: gfs2_drop_inode shouldn't return boolBob Peterson1-1/+1
Today, gfs2_drop_inode can return "false" for an int value. I'm sure this was just an oversight. Change to int value. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: Eliminate vestigial HIF_FIRSTBob Peterson2-3/+0
Holder flag HIF_FIRST is no longer used or needed, so remove it. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: Make recovery error more readableBob Peterson1-1/+1
Before this patch, withdraws could cause an error that looked like: Journal recovery skipped for 0 until next mount. This patch changes it to a more readable: Journal recovery skipped for jid 0 until next mount. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: Don't release and reacquire local statfs bhBob Peterson5-41/+25
Before this patch, several functions in gfs2 related to the updating of the statfs file used a newly acquired/read buffer_head for the local statfs file. This is completely unnecessary, because other nodes should never update it. Recreating the buffer is a waste of time. This patch allows gfs2 to read in the local statefs buffer_head at mount time and keep it around until unmount time. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20gfs2: init system threads before freeze lockBob Peterson2-55/+48
Patch 96b1454f2e ("gfs2: move freeze glock outside the make_fs_rw and _ro functions") changed the gfs2 mount sequence so that it holds the freeze lock before calling gfs2_make_fs_rw. Before this patch, gfs2_make_fs_rw called init_threads to initialize the quotad and logd threads. That is a problem if the system needs to withdraw due to IO errors early in the mount sequence, for example, while initializing the system statfs inode: 1. An IO error causes the statfs glock to not sync properly after recovery, and leaves items on the ail list. 2. The leftover items on the ail list causes its do_xmote call to fail, which makes it want to withdraw. But since the glock code cannot withdraw (because the withdraw sequence uses glocks) it relies upon the logd daemon to initiate the withdraw. 3. The withdraw can never be performed by the logd daemon because all this takes place before the logd daemon is started. This patch moves function init_threads from super.c to ops_fstype.c and it changes gfs2_fill_super to start its threads before holding the freeze lock, and if there's an error, stop its threads after releasing it. This allows the logd to run unblocked by the freeze lock. Thus, the logd daemon can perform its withdraw sequence properly. Fixes: 96b1454f2e8e ("gfs2: move freeze glock outside the make_fs_rw and _ro functions") Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-20ARM: 9108/1: oabi-compat: rework epoll_wait/epoll_pwait emulationArnd Bergmann1-3/+2
The epoll_wait() system call wrapper is one of the remaining users of the set_fs() infrasturcture for Arm. Changing it to not require set_fs() is rather complex unfortunately. The approach I'm taking here is to allow architectures to override the code that copies the output to user space, and let the oabi-compat implementation check whether it is getting called from an EABI or OABI system call based on the thread_info->syscall value. The in_oabi_syscall() check here mirrors the in_compat_syscall() and in_x32_syscall() helpers for 32-bit compat implementations on other architectures. Overall, the amount of code goes down, at least with the newly added sys_oabi_epoll_pwait() helper getting removed again. The downside is added complexity in the source code for the native implementation. There should be no difference in runtime performance except for Arm kernels with CONFIG_OABI_COMPAT enabled that now have to go through an external function call to check which of the two variants to use. Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-08-20ksmbd: don't set FILE DELETE and FILE_DELETE_CHILD in access mask by defaultNamjae Jeon1-2/+0
When there is no dacl in request, ksmbd send dacl that coverted by using file permission. This patch don't set FILE DELETE and FILE_DELETE_CHILD in access mask by default. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-08-19gfs2: tiny cleanup in gfs2_log_reserveBob Peterson1-1/+1
Function gfs2_log_reserve was setting revoke_blks to 0. There's no need because it calculates it shortly thereafter. This patch removes the unnecessary set. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-19gfs2: trivial clean up of gfs2_ail_errorBob Peterson1-4/+6
This patch does not change function. It adds variable sdp to clean up function gfs2_ail_error and make it more readable. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-19gfs2: be more verbose replaying invalid rgrp blocksBob Peterson1-15/+29
This patch adds some crucial information when journal replay detects a replay of an obsolete rgrp block. For example, it wasn't printing the journal id or the generation number played. This just supplements what is logged in this unusual case. The function that actually complains about the replaying of an obsolete rgrp block has been split off to avoid long lines and sparse warnings. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2021-08-19xfs: rename buffer cache index variable b_bnDave Chinner2-25/+12
To stop external users from using b_bn as the disk address of the buffer, rename it to b_rhash_key to indicate that it is the buffer cache index, not the block number of the buffer. Code that needs the disk address should use xfs_buf_daddr() to obtain it. Do the rename and clean up any of the remaining internal b_bn users. Also clean up any remaining b_bn cruft that is now unused. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: convert bp->b_bn references to xfs_buf_daddr()Dave Chinner21-61/+58
Stop directly referencing b_bn in code outside the buffer cache, as b_bn is supposed to be used only as an internal cache index. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: introduce xfs_buf_daddr()Dave Chinner15-22/+26
Introduce a helper function xfs_buf_daddr() to extract the disk address of the buffer from the struct xfs_buf. This will replace direct accesses to bp->b_bn and bp->b_maps[0].bm_bn, as well as the XFS_BUF_ADDR() macro. This patch introduces the helper function and replaces all uses of XFS_BUF_ADDR() as this is just a simple sed replacement. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: kill xfs_sb_version_has_v3inode()Dave Chinner4-19/+15
All callers to xfs_dinode_good_version() and XFS_DINODE_SIZE() in both the kernel and userspace have a xfs_mount structure available which means they can use mount features checks instead looking directly are the superblock. Convert these functions to take a mount and use a xfs_has_v3inodes() check and move it out of the libxfs/xfs_format.h file as it really doesn't have anything to do with the definition of the on-disk format. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: introduce xfs_sb_is_v5 helperDave Chinner5-38/+38
Rather than open coding XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 checks everywhere, add a simple wrapper to encapsulate this and make the code easier to read. This allows us to remove the xfs_sb_version_has_v3inode() wrapper which is only used in xfs_format.h now and is just a version number check. There are a couple of places where we should be checking the mount feature bits rather than the superblock version (e.g. remount), so those are converted to use xfs_has_crc(mp) instead. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: remove unused xfs_sb_version_has wrappersDave Chinner1-152/+3
The vast majority of these wrappers are now unused. Remove them leaving just the small subset of wrappers that are used to either add feature bits or make the mount features field setup code simpler. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: convert xfs_sb_version_has checks to use mount featuresDave Chinner34-96/+90
This is a conversion of the remaining xfs_sb_version_has..(sbp) checks to use xfs_has_..(mp) feature checks. This was largely done with a vim replacement macro that did: :0,$s/xfs_sb_version_has\(.*\)&\(.*\)->m_sb/xfs_has_\1\2/g<CR> A couple of other variants were also used, and the rest touched up by hand. $ size -t fs/xfs/built-in.a text data bss dec hex filename before 1127533 311352 484 1439369 15f689 (TOTALS) after 1125360 311352 484 1437196 15ee0c (TOTALS) Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: convert scrub to use mount-based feature checksDave Chinner2-7/+7
The scrub feature checks are the last place that the superblock feature checks are used. Convert them to mount based feature checks. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: open code sb verifier feature checksDave Chinner3-65/+81
The superblock verifiers are one of the last places that use the sb version functions to do feature checks. This are all quite simple uses, and there aren't many of them so open code them all. Also, move the good version number check into xfs_sb.c instead of it being an inline function in xfs_format.h Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: convert xfs_fs_geometry to use mount feature checksDave Chinner4-25/+27
Reporting filesystem features to userspace is currently superblock based. Now we have a general mount-based feature infrastructure, switch to using the xfs_mount rather than the superblock directly. This reduces the size of the function by over 300 bytes. $ size -t fs/xfs/built-in.a text data bss dec hex filename before 1127855 311352 484 1439691 15f7cb (TOTALS) after 1127535 311352 484 1439371 15f68b (TOTALS) Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: replace XFS_FORCED_SHUTDOWN with xfs_is_shutdownDave Chinner29-90/+90
Remove the shouty macro and instead use the inline function that matches other state/feature check wrapper naming. This conversion was done with sed. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: convert remaining mount flags to state flagsDave Chinner18-83/+82
The remaining mount flags kept in m_flags are actually runtime state flags. These change dynamically, so they really should be updated atomically so we don't potentially lose an update due to racing modifications. Convert these remaining flags to be stored in m_opstate and use atomic bitops to set and clear the flags. This also adds a couple of simple wrappers for common state checks - read only and shutdown. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: convert mount flags to featuresDave Chinner23-179/+152
Replace m_flags feature checks with xfs_has_<feature>() calls and rework the setup code to set flags in m_features. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: consolidate mount option features in m_featuresDave Chinner1-0/+44
This provides separation of mount time feature flags from runtime mount flags and mount option state. It also makes the feature checks use the same interface as the superblock features. i.e. we don't care if the feature is enabled by superblock flags or mount options, we just care if it's enabled or not. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: replace xfs_sb_version checks with feature flag checksDave Chinner75-267/+267
Convert the xfs_sb_version_hasfoo() to checks against mp->m_features. Checks of the superblock itself during disk operations (e.g. in the read/write verifiers and the to/from disk formatters) are not converted - they operate purely on the superblock state. Everything else should use the mount features. Large parts of this conversion were done with sed with commands like this: for f in `git grep -l xfs_sb_version_has fs/xfs/*.c`; do sed -i -e 's/xfs_sb_version_has\(.*\)(&\(.*\)->m_sb)/xfs_has_\1(\2)/' $f done With manual cleanups for things like "xfs_has_extflgbit" and other little inconsistencies in naming. The result is ia lot less typing to check features and an XFS binary size reduced by a bit over 3kB: $ size -t fs/xfs/built-in.a text data bss dec hex filenam before 1130866 311352 484 1442702 16038e (TOTALS) after 1127727 311352 484 1439563 15f74b (TOTALS) Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: reflect sb features in xfs_mountDave Chinner7-1/+149
Currently on-disk feature checks require decoding the superblock fileds and so can be non-trivial. We have almost 400 hundred individual feature checks in the XFS code, so this is a significant amount of code. To reduce runtime check overhead, pre-process all the version flags into a features field in the xfs_mount at mount time so we can convert all the feature checks to a simple flag check. There is also a need to convert the dynamic feature flags to update the m_features field. This is required for attr, attr2 and quota features. New xfs_mount based wrappers are added for this. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: rework attr2 feature and mount optionsDave Chinner3-33/+17
The attr2 feature is somewhat unique in that it has both a superblock feature bit to enable it and mount options to enable and disable it. Back when it was first introduced in 2005, attr2 was disabled unless either the attr2 superblock feature bit was set, or the attr2 mount option was set. If the superblock feature bit was not set but the mount option was set, then when the first attr2 format inode fork was created, it would set the superblock feature bit. This is as it should be - the superblock feature bit indicated the presence of the attr2 on disk format. The noattr2 mount option, however, did not affect the superblock feature bit. If noattr2 was specified, the on-disk superblock feature bit was ignored and the code always just created attr1 format inode forks. If neither of the attr2 or noattr2 mounts option were specified, then the behaviour was determined by the superblock feature bit. This was all pretty sane. Fast foward 3 years, and we are dealing with fallout from the botched sb_features2 addition and having to deal with feature mismatches between the sb_features2 and sb_bad_features2 fields. The attr2 feature bit was one of these flags. The reconciliation was done well after mount option parsing and, unfortunately, the feature reconciliation had a bug where it ignored the noattr2 mount option. For reasons lost to the mists of time, it was decided that resolving this issue in commit 7c12f296500e ("[XFS] Fix up noattr2 so that it will properly update the versionnum and features2 fields.") required noattr2 to clear the superblock attr2 feature bit. This greatly complicated the attr2 behaviour and broke rules about feature bits needing to be set when those specific features are present in the filesystem. By complicated, I mean that it introduced problems due to feature bit interactions with log recovery. All of the superblock feature bit checks are done prior to log recovery, but if we crash after removing a feature bit, then on the next mount we see the feature bit in the unrecovered superblock, only to have it go away after the log has been replayed. This means our mount time feature processing could be all wrong. Hence you can mount with noattr2, crash shortly afterwards, and mount again without attr2 or noattr2 and still have attr2 enabled because the second mount sees attr2 still enabled in the superblock before recovery runs and removes the feature bit. It's just a mess. Further, this is all legacy code as the v5 format requires attr2 to be enabled at all times and it cannot be disabled. i.e. the noattr2 mount option returns an error when used on v5 format filesystems. To straighten this all out, this patch reverts the attr2/noattr2 mount option behaviour back to the original behaviour. There is no reason for disabling attr2 these days, so we will only do this when the noattr2 mount option is set. This will not remove the superblock feature bit. The superblock bit will provide the default behaviour and only track whether attr2 is present on disk or not. The attr2 mount option will enable the creation of attr2 format inode forks, and if the superblock feature bit is not set it will be added when the first attr2 inode fork is created. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: rename xfs_has_attr()Dave Chinner2-5/+3
xfs_has_attr() is poorly named. It has global scope as it is defined in a header file, but it has no namespace scope that tells us what it is checking has attributes. It's not even clear what "has_attr" means, because what it is actually doing is an attribute fork lookup to see if the attribute exists. Upcoming patches use this "xfs_has_<foo>" namespace for global filesystem features, which conflicts with this function. Rename xfs_has_attr() to xfs_attr_lookup() and make it a static function, freeing up the "xfs_has_" namespace for global scope usage. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: sb verifier doesn't handle uncached sb bufferDave Chinner2-2/+7
The verifier checks explicitly for bp->b_bn == XFS_SB_DADDR to match the primary superblock buffer, but the primary superblock is an uncached buffer and so bp->b_bn is always -1ULL. Hence this never matches and the CRC error reporting is wholly dependent on the mount superblock already being populated so CRC feature checks pass and allow CRC errors to be reported. Fix this so that the primary superblock CRC error reporting is not dependent on already having read the superblock into memory. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-19xfs: start documenting common units and tags used in tracepointsDarrick J. Wong2-0/+39
Because there are a lot of tracepoints that express numeric data with an associated unit and tag, document what they are to help everyone else keep these thigns straight. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2021-08-19xfs: decode scrub flags in ftrace outputDarrick J. Wong1-2/+12
When using pretty-printed scrub tracepoints, decode the meaning of the scrub flags as strings for easier reading. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2021-08-19xfs: standardize inode generation formatting in ftrace outputDarrick J. Wong2-2/+2
Always print inode generation in hexadecimal and preceded with the unit "gen". Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2021-08-19xfs: standardize remaining xfs_buf length tracepointsDarrick J. Wong1-12/+12
For the remaining xfs_buf tracepoints, convert all the tags to xfs_daddr_t units and retag them 'daddrcount' to match everything else. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2021-08-19xfs: resolve fork names in trace outputDarrick J. Wong3-11/+16
Emit whichfork values as text strings in the ftrace output. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2021-08-19xfs: rename i_disk_size fields in ftrace outputDarrick J. Wong1-8/+6
Whenever we record i_disk_size (i.e. the ondisk file size), use the "disize" tag and hexadecimal format consistently. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2021-08-19xfs: disambiguate units for ftrace fields tagged "count"Darrick J. Wong1-6/+6
Some of our tracepoints have a field known as "count". That name doesn't describe any units, which makes the fields not very useful. Rename the fields to capture units and ensure the format is hexadecimal when we're referring to blocks, extents, or IO operations. "fsbcount" are in units of fs blocks "bytecount" are in units of bytes Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>