summaryrefslogtreecommitdiff
path: root/fs/gfs2
AgeCommit message (Collapse)AuthorFilesLines
2026-03-04gfs2: fiemap page fault fixAndreas Gruenbacher1-0/+16
[ Upstream commit e411d74cc5ba290f85d0dd5e4d1df8f1d6d975d2 ] In gfs2_fiemap(), we are calling iomap_fiemap() while holding the inode glock. This can lead to recursive glock taking if the fiemap buffer is memory mapped to the same inode and accessing it triggers a page fault. Fix by disabling page faults for iomap_fiemap() and faulting in the buffer by hand if necessary. Fixes xfstest generic/742. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04gfs2: Fix use-after-free in iomap inline data write pathDeepanshu Kartikey1-1/+12
[ Upstream commit faddeb848305e79db89ee0479bb0e33380656321 ] The inline data buffer head (dibh) is being released prematurely in gfs2_iomap_begin() via release_metapath() while iomap->inline_data still points to dibh->b_data. This causes a use-after-free when iomap_write_end_inline() later attempts to write to the inline data area. The bug sequence: 1. gfs2_iomap_begin() calls gfs2_meta_inode_buffer() to read inode metadata into dibh 2. Sets iomap->inline_data = dibh->b_data + sizeof(struct gfs2_dinode) 3. Calls release_metapath() which calls brelse(dibh), dropping refcount to 0 4. kswapd reclaims the page (~39ms later in the syzbot report) 5. iomap_write_end_inline() tries to memcpy() to iomap->inline_data 6. KASAN detects use-after-free write to freed memory Fix by storing dibh in iomap->private and incrementing its refcount with get_bh() in gfs2_iomap_begin(). The buffer is then properly released in gfs2_iomap_end() after the inline write completes, ensuring the page stays alive for the entire iomap operation. Note: A C reproducer is not available for this issue. The fix is based on analysis of the KASAN report and code review showing the buffer head is freed before use. [agruenba: Take buffer head reference in gfs2_iomap_begin() to avoid leaks in gfs2_iomap_get() and gfs2_iomap_alloc().] Reported-by: syzbot+ea1cd4aa4d1e98458a55@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ea1cd4aa4d1e98458a55 Fixes: d0a22a4b03b8 ("gfs2: Fix iomap write page reclaim deadlock") Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04gfs2: Add metapath_dibh helperAndreas Gruenbacher1-1/+7
[ Upstream commit 92099f0c92270c8c7a79e6bc6e0312ad248ea331 ] Add a metapath_dibh() helper for extracting the inode's buffer head from a metapath. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: faddeb848305 ("gfs2: Fix use-after-free in iomap inline data write path") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04gfs2: Retries missing in gfs2_{rename,exchange}Andreas Gruenbacher3-14/+43
[ Upstream commit 11d763f0b0afc2cf5f92f4adae5dbbbbef712f8f ] Fix a bug in gfs2's asynchronous glock handling for rename and exchange operations. The original async implementation from commit ad26967b9afa ("gfs2: Use async glocks for rename") mentioned that retries were needed but never implemented them, causing operations to fail with -ESTALE instead of retrying on timeout. Also makes the waiting interruptible. In addition, the timeouts used were too high for situations in which timing out is a rare but expected scenario. Switch to shorter timeouts with randomization and exponentional backoff. Fixes: ad26967b9afa ("gfs2: Use async glocks for rename") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-01-30Revert "gfs2: Fix use of bio_chain"Andreas Gruenbacher1-1/+1
commit 469d71512d135907bf5ea0972dfab8c420f57848 upstream. This reverts commit 8a157e0a0aa5143b5d94201508c0ca1bb8cfb941. That commit incorrectly assumed that the bio_chain() arguments were swapped in gfs2. However, gfs2 intentionally constructs bio chains so that the first bio's bi_end_io callback is invoked when all bios in the chain have completed, unlike bio chains where the last bio's callback is invoked. Fixes: 8a157e0a0aa5 ("gfs2: Fix use of bio_chain") Cc: stable@vger.kernel.org Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-11gfs2: fix freeze error handlingAlexey Velichayshiy1-3/+1
[ Upstream commit 4cfc7d5a4a01d2133b278cdbb1371fba1b419174 ] After commit b77b4a4815a9 ("gfs2: Rework freeze / thaw logic"), the freeze error handling is broken because gfs2_do_thaw() overwrites the 'error' variable, causing incorrect processing of the original freeze error. Fix this by calling gfs2_do_thaw() when gfs2_lock_fs_check_clean() fails but ignoring its return value to preserve the original freeze error for proper reporting. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: b77b4a4815a9 ("gfs2: Rework freeze / thaw logic") Cc: stable@vger.kernel.org # v6.5+ Signed-off-by: Alexey Velichayshiy <a.velichayshiy@ispras.ru> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> [ gfs2_do_thaw() only takes one param ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-11gfs2: Fix use of bio_chainAndreas Gruenbacher1-1/+1
[ Upstream commit 8a157e0a0aa5143b5d94201508c0ca1bb8cfb941 ] In gfs2_chain_bio(), the call to bio_chain() has its arguments swapped. The result is leaked bios and incorrect synchronization (only the last bio will actually be waited for). This code is only used during mount and filesystem thaw, so the bug normally won't be noticeable. Reported-by: Stephen Zhang <starzhangzsd@gmail.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-01-11gfs2: fix remote evict for read-only filesystemsAndreas Gruenbacher1-2/+1
[ Upstream commit 64c10ed9274bc46416f502afea48b4ae11279669 ] When a node tries to delete an inode, it first requests exclusive access to the iopen glock. This triggers demote requests on all remote nodes currently holding the iopen glock. To satisfy those requests, the remote nodes evict the inode in question, or they poke the corresponding inode glock to signal that the inode is still in active use. This behavior doesn't depend on whether or not a filesystem is read-only, so remove the incorrect read-only check. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-15gfs2: Fix GLF_INVALIDATE_IN_PROGRESS flag clearing in do_xmoteAndreas Gruenbacher1-2/+0
[ Upstream commit 061df28b82af6b22fb5fa529a8f2ef00474ee004 ] Commit 865cc3e9cc0b ("gfs2: fix a deadlock on withdraw-during-mount") added a statement to do_xmote() to clear the GLF_INVALIDATE_IN_PROGRESS flag a second time after it has already been cleared. Fix that. Fixes: 865cc3e9cc0b ("gfs2: fix a deadlock on withdraw-during-mount") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Andrew Price <anprice@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28gfs2: Set .migrate_folio in gfs2_{rgrp,meta}_aopsAndrew Price1-0/+2
[ Upstream commit 5c8f12cf1e64e0e8e6cb80b0c935389973e8be8d ] Clears up the warning added in 7ee3647243e5 ("migrate: Remove call to ->writepage") that occurs in various xfstests, causing "something found in dmesg" failures. [ 341.136573] gfs2_meta_aops does not implement migrate_folio [ 341.136953] WARNING: CPU: 1 PID: 36 at mm/migrate.c:944 move_to_new_folio+0x2f8/0x300 Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15gfs2: No more self recoveryAndreas Gruenbacher1-20/+11
[ Upstream commit deb016c1669002e48c431d6fd32ea1c20ef41756 ] When a node withdraws and it turns out that it is the only node that has the filesystem mounted, gfs2 currently tries to replay the local journal to bring the filesystem back into a consistent state. Not only is that a very bad idea, it has also never worked because gfs2_recover_func() will refuse to do anything during a withdraw. However, before even getting to this point, gfs2_recover_func() dereferences sdp->sd_jdesc->jd_inode. This was a use-after-free before commit 04133b607a78 ("gfs2: Prevent double iput for journal on error") and is a NULL pointer dereference since then. Simply get rid of self recovery to fix that. Fixes: 601ef0d52e96 ("gfs2: Force withdraw to replay journals and wait for it to finish") Reported-by: Chunjie Zhu <chunjie.zhu@cloud.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-27gfs2: move msleep to sleepable contextAlexander Aring1-1/+2
commit ac5ee087d31ed93b6e45d2968a66828c6f621d8c upstream. This patch moves the msleep_interruptible() out of the non-sleepable context by moving the ls->ls_recover_spin spinlock around so msleep_interruptible() will be called in a sleepable context. Cc: stable@vger.kernel.org Fixes: 4a7727725dc7 ("GFS2: Fix recovery issues for spectators") Suggested-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19gfs2: gfs2_create_inode error handling fixAndreas Gruenbacher1-1/+2
[ Upstream commit af4044fd0b77e915736527dd83011e46e6415f01 ] When gfs2_create_inode() finds a directory, make sure to return -EISDIR. Fixes: 571a4b57975a ("GFS2: bugger off early if O_CREAT open finds a directory") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-04gfs2: Check for empty queue in run_queueAndreas Gruenbacher1-3/+8
[ Upstream commit d838605fea6eabae3746a276fd448f6719eb3926 ] In run_queue(), check if the queue of pending requests is empty instead of blindly assuming that it won't be. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-01gfs2: Truncate address space when flipping GFS2_DIF_JDATA flagAndreas Gruenbacher1-0/+1
commit 7c9d9223802fbed4dee1ae301661bf346964c9d2 upstream. Truncate an inode's address space when flipping the GFS2_DIF_JDATA flag: depending on that flag, the pages in the address space will either use buffer heads or iomap_folio_state structs, and we cannot mix the two. Reported-by: Kun Hu <huk23@m.fudan.edu.cn>, Jiaji Qin <jjtan24@m.fudan.edu.cn> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-14KMSAN: uninit-value in inode_go_dump (5)Qianqiang Liu1-0/+2
[ Upstream commit f9417fcfca3c5e30a0b961e7250fab92cfa5d123 ] When mounting of a corrupted disk image fails, the error message printed can reference uninitialized inode fields. To prevent that from happening, always initialize those fields. Reported-by: syzbot+aa0730b0a42646eb1359@syzkaller.appspotmail.com Signed-off-by: Qianqiang Liu <qianqiang.liu@163.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-09gfs2: Remove and replace gfs2_glock_queue_workAndreas Gruenbacher1-20/+15
[ Upstream commit 1e86044402c45b70a9b31beeaefb5cc732a7470c ] There are no more callers of gfs2_glock_queue_work() left, so remove that helper. With that, we can now rename __gfs2_glock_queue_work() back to gfs2_glock_queue_work() to get rid of some unnecessary clutter. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-09gfs2: Don't set GLF_LOCK in gfs2_dispose_glock_lruAndreas Gruenbacher1-2/+1
[ Upstream commit 927cfc90d27cb7732a62464f95fd9aa7edfa9b70 ] In gfs2_dispose_glock_lru(), we want to skip glocks which are in the process of transitioning state (as indicated by the set GLF_LOCK flag), but we we don't need to set that flag for requesting a state transition. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 1e86044402c4 ("gfs2: Remove and replace gfs2_glock_queue_work") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-09gfs2: Fix unlinked inode cleanupAndreas Gruenbacher4-3/+4
[ Upstream commit 7c6f714d88475ceae5342264858a641eafa19632 ] Before commit f0e56edc2ec7 ("gfs2: Split the two kinds of glock "delete" work"), function delete_work_func() was used to trigger the eviction of in-memory inodes from remote as well as deleting unlinked inodes at a later point. These two kinds of work were then split into two kinds of work, and the two places in the code were deferred deletion of inodes is required accidentally ended up queuing the wrong kind of work. This caused unlinked inodes to be left behind, which could in the worst case fill up filesystems and require a filesystem check to recover. Fix that by queuing the right kind of work in try_rgrp_unlink() and gfs2_drop_inode(). Fixes: f0e56edc2ec7 ("gfs2: Split the two kinds of glock "delete" work") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-09gfs2: Allow immediate GLF_VERIFY_DELETE workAndreas Gruenbacher1-5/+6
[ Upstream commit 160bc9555d8654464cbbd7bb1f6687048471d2f6 ] Add an argument to gfs2_queue_verify_delete() that allows it to queue GLF_VERIFY_DELETE work for immediate execution. This is used in the next patch. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 7c6f714d8847 ("gfs2: Fix unlinked inode cleanup") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-09gfs2: Rename GLF_VERIFY_EVICT to GLF_VERIFY_DELETEAndreas Gruenbacher2-8/+8
[ Upstream commit 820ce8ed53ce2111aa5171f7349f289d7e9d0693 ] Rename the GLF_VERIFY_EVICT flag to GLF_VERIFY_DELETE: that flag indicates that we want to delete an inode / verify that it has been deleted. To match, rename gfs2_queue_verify_evict() to gfs2_queue_verify_delete(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 7c6f714d8847 ("gfs2: Fix unlinked inode cleanup") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-09gfs2: Replace gfs2_glock_queue_put with gfs2_glock_put_asyncAndreas Gruenbacher4-14/+21
[ Upstream commit ee2be7d7c7f32783f60ee5fe59b91548a4571f10 ] Function gfs2_glock_queue_put() puts a glock reference by enqueuing glock work instead of putting the reference directly. This ensures that the operation won't sleep, but it is costly and really only necessary when putting the final glock reference. Replace it with a new gfs2_glock_put_async() function that only queues glock work when putting the last glock reference. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 7c6f714d8847 ("gfs2: Fix unlinked inode cleanup") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-09gfs2: Get rid of gfs2_glock_queue_put in signal_our_withdrawAndreas Gruenbacher1-1/+1
[ Upstream commit f80d882edcf242d0256d9e51b09d5fb7a3a0d3b4 ] In function signal_our_withdraw(), we are calling gfs2_glock_queue_put() in a context in which we are actually allowed to sleep, so replace that with a simple call to gfs2_glock_put(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 7c6f714d8847 ("gfs2: Fix unlinked inode cleanup") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17gfs2: Revert "ignore negated quota changes"Andreas Gruenbacher1-11/+0
[ Upstream commit 4b4b6374dc6134849f2bdca81fa2945b6ed6d9fc ] Commit 4c6a08125f22 ("gfs2: ignore negated quota changes") skips quota changes with qd_change == 0 instead of writing them back, which leaves behind non-zero qd_change values in the affected slots. The kernel then assumes that those slots are unused, while the qd_change values on disk indicate that they are indeed still in use. The next time the filesystem is mounted, those invalid slots are read in from disk, which will cause inconsistencies. Revert that commit to avoid filesystem corruption. This reverts commit 4c6a08125f2249531ec01783a5f4317d7342add5. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17gfs2: qd_check_sync cleanupsAndreas Gruenbacher1-18/+22
[ Upstream commit 59ebc33201237bf38e5adca3794716100660c5b4 ] Rename qd_check_sync() to qd_grab_sync() and make it return a bool. Turn the sync_gen pointer into a regular u64 and pass in U64_MAX instead of a NULL pointer when sync generation checking isn't needed. Introduce a new qd_ungrab_sync() helper for undoing the effects of qd_grab_sync() if the subsequent bh_get() on the qd object fails. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 4b4b6374dc61 ("gfs2: Revert "ignore negated quota changes"") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17gfs2: Revert "introduce qd_bh_get_or_undo"Andreas Gruenbacher1-19/+17
[ Upstream commit 2aedfe847b4d91eabee11a44c27244055cef4eb3 ] The qd_bh_get_or_undo() helper introduced by that commit doesn't improve the code much, so revert it and clean things up in a more useful way in the next commit. This reverts commit 7dbc6ae60dd7089d8ed42892b6a66c138f0aa7a0. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 4b4b6374dc61 ("gfs2: Revert "ignore negated quota changes"") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08gfs2: Revert "Add quota_change type"Andreas Gruenbacher2-15/+10
[ Upstream commit ec4b5200c8af9ce021399d3192b3379c089396c3 ] Commit 432928c93779 ("gfs2: Add quota_change type") makes the incorrect assertion that function do_qc() should behave differently in the two contexts it is used in, but that isn't actually true. In all cases, do_qc() grabs a "reference" when it starts using a slot in the per-node quota changes file, and it releases that "reference" when no more residual changes remain. Revert that broken commit. There are some remaining issues with function do_qc() which are addressed in the next commit. This reverts commit 432928c9377959684c748a9bc6553ed2d3c2ea4f. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-29gfs2: Refcounting fix in gfs2_thaw_superAndreas Gruenbacher1-0/+2
[ Upstream commit 4e58543e7da4859c4ba61d15493e3522b6ad71fd ] It turns out that the .freeze_super and .thaw_super operations require the filesystem to manage the superblock refcount itself. We are using the freeze_super() and thaw_super() helpers to mostly take care of that for us, but this means that the superblock may no longer be around by when thaw_super() returns, and gfs2_thaw_super() will then access freed memory. Take an extra superblock reference in gfs2_thaw_super() to fix that. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-29gfs2: setattr_chown: Add missing initializationAndreas Gruenbacher1-1/+1
[ Upstream commit 2d8d7990619878a848b1d916c2f936d3012ee17d ] Add a missing initialization of variable ap in setattr_chown(). Without, chown() may be able to bypass quotas. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05gfs2: Fix NULL pointer dereference in gfs2_log_flushAndreas Gruenbacher2-1/+6
[ Upstream commit 35264909e9d1973ab9aaa2a1b07cda70f12bb828 ] In gfs2_jindex_free(), set sdp->sd_jdesc to NULL under the log flush lock to provide exclusion against gfs2_log_flush(). In gfs2_log_flush(), check if sdp->sd_jdesc is non-NULL before dereferencing it. Otherwise, we could run into a NULL pointer dereference when outstanding glock work races with an unmount (glock_work_func -> run_queue -> do_xmote -> inode_go_sync -> gfs2_log_flush). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12kthread: add kthread_stop_putAndreas Gruenbacher1-6/+3
[ Upstream commit 6309727ef27162deabd5c095c11af24970fba5a2 ] Add a kthread_stop_put() helper that stops a thread and puts its task struct. Use it to replace the various instances of kthread_stop() followed by put_task_struct(). Remove the kthread_stop_put() macro in usbip that is similar but doesn't return the result of kthread_stop(). [agruenba@redhat.com: fix kerneldoc comment] Link: https://lkml.kernel.org/r/20230911111730.2565537-1-agruenba@redhat.com [akpm@linux-foundation.org: document kthread_stop_put()'s argument] Link: https://lkml.kernel.org/r/20230907234048.2499820-1-agruenba@redhat.com Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: bb9025f4432f ("dma-mapping: benchmark: fix up kthread-related error handling") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: do_xmote fixesAndreas Gruenbacher1-19/+25
[ Upstream commit 9947a06d29c0a30da88cdc6376ca5fd87083e130 ] Function do_xmote() is called with the glock spinlock held. Commit 86934198eefa added a 'goto skip_inval' statement at the beginning of the function to further below where the glock spinlock is expected not to be held anymore. Then it added code there that requires the glock spinlock to be held. This doesn't make sense; fix this up by dropping and retaking the spinlock where needed. In addition, when ->lm_lock() returned an error, do_xmote() didn't fail the locking operation, and simply left the glock hanging; fix that as well. (This is a much older error.) Fixes: 86934198eefa ("gfs2: Clear flags when withdraw prevents xmote") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: finish_xmote cleanupAndreas Gruenbacher1-8/+13
[ Upstream commit 1cd28e15864054f3c48baee9eecda1c0441c48ac ] Currently, function finish_xmote() takes and releases the glock spinlock. However, all of its callers immediately take that spinlock again, so it makes more sense to take the spin lock before calling finish_xmote() already. With that, thaw_glock() is the only place that sets the GLF_HAVE_REPLY flag outside of the glock spinlock, but it also takes that spinlock immediately thereafter. Change that to set the bit when the spinlock is already held. This allows to switch from test_and_clear_bit() to test_bit() and clear_bit() in glock_work_func(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 9947a06d29c0 ("gfs2: do_xmote fixes") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Rename gfs2_withdrawn to gfs2_withdrawing_or_withdrawnAndreas Gruenbacher15-41/+46
[ Upstream commit 4d927b03a68846e4e791ccde6b4c274df02f11e9 ] This function checks whether the filesystem has been been marked to be withdrawn eventually or has been withdrawn already. Rename this function to avoid confusing code like checking for gfs2_withdrawing() when gfs2_withdrawn() has already returned true. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 9947a06d29c0 ("gfs2: do_xmote fixes") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Mark withdraws as unlikelyAndreas Gruenbacher8-15/+15
[ Upstream commit 015af1af44003fff797f8632e940824c07d282bf ] Mark the gfs2_withdrawn(), gfs2_withdrawing(), and gfs2_withdraw_in_prog() inline functions as likely to return %false. This allows to get rid of likely() and unlikely() annotations at the call sites of those functions. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: 9947a06d29c0 ("gfs2: do_xmote fixes") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Fix potential glock use-after-free on unmountAndreas Gruenbacher6-16/+57
[ Upstream commit d98779e687726d8f8860f1c54b5687eec5f63a73 ] When a DLM lockspace is released and there ares still locks in that lockspace, DLM will unlock those locks automatically. Commit fb6791d100d1b started exploiting this behavior to speed up filesystem unmount: gfs2 would simply free glocks it didn't want to unlock and then release the lockspace. This didn't take the bast callbacks for asynchronous lock contention notifications into account, which remain active until until a lock is unlocked or its lockspace is released. To prevent those callbacks from accessing deallocated objects, put the glocks that should not be unlocked on the sd_dead_glocks list, release the lockspace, and only then free those glocks. As an additional measure, ignore unexpected ast and bast callbacks if the receiving glock is dead. Fixes: fb6791d100d1b ("GFS2: skip dlm_unlock calls in unmount") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: David Teigland <teigland@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Remove ill-placed consistency checkAndreas Gruenbacher1-1/+0
[ Upstream commit 59f60005797b4018d7b46620037e0c53d690795e ] This consistency check was originally added by commit 9287c6452d2b1 ("gfs2: Fix occasional glock use-after-free"). It is ill-placed in gfs2_glock_free() because if it holds there, it must equally hold in __gfs2_glock_put() already. Either way, the check doesn't seem necessary anymore. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: d98779e68772 ("gfs2: Fix potential glock use-after-free on unmount") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: No longer use 'extern' in function declarationsAndreas Gruenbacher19-285/+289
[ Upstream commit 0b2355fe91ac3756a9e29c8b833ba33f9affb520 ] For non-static function declarations, external linkage is implied and the 'extern' keyword isn't needed. Some static checkers complain about the overuse of 'extern', so clean up all the function declarations. In addition, remove 'extern' from the definition of free_local_statfs_inodes(); it isn't needed there, either. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: d98779e68772 ("gfs2: Fix potential glock use-after-free on unmount") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Rename gfs2_lookup_{ simple => meta }Andreas Gruenbacher3-15/+16
[ Upstream commit 062fb903895a035ed382a0d3f9b9d459b2718217 ] Function gfs2_lookup_simple() is used for looking up inodes in the metadata directory tree, so rename it to gfs2_lookup_meta() to closer match its purpose. Clean the function up a little on the way. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: d98779e68772 ("gfs2: Fix potential glock use-after-free on unmount") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Convert gfs2_internal_read to foliosAndreas Gruenbacher2-20/+18
[ Upstream commit be7f6a6b0bca708999eef4f8e9f2b128c73b9e17 ] Change gfs2_internal_read() to use folios. Convert sizes to size_t. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: d98779e68772 ("gfs2: Fix potential glock use-after-free on unmount") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Get rid of gfs2_alloc_blocks generation parameterAndreas Gruenbacher6-13/+15
[ Upstream commit 4c7b3f7fb7c8c66d669d107e717f9de41ef81e92 ] Get rid of the generation parameter of gfs2_alloc_blocks(): we only ever set the generation of the current inode while creating it, so do so directly. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Stable-dep-of: d98779e68772 ("gfs2: Fix potential glock use-after-free on unmount") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Fix "ignore unlock failures after withdraw"Andreas Gruenbacher2-2/+3
[ Upstream commit 5d9231111966b6c5a65016d58dcbeab91055bc91 ] Commit 3e11e53041502 tries to suppress dlm_lock() lock conversion errors that occur when the lockspace has already been released. It does that by setting and checking the SDF_SKIP_DLM_UNLOCK flag. This conflicts with the intended meaning of the SDF_SKIP_DLM_UNLOCK flag, so check whether the lockspace is still allocated instead. (Given the current DLM API, checking for this kind of error after the fact seems easier that than to make sure that the lockspace is still allocated before calling dlm_lock(). Changing the DLM API so that users maintain the lockspace references themselves would be an option.) Fixes: 3e11e53041502 ("GFS2: ignore unlock failures after withdraw") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12gfs2: Don't forget to complete delayed withdrawAndreas Gruenbacher1-0/+3
[ Upstream commit b01189333ee91c1ae6cd96dfd1e3a3c2e69202f0 ] Commit fffe9bee14b0 ("gfs2: Delay withdraw from atomic context") switched from gfs2_withdraw() to gfs2_withdraw_delayed() in gfs2_ail_error(), but failed to then check if a delayed withdraw had occurred. Fix that by adding the missing check in __gfs2_ail_flush(), where the spin locks are already dropped and a withdraw is possible. Fixes: fffe9bee14b0 ("gfs2: Delay withdraw from atomic context") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17gfs2: Fix invalid metadata access in punch_holeAndrew Price1-2/+3
[ Upstream commit c95346ac918c5badf51b9a7ac58a26d3bd5bb224 ] In punch_hole(), when the offset lies in the final block for a given height, there is no hole to punch, but the maximum size check fails to detect that. Consequently, punch_hole() will try to punch a hole beyond the end of the metadata and fail. Fix the maximum size check. Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-26gfs2: fix kernel BUG in gfs2_quota_cleanupEdward Adam Davis1-1/+2
[ Upstream commit 71733b4922007500ae259af9e96017080f5d36d9 ] [Syz report] kernel BUG at fs/gfs2/quota.c:1508! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 5060 Comm: syz-executor505 Not tainted 6.7.0-rc3-syzkaller-00134-g994d5c58e50e #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023 RIP: 0010:gfs2_quota_cleanup+0x6b5/0x6c0 fs/gfs2/quota.c:1508 Code: fe e9 cf fd ff ff 44 89 e9 80 e1 07 80 c1 03 38 c1 0f 8c 2d fe ff ff 4c 89 ef e8 b6 19 23 fe e9 20 fe ff ff e8 ec 11 c7 fd 90 <0f> 0b e8 84 9c 4f 07 0f 1f 40 00 66 0f 1f 00 55 41 57 41 56 41 54 RSP: 0018:ffffc9000409f9e0 EFLAGS: 00010293 RAX: ffffffff83c76854 RBX: 0000000000000002 RCX: ffff888026001dc0 RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000 RBP: ffffc9000409fb00 R08: ffffffff83c762b0 R09: 1ffff1100fd38015 R10: dffffc0000000000 R11: ffffed100fd38016 R12: dffffc0000000000 R13: ffff88807e9c0828 R14: ffff888014693580 R15: ffff88807e9c0000 FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f16d1bd70f8 CR3: 0000000027199000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> gfs2_put_super+0x2e1/0x940 fs/gfs2/super.c:611 generic_shutdown_super+0x13a/0x2c0 fs/super.c:696 kill_block_super+0x44/0x90 fs/super.c:1667 deactivate_locked_super+0xc1/0x130 fs/super.c:484 cleanup_mnt+0x426/0x4c0 fs/namespace.c:1256 task_work_run+0x24a/0x300 kernel/task_work.c:180 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0xa34/0x2750 kernel/exit.c:871 do_group_exit+0x206/0x2c0 kernel/exit.c:1021 __do_sys_exit_group kernel/exit.c:1032 [inline] __se_sys_exit_group kernel/exit.c:1030 [inline] __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1030 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x45/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b ... [pid 5060] fsconfig(4, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0) = 0 [pid 5060] exit_group(1) = ? ... [Analysis] When the task exits, it will execute cleanup_mnt() to recycle the mounted gfs2 file system, but it performs a system call fsconfig(4, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0) before executing the task exit operation. This will execute the following kernel path to complete the setting of SDF_JOURNAL_LIVE for sd_flags: SYSCALL_DEFINE5(fsconfig, ..)-> vfs_fsconfig_locked()-> vfs_cmd_reconfigure()-> gfs2_reconfigure()-> gfs2_make_fs_rw()-> set_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags); [Fix] Add SDF_NORECOVERY check in gfs2_quota_cleanup() to avoid checking SDF_JOURNAL_LIVE on the path where gfs2 is being unmounted. Reported-and-tested-by: syzbot+3b6e67ac2b646da57862@syzkaller.appspotmail.com Fixes: f66af88e3321 ("gfs2: Stop using gfs2_make_fs_ro for withdraw") Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-26gfs2: Fix kernel NULL pointer dereference in gfs2_rgrp_dumpOsama Muhammad1-1/+1
[ Upstream commit 8877243beafa7c6bfc42022cbfdf9e39b25bd4fa ] Syzkaller has reported a NULL pointer dereference when accessing rgd->rd_rgl in gfs2_rgrp_dump(). This can happen when creating rgd->rd_gl fails in read_rindex_entry(). Add a NULL pointer check in gfs2_rgrp_dump() to prevent that. Reported-and-tested-by: syzbot+da0fc229cc1ff4bb2e6d@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=da0fc229cc1ff4bb2e6d Fixes: 72244b6bc752 ("gfs2: improve debug information when lvb mismatches are found") Signed-off-by: Osama Muhammad <osmtendev@gmail.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28gfs2: don't withdraw if init_threads() got interruptedAndreas Gruenbacher1-3/+1
commit 0cdc6f44e9fdc2d20d720145bf99a39f611f6d61 upstream. In gfs2_fill_super(), when mounting a gfs2 filesystem is interrupted, kthread_create() can return -EINTR. When that happens, we roll back what has already been done and abort the mount. Since commit 62dd0f98a0e5 ("gfs2: Flag a withdraw if init_threads() fails), we are calling gfs2_withdraw_delayed() in gfs2_fill_super(); first via gfs2_make_fs_rw(), then directly. But gfs2_withdraw_delayed() only marks the filesystem as withdrawing and relies on a caller further up the stack to do the actual withdraw, which doesn't exist in the gfs2_fill_super() case. Because the filesystem is marked as withdrawing / withdrawn, function gfs2_lm_unmount() doesn't release the dlm lockspace, so when we try to mount that filesystem again, we get: gfs2: fsid=gohan:gohan0: Trying to join cluster "lock_dlm", "gohan:gohan0" gfs2: fsid=gohan:gohan0: dlm_new_lockspace error -17 Since commit b77b4a4815a9 ("gfs2: Rework freeze / thaw logic"), the deadlock this gfs2_withdraw_delayed() call was supposed to work around cannot occur anymore because freeze_go_callback() won't take the sb->s_umount semaphore unconditionally anymore, so we can get rid of the gfs2_withdraw_delayed() in gfs2_fill_super() entirely. Reported-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: stable@vger.kernel.org # v6.5+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28gfs2: Silence "suspicious RCU usage in gfs2_permission" warningAndreas Gruenbacher1-3/+4
[ Upstream commit 074d7306a4fe22fcac0b53f699f92757ab1cee99 ] Commit 0abd1557e21c added rcu_dereference() for dereferencing ip->i_gl in gfs2_permission. This now causes lockdep to complain when gfs2_permission is called in non-RCU context: WARNING: suspicious RCU usage in gfs2_permission Switch to rcu_dereference_check() and check for the MAY_NOT_BLOCK flag to shut up lockdep when we know that dereferencing ip->i_gl is safe. Fixes: 0abd1557e21c ("gfs2: fix an oops in gfs2_permission") Reported-by: syzbot+3e5130844b0c0e2b4948@syzkaller.appspotmail.com Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28gfs2: Fix slab-use-after-free in gfs2_qd_deallocJuntong Deng1-4/+6
[ Upstream commit bdcb8aa434c6d36b5c215d02a9ef07551be25a37 ] In gfs2_put_super(), whether withdrawn or not, the quota should be cleaned up by gfs2_quota_cleanup(). Otherwise, struct gfs2_sbd will be freed before gfs2_qd_dealloc (rcu callback) has run for all gfs2_quota_data objects, resulting in use-after-free. Also, gfs2_destroy_threads() and gfs2_quota_cleanup() is already called by gfs2_make_fs_ro(), so in gfs2_put_super(), after calling gfs2_make_fs_ro(), there is no need to call them again. Reported-by: syzbot+29c47e9e51895928698c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=29c47e9e51895928698c Signed-off-by: Juntong Deng <juntong.deng@outlook.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28gfs2: fix an oops in gfs2_permissionAl Viro2-3/+10
[ Upstream commit 0abd1557e21c617bd13fc18f7725fc6363c05913 ] In RCU mode, we might race with gfs2_evict_inode(), which zeroes ->i_gl. Freeing of the object it points to is RCU-delayed, so if we manage to fetch the pointer before it's been replaced with NULL, we are fine. Check if we'd fetched NULL and treat that as "bail out and tell the caller to get out of RCU mode". Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>