<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/ext4/sysfs.c, branch v6.1.168</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.168</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.168'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-04-11T12:16:34+00:00</updated>
<entry>
<title>ext4: fix use-after-free in update_super_work when racing with umount</title>
<updated>2026-04-11T12:16:34+00:00</updated>
<author>
<name>Jiayuan Chen</name>
<email>jiayuan.chen@shopee.com</email>
</author>
<published>2026-04-02T16:31:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c4d829737329f2290dd41e290b7d75effdb2a7ff'/>
<id>urn:sha1:c4d829737329f2290dd41e290b7d75effdb2a7ff</id>
<content type='text'>
[ Upstream commit d15e4b0a418537aafa56b2cb80d44add83e83697 ]

Commit b98535d09179 ("ext4: fix bug_on in start_this_handle during umount
filesystem") moved ext4_unregister_sysfs() before flushing s_sb_upd_work
to prevent new error work from being queued via /proc/fs/ext4/xx/mb_groups
reads during unmount. However, this introduced a use-after-free because
update_super_work calls ext4_notify_error_sysfs() -&gt; sysfs_notify() which
accesses the kobject's kernfs_node after it has been freed by kobject_del()
in ext4_unregister_sysfs():

  update_super_work                ext4_put_super
  -----------------                --------------
                                   ext4_unregister_sysfs(sb)
                                     kobject_del(&amp;sbi-&gt;s_kobj)
                                       __kobject_del()
                                         sysfs_remove_dir()
                                           kobj-&gt;sd = NULL
                                         sysfs_put(sd)
                                           kernfs_put()  // RCU free
  ext4_notify_error_sysfs(sbi)
    sysfs_notify(&amp;sbi-&gt;s_kobj)
      kn = kobj-&gt;sd              // stale pointer
      kernfs_get(kn)             // UAF on freed kernfs_node
                                   ext4_journal_destroy()
                                     flush_work(&amp;sbi-&gt;s_sb_upd_work)

Instead of reordering the teardown sequence, fix this by making
ext4_notify_error_sysfs() detect that sysfs has already been torn down
by checking s_kobj.state_in_sysfs, and skipping the sysfs_notify() call
in that case. A dedicated mutex (s_error_notify_mutex) serializes
ext4_notify_error_sysfs() against kobject_del() in ext4_unregister_sysfs()
to prevent TOCTOU races where the kobject could be deleted between the
state_in_sysfs check and the sysfs_notify() call.

Fixes: b98535d09179 ("ext4: fix bug_on in start_this_handle during umount filesystem")
Cc: Jiayuan Chen &lt;jiayuan.chen@linux.dev&gt;
Suggested-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jiayuan Chen &lt;jiayuan.chen@shopee.com&gt;
Reviewed-by: Ritesh Harjani (IBM) &lt;ritesh.list@gmail.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://patch.msgid.link/20260319120336.157873-1-jiayuan.chen@linux.dev
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
[ adapted mutex_init placement ]
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ext4: Fix function prototype mismatch for ext4_feat_ktype</title>
<updated>2023-02-25T10:25:43+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2023-01-04T21:09:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0a1394e07c5d6bf1bfc25db8589ff1b1bfb6f46a'/>
<id>urn:sha1:0a1394e07c5d6bf1bfc25db8589ff1b1bfb6f46a</id>
<content type='text'>
commit 118901ad1f25d2334255b3d50512fa20591531cd upstream.

With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed.

ext4_feat_ktype was setting the "release" handler to "kfree", which
doesn't have a matching function prototype. Add a simple wrapper
with the correct prototype.

This was found as a result of Clang's new -Wcast-function-type-strict
flag, which is more sensitive than the simpler -Wcast-function-type,
which only checks for type width mismatches.

Note that this code is only reached when ext4 is a loadable module and
it is being unloaded:

 CFI failure at kobject_put+0xbb/0x1b0 (target: kfree+0x0/0x180; expected type: 0x7c4aa698)
 ...
 RIP: 0010:kobject_put+0xbb/0x1b0
 ...
 Call Trace:
  &lt;TASK&gt;
  ext4_exit_sysfs+0x14/0x60 [ext4]
  cleanup_module+0x67/0xedb [ext4]

Fixes: b99fee58a20a ("ext4: create ext4_feat kobject dynamically")
Cc: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: Eric Biggers &lt;ebiggers@kernel.org&gt;
Cc: stable@vger.kernel.org
Build-tested-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
Reviewed-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
Reviewed-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Link: https://lore.kernel.org/r/20230103234616.never.915-kees@kernel.org
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Eric Biggers &lt;ebiggers@google.com&gt;
Link: https://lore.kernel.org/r/20230104210908.gonna.388-kees@kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>unicode: clean up the Kconfig symbol confusion</title>
<updated>2022-01-21T00:57:24+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2022-01-18T06:56:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5298d4bfe80f6ae6ae2777bcd1357b0022d98573'/>
<id>urn:sha1:5298d4bfe80f6ae6ae2777bcd1357b0022d98573</id>
<content type='text'>
Turn the CONFIG_UNICODE symbol into a tristate that generates some always
built in code and remove the confusing CONFIG_UNICODE_UTF8_DATA symbol.

Note that a lot of the IS_ENABLED() checks could be turned from cpp
statements into normal ifs, but this change is intended to be fairly
mechanic, so that should be cleaned up later.

Fixes: 2b3d04787012 ("unicode: Add utf8-data module")
Reported-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Reviewed-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Gabriel Krisman Bertazi &lt;krisman@collabora.com&gt;
</content>
</entry>
<entry>
<title>ext4: allow to change s_last_trim_minblks via sysfs</title>
<updated>2022-01-10T18:25:55+00:00</updated>
<author>
<name>Lukas Czerner</name>
<email>lczerner@redhat.com</email>
</author>
<published>2021-11-03T14:51:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4a69aecbfb30a3fc85bf8028386c047d5607a97a'/>
<id>urn:sha1:4a69aecbfb30a3fc85bf8028386c047d5607a97a</id>
<content type='text'>
Ext4 has an optimization mechanism for batched disacrd (FITRIM) that
should help speed up subsequent calls of FITRIM ioctl by skipping the
groups that were previously trimmed. However because the FITRIM allows
to set the minimum size of an extent to trim, ext4 stores the last
minimum extent size and only avoids trimming the group if it was
previously trimmed with minimum extent size equal to, or smaller than
the current call.

There is currently no way to bypass the optimization without
umount/mount cycle. This becomes a problem when the file system is
live migrated to a different storage, because the optimization will
prevent possibly useful discard calls to the storage.

Fix it by exporting the s_last_trim_minblks via sysfs interface which
will allow us to set the minimum size to the number of blocks larger
than subsequent FITRIM call, effectively bypassing the optimization.

By setting the s_last_trim_minblks to ULONG_MAX the optimization will be
effectively cleared regardless of the previous state, or file system
configuration.

For example:
getconf ULONG_MAX &gt; /sys/fs/ext4/dm-1/last_trim_minblks

Signed-off-by: Lukas Czerner &lt;lczerner@redhat.com&gt;
Reported-by: Laurent GUERBY &lt;laurent@guerby.net&gt;
Reviewed-by: Andreas Dilger &lt;adilger@dilger.ca&gt;
Link: https://lore.kernel.org/r/20211103145122.17338-2-lczerner@redhat.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: replace snprintf in show functions with sysfs_emit</title>
<updated>2022-01-10T18:25:55+00:00</updated>
<author>
<name>Qing Wang</name>
<email>wangqing@vivo.com</email>
</author>
<published>2021-10-13T03:28:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dfac1a167068d60b36cc8f2081394a28b6fc424b'/>
<id>urn:sha1:dfac1a167068d60b36cc8f2081394a28b6fc424b</id>
<content type='text'>
coccicheck complains about the use of snprintf() in sysfs show functions.

Fix the coccicheck warning:
WARNING: use scnprintf or sprintf.

Use sysfs_emit instead of scnprintf or sprintf makes more sense.

Signed-off-by: Qing Wang &lt;wangqing@vivo.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/1634095731-4528-1-git-send-email-wangqing@vivo.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: notify sysfs on errors_count value change</title>
<updated>2021-06-30T01:06:02+00:00</updated>
<author>
<name>Jonathan Davies</name>
<email>jonathan.davies@nutanix.com</email>
</author>
<published>2021-06-11T14:02:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d578b99443fde0968246cc7cbf3bc3016123c2f4'/>
<id>urn:sha1:d578b99443fde0968246cc7cbf3bc3016123c2f4</id>
<content type='text'>
After s_error_count is incremented, signal the change in the
corresponding sysfs attribute via sysfs_notify. This allows userspace to
poll() on changes to /sys/fs/ext4/*/errors_count.

[ Moved call of ext4_notify_error_sysfs() to flush_stashed_error_work()
  to avoid BUG's caused by calling sysfs_notify trying to sleep after
  being called from an invalid context. -- TYT ]

Signed-off-by: Jonathan Davies &lt;jonathan.davies@nutanix.com&gt;
Link: https://lore.kernel.org/r/20210611140209.28903-1-jonathan.davies@nutanix.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: Only advertise encrypted_casefold when encryption and unicode are enabled</title>
<updated>2021-06-06T14:10:23+00:00</updated>
<author>
<name>Daniel Rosenberg</name>
<email>drosen@google.com</email>
</author>
<published>2021-06-03T09:48:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e71f99f2dfb45f4e7203a0732e85f71ef1d04dab'/>
<id>urn:sha1:e71f99f2dfb45f4e7203a0732e85f71ef1d04dab</id>
<content type='text'>
Encrypted casefolding is only supported when both encryption and
casefolding are both enabled in the config.

Fixes: 471fbbea7ff7 ("ext4: handle casefolding with encryption")
Cc: stable@vger.kernel.org # 5.13+
Signed-off-by: Daniel Rosenberg &lt;drosen@google.com&gt;
Link: https://lore.kernel.org/r/20210603094849.314342-1-drosen@google.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: add proc files to monitor new structures</title>
<updated>2021-04-09T15:34:59+00:00</updated>
<author>
<name>Harshad Shirwadkar</name>
<email>harshadshirwadkar@gmail.com</email>
</author>
<published>2021-04-01T17:21:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f68f4063855903fd3a279e646451eab04db0655f'/>
<id>urn:sha1:f68f4063855903fd3a279e646451eab04db0655f</id>
<content type='text'>
This patch adds a new file "mb_structs_summary" which allows us to see
the summary of the new allocator structures added in this
series. Here's the sample output of file:

optimize_scan: 1
max_free_order_lists:
        list_order_0_groups: 0
        list_order_1_groups: 0
        list_order_2_groups: 0
        list_order_3_groups: 0
        list_order_4_groups: 0
        list_order_5_groups: 0
        list_order_6_groups: 0
        list_order_7_groups: 0
        list_order_8_groups: 0
        list_order_9_groups: 0
        list_order_10_groups: 0
        list_order_11_groups: 0
        list_order_12_groups: 0
        list_order_13_groups: 40
fragment_size_tree:
        tree_min: 16384
        tree_max: 32768
        tree_nodes: 40

Signed-off-by: Harshad Shirwadkar &lt;harshadshirwadkar@gmail.com&gt;
Reviewed-by: Andreas Dilger &lt;adilger@dilger.ca&gt;
Reviewed-by: Ritesh Harjani &lt;ritesh.list@gmail.com&gt;
Link: https://lore.kernel.org/r/20210401172129.189766-7-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: improve cr 0 / cr 1 group scanning</title>
<updated>2021-04-09T15:34:59+00:00</updated>
<author>
<name>Harshad Shirwadkar</name>
<email>harshadshirwadkar@gmail.com</email>
</author>
<published>2021-04-01T17:21:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=196e402adf2e4cd66f101923409f1970ec5f1af3'/>
<id>urn:sha1:196e402adf2e4cd66f101923409f1970ec5f1af3</id>
<content type='text'>
Instead of traversing through groups linearly, scan groups in specific
orders at cr 0 and cr 1. At cr 0, we want to find groups that have the
largest free order &gt;= the order of the request. So, with this patch,
we maintain lists for each possible order and insert each group into a
list based on the largest free order in its buddy bitmap. During cr 0
allocation, we traverse these lists in the increasing order of largest
free orders. This allows us to find a group with the best available cr
0 match in constant time. If nothing can be found, we fallback to cr 1
immediately.

At CR1, the story is slightly different. We want to traverse in the
order of increasing average fragment size. For CR1, we maintain a rb
tree of groupinfos which is sorted by average fragment size. Instead
of traversing linearly, at CR1, we traverse in the order of increasing
average fragment size, starting at the most optimal group. This brings
down cr 1 search complexity to log(num groups).

For cr &gt;= 2, we just perform the linear search as before. Also, in
case of lock contention, we intermittently fallback to linear search
even in CR 0 and CR 1 cases. This allows us to proceed during the
allocation path even in case of high contention.

There is an opportunity to do optimization at CR2 too. That's because
at CR2 we only consider groups where bb_free counter (number of free
blocks) is greater than the request extent size. That's left as future
work.

All the changes introduced in this patch are protected under a new
mount option "mb_optimize_scan".

With this patchset, following experiment was performed:

Created a highly fragmented disk of size 65TB. The disk had no
contiguous 2M regions. Following command was run consecutively for 3
times:

time dd if=/dev/urandom of=file bs=2M count=10

Here are the results with and without cr 0/1 optimizations introduced
in this patch:

|---------+------------------------------+---------------------------|
|         | Without CR 0/1 Optimizations | With CR 0/1 Optimizations |
|---------+------------------------------+---------------------------|
| 1st run | 5m1.871s                     | 2m47.642s                 |
| 2nd run | 2m28.390s                    | 0m0.611s                  |
| 3rd run | 2m26.530s                    | 0m1.255s                  |
|---------+------------------------------+---------------------------|

Signed-off-by: Harshad Shirwadkar &lt;harshadshirwadkar@gmail.com&gt;
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Reviewed-by: Andreas Dilger &lt;adilger@dilger.ca&gt;
Link: https://lore.kernel.org/r/20210401172129.189766-6-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: add mballoc stats proc file</title>
<updated>2021-04-09T15:34:59+00:00</updated>
<author>
<name>Harshad Shirwadkar</name>
<email>harshadshirwadkar@gmail.com</email>
</author>
<published>2021-04-01T17:21:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a6c75eaf11032f4a3d2b3ce2265a194ac6e4a7f0'/>
<id>urn:sha1:a6c75eaf11032f4a3d2b3ce2265a194ac6e4a7f0</id>
<content type='text'>
Add new stats for measuring the performance of mballoc. This patch is
forked from Artem Blagodarenko's work that can be found here:

https://github.com/lustre/lustre-release/blob/master/ldiskfs/kernel_patches/patches/rhel8/ext4-simple-blockalloc.patch

This patch reorganizes the stats by cr level. This is how the output
looks like:

mballoc:
	reqs: 0
	success: 0
	groups_scanned: 0
	cr0_stats:
		hits: 0
		groups_considered: 0
		useless_loops: 0
		bad_suggestions: 0
	cr1_stats:
		hits: 0
		groups_considered: 0
		useless_loops: 0
		bad_suggestions: 0
	cr2_stats:
		hits: 0
		groups_considered: 0
		useless_loops: 0
	cr3_stats:
		hits: 0
		groups_considered: 0
		useless_loops: 0
	extents_scanned: 0
		goal_hits: 0
		2^n_hits: 0
		breaks: 0
		lost: 0
	buddies_generated: 0/40
	buddies_time_used: 0
	preallocated: 0
	discarded: 0

Signed-off-by: Harshad Shirwadkar &lt;harshadshirwadkar@gmail.com&gt;
Reviewed-by: Andreas Dilger &lt;adilger@dilger.ca&gt;
Reviewed-by: Ritesh Harjani &lt;ritesh.list@gmail.com&gt;
Link: https://lore.kernel.org/r/20210401172129.189766-4-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
</feed>
