<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/ext4/sysfs.c, branch v6.12.80</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-04-02T11:09:48+00:00</updated>
<entry>
<title>ext4: fix use-after-free in update_super_work when racing with umount</title>
<updated>2026-04-02T11:09:48+00:00</updated>
<author>
<name>Jiayuan Chen</name>
<email>jiayuan.chen@shopee.com</email>
</author>
<published>2026-03-19T12:03:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=034053378dd81837fd6c7a43b37ee2e58d4f0b4e'/>
<id>urn:sha1:034053378dd81837fd6c7a43b37ee2e58d4f0b4e</id>
<content type='text'>
commit d15e4b0a418537aafa56b2cb80d44add83e83697 upstream.

Commit b98535d09179 ("ext4: fix bug_on in start_this_handle during umount
filesystem") moved ext4_unregister_sysfs() before flushing s_sb_upd_work
to prevent new error work from being queued via /proc/fs/ext4/xx/mb_groups
reads during unmount. However, this introduced a use-after-free because
update_super_work calls ext4_notify_error_sysfs() -&gt; sysfs_notify() which
accesses the kobject's kernfs_node after it has been freed by kobject_del()
in ext4_unregister_sysfs():

  update_super_work                ext4_put_super
  -----------------                --------------
                                   ext4_unregister_sysfs(sb)
                                     kobject_del(&amp;sbi-&gt;s_kobj)
                                       __kobject_del()
                                         sysfs_remove_dir()
                                           kobj-&gt;sd = NULL
                                         sysfs_put(sd)
                                           kernfs_put()  // RCU free
  ext4_notify_error_sysfs(sbi)
    sysfs_notify(&amp;sbi-&gt;s_kobj)
      kn = kobj-&gt;sd              // stale pointer
      kernfs_get(kn)             // UAF on freed kernfs_node
                                   ext4_journal_destroy()
                                     flush_work(&amp;sbi-&gt;s_sb_upd_work)

Instead of reordering the teardown sequence, fix this by making
ext4_notify_error_sysfs() detect that sysfs has already been torn down
by checking s_kobj.state_in_sysfs, and skipping the sysfs_notify() call
in that case. A dedicated mutex (s_error_notify_mutex) serializes
ext4_notify_error_sysfs() against kobject_del() in ext4_unregister_sysfs()
to prevent TOCTOU races where the kobject could be deleted between the
state_in_sysfs check and the sysfs_notify() call.

Fixes: b98535d09179 ("ext4: fix bug_on in start_this_handle during umount filesystem")
Cc: Jiayuan Chen &lt;jiayuan.chen@linux.dev&gt;
Suggested-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jiayuan Chen &lt;jiayuan.chen@shopee.com&gt;
Reviewed-by: Ritesh Harjani (IBM) &lt;ritesh.list@gmail.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://patch.msgid.link/20260319120336.157873-1-jiayuan.chen@linux.dev
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ext4: add positive int attr pointer to avoid sysfs variables overflow</title>
<updated>2024-05-03T03:48:30+00:00</updated>
<author>
<name>Baokun Li</name>
<email>libaokun1@huawei.com</email>
</author>
<published>2024-03-19T11:33:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=63bfe841053f8dda09c9d059d543486d9dc16104'/>
<id>urn:sha1:63bfe841053f8dda09c9d059d543486d9dc16104</id>
<content type='text'>
The following variables controlled by the sysfs interface are of type
int and are normally used in the range [0, INT_MAX], but are declared as
attr_pointer_ui, and thus may be set to values that exceed INT_MAX and
result in overflows to get negative values.

  err_ratelimit_burst
  msg_ratelimit_burst
  warning_ratelimit_burst
  err_ratelimit_interval_ms
  msg_ratelimit_interval_ms
  warning_ratelimit_interval_ms

Therefore, we add attr_pointer_pi (aka positive int attr pointer) with a
value range of 0-INT_MAX to avoid overflow.

Signed-off-by: Baokun Li &lt;libaokun1@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20240319113325.3110393-7-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: add new attr pointer attr_mb_order</title>
<updated>2024-05-03T03:48:30+00:00</updated>
<author>
<name>Baokun Li</name>
<email>libaokun1@huawei.com</email>
</author>
<published>2024-03-19T11:33:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b7b2a5799b8fafe95fcd5455c32ba2c643c86f99'/>
<id>urn:sha1:b7b2a5799b8fafe95fcd5455c32ba2c643c86f99</id>
<content type='text'>
The s_mb_best_avail_max_trim_order is of type unsigned int, and has a
range of values well beyond the normal use of the mb_order. Although the
mballoc code is careful enough that large numbers don't matter there, but
this can mislead the sysadmin into thinking that it's normal to set such
values. Hence add a new attr_id attr_mb_order with values in the range
[0, 64] to avoid storing garbage values and make us more resilient to
surprises in the future.

Suggested-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Baokun Li &lt;libaokun1@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20240319113325.3110393-6-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: fix slab-out-of-bounds in ext4_mb_find_good_group_avg_frag_lists()</title>
<updated>2024-05-03T03:48:30+00:00</updated>
<author>
<name>Baokun Li</name>
<email>libaokun1@huawei.com</email>
</author>
<published>2024-03-19T11:33:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=13df4d44a3aaabe61cd01d277b6ee23ead2a5206'/>
<id>urn:sha1:13df4d44a3aaabe61cd01d277b6ee23ead2a5206</id>
<content type='text'>
We can trigger a slab-out-of-bounds with the following commands:

    mkfs.ext4 -F /dev/$disk 10G
    mount /dev/$disk /tmp/test
    echo 2147483647 &gt; /sys/fs/ext4/$disk/mb_group_prealloc
    echo test &gt; /tmp/test/file &amp;&amp; sync

==================================================================
BUG: KASAN: slab-out-of-bounds in ext4_mb_find_good_group_avg_frag_lists+0x8a/0x200 [ext4]
Read of size 8 at addr ffff888121b9d0f0 by task kworker/u2:0/11
CPU: 0 PID: 11 Comm: kworker/u2:0 Tainted: GL 6.7.0-next-20240118 #521
Call Trace:
 dump_stack_lvl+0x2c/0x50
 kasan_report+0xb6/0xf0
 ext4_mb_find_good_group_avg_frag_lists+0x8a/0x200 [ext4]
 ext4_mb_regular_allocator+0x19e9/0x2370 [ext4]
 ext4_mb_new_blocks+0x88a/0x1370 [ext4]
 ext4_ext_map_blocks+0x14f7/0x2390 [ext4]
 ext4_map_blocks+0x569/0xea0 [ext4]
 ext4_do_writepages+0x10f6/0x1bc0 [ext4]
[...]
==================================================================

The flow of issue triggering is as follows:

// Set s_mb_group_prealloc to 2147483647 via sysfs
ext4_mb_new_blocks
  ext4_mb_normalize_request
    ext4_mb_normalize_group_request
      ac-&gt;ac_g_ex.fe_len = EXT4_SB(sb)-&gt;s_mb_group_prealloc
  ext4_mb_regular_allocator
    ext4_mb_choose_next_group
      ext4_mb_choose_next_group_best_avail
        mb_avg_fragment_size_order
          order = fls(len) - 2 = 29
        ext4_mb_find_good_group_avg_frag_lists
          frag_list = &amp;sbi-&gt;s_mb_avg_fragment_size[order]
          if (list_empty(frag_list)) // Trigger SOOB!

At 4k block size, the length of the s_mb_avg_fragment_size list is 14,
but an oversized s_mb_group_prealloc is set, causing slab-out-of-bounds
to be triggered by an attempt to access an element at index 29.

Add a new attr_id attr_clusters_in_group with values in the range
[0, sbi-&gt;s_clusters_per_group] and declare mb_group_prealloc as
that type to fix the issue. In addition avoid returning an order
from mb_avg_fragment_size_order() greater than MB_NUM_ORDERS(sb)
and reduce some useless loops.

Fixes: 7e170922f06b ("ext4: Add allocation criteria 1.5 (CR1_5)")
CC: stable@vger.kernel.org
Signed-off-by: Baokun Li &lt;libaokun1@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Reviewed-by: Ojaswin Mujoo &lt;ojaswin@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/20240319113325.3110393-5-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: refactor out ext4_generic_attr_show()</title>
<updated>2024-05-03T03:48:30+00:00</updated>
<author>
<name>Baokun Li</name>
<email>libaokun1@huawei.com</email>
</author>
<published>2024-03-19T11:33:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=57341fe3179c7694c92dcf99e7f836cee4c800dd'/>
<id>urn:sha1:57341fe3179c7694c92dcf99e7f836cee4c800dd</id>
<content type='text'>
Refactor out the function ext4_generic_attr_show() to handle the reading
of values of various common types, with no functional changes.

Signed-off-by: Baokun Li &lt;libaokun1@huawei.com&gt;
Reviewed-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20240319113325.3110393-4-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: refactor out ext4_generic_attr_store()</title>
<updated>2024-05-03T03:48:30+00:00</updated>
<author>
<name>Baokun Li</name>
<email>libaokun1@huawei.com</email>
</author>
<published>2024-03-19T11:33:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f536808adcc37a546bf9cc819c349bd55a28159b'/>
<id>urn:sha1:f536808adcc37a546bf9cc819c349bd55a28159b</id>
<content type='text'>
Refactor out the function ext4_generic_attr_store() to handle the setting
of values of various common types, with no functional changes.

Signed-off-by: Baokun Li &lt;libaokun1@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20240319113325.3110393-3-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: avoid overflow when setting values via sysfs</title>
<updated>2024-05-03T03:48:30+00:00</updated>
<author>
<name>Baokun Li</name>
<email>libaokun1@huawei.com</email>
</author>
<published>2024-03-19T11:33:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9e8e819f8f272c4e5dcd0bd6c7450e36481ed139'/>
<id>urn:sha1:9e8e819f8f272c4e5dcd0bd6c7450e36481ed139</id>
<content type='text'>
When setting values of type unsigned int through sysfs, we use kstrtoul()
to parse it and then truncate part of it as the final set value, when the
set value is greater than UINT_MAX, the set value will not match what we
see because of the truncation. As follows:

  $ echo 4294967296 &gt; /sys/fs/ext4/sda/mb_max_linear_groups
  $ cat /sys/fs/ext4/sda/mb_max_linear_groups
    0

So we use kstrtouint() to parse the attr_pointer_ui type to avoid the
inconsistency described above. In addition, a judgment is added to avoid
setting s_resv_clusters less than 0.

Signed-off-by: Baokun Li &lt;libaokun1@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20240319113325.3110393-2-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: Give symbolic names to mballoc criterias</title>
<updated>2023-06-26T23:34:56+00:00</updated>
<author>
<name>Ojaswin Mujoo</name>
<email>ojaswin@linux.ibm.com</email>
</author>
<published>2023-05-30T12:33:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f52f3d2b9fbab73c776f4d3386393e9bbc83b87d'/>
<id>urn:sha1:f52f3d2b9fbab73c776f4d3386393e9bbc83b87d</id>
<content type='text'>
mballoc criterias have historically been called by numbers
like CR0, CR1... however this makes it confusing to understand
what each criteria is about.

Change these criterias from numbers to symbolic names and add
relevant comments. While we are at it, also reformat and add some
comments to ext4_seq_mb_stats_show() for better readability.

Additionally, define CR_FAST which signifies the criteria
below which we can make quicker decisions like:
  * quitting early if (free block &lt; requested len)
  * avoiding to scan free extents smaller than required len.
  * avoiding to initialize buddy cache and work with existing cache
  * limiting prefetches

Suggested-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Ojaswin Mujoo &lt;ojaswin@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/a2dc6ec5aea5e5e68cf8e788c2a964ffead9c8b0.1685449706.git.ojaswin@linux.ibm.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: Add allocation criteria 1.5 (CR1_5)</title>
<updated>2023-06-26T23:34:56+00:00</updated>
<author>
<name>Ojaswin Mujoo</name>
<email>ojaswin@linux.ibm.com</email>
</author>
<published>2023-05-30T12:33:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7e170922f06bf46effa7c57f6035fc463d6edc7e'/>
<id>urn:sha1:7e170922f06bf46effa7c57f6035fc463d6edc7e</id>
<content type='text'>
CR1_5 aims to optimize allocations which can't be satisfied in CR1. The
fact that we couldn't find a group in CR1 suggests that it would be
difficult to find a continuous extent to compleltely satisfy our
allocations. So before falling to the slower CR2, in CR1.5 we
proactively trim the the preallocations so we can find a group with
(free / fragments) big enough.  This speeds up our allocation at the
cost of slightly reduced preallocation.

The patch also adds a new sysfs tunable:

* /sys/fs/ext4/&lt;partition&gt;/mb_cr1_5_max_trim_order

This controls how much CR1.5 can trim a request before falling to CR2.
For example, for a request of order 7 and max trim order 2, CR1.5 can
trim this upto order 5.

Suggested-by: Ritesh Harjani (IBM) &lt;ritesh.list@gmail.com&gt;
Signed-off-by: Ojaswin Mujoo &lt;ojaswin@linux.ibm.com&gt;
Reviewed-by: Ritesh Harjani (IBM) &lt;ritesh.list@gmail.com&gt;
Link: https://lore.kernel.org/r/150fdf65c8e4cc4dba71e020ce0859bcf636a5ff.1685449706.git.ojaswin@linux.ibm.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>ext4: Remove the logic to trim inode PAs</title>
<updated>2023-04-06T05:13:13+00:00</updated>
<author>
<name>Ojaswin Mujoo</name>
<email>ojaswin@linux.ibm.com</email>
</author>
<published>2023-03-25T08:13:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=361eb69fc99f1a8f1d653d69ecd742f3cbb896be'/>
<id>urn:sha1:361eb69fc99f1a8f1d653d69ecd742f3cbb896be</id>
<content type='text'>
Earlier, inode PAs were stored in a linked list. This caused a need to
periodically trim the list down inorder to avoid growing it to a very
large size, as this would severly affect performance during list
iteration.

Recent patches changed this list to an rbtree, and since the tree scales
up much better, we no longer need to have the trim functionality, hence
remove it.

Signed-off-by: Ojaswin Mujoo &lt;ojaswin@linux.ibm.com&gt;
Reviewed-by: Ritesh Harjani (IBM) &lt;ritesh.list@gmail.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/c409addceaa3ade4b40328e28e3b54b2f259689e.1679731817.git.ojaswin@linux.ibm.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
</feed>
