<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/mm/slub.c, branch v7.0.10</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0.10</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0.10'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-05-07T04:14:17+00:00</updated>
<entry>
<title>mm/slab: return NULL early from kmalloc_nolock() in NMI on UP</title>
<updated>2026-05-07T04:14:17+00:00</updated>
<author>
<name>Harry Yoo (Oracle)</name>
<email>harry@kernel.org</email>
</author>
<published>2026-04-27T07:09:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d66553204a15bdb257d9ef8aca1e12f5fbb910b2'/>
<id>urn:sha1:d66553204a15bdb257d9ef8aca1e12f5fbb910b2</id>
<content type='text'>
commit 5b31044e649e3e54c2caef135c09b371c2fbcd08 upstream.

On UP kernels (!CONFIG_SMP), spin_trylock() is a no-op that
unconditionally succeeds even when the lock is already held. As a
result, kmalloc_nolock() called from NMI context can re-enter the slab
allocator and acquire n-&gt;list_lock that the interrupted context is
already holding, corrupting slab state.

With CONFIG_DEBUG_SPINLOCK on UP, the following BUG is triggered with
the slub_kunit test module:

  BUG: spinlock trylock failure on UP on CPU#0, kunit_try_catch/243
  [...]
  Call Trace:
   &lt;NMI&gt;
   dump_stack_lvl+0x3f/0x60
   do_raw_spin_trylock+0x41/0x50
   _raw_spin_trylock+0x24/0x50
   get_from_partial_node+0x120/0x4d0
   ___slab_alloc+0x8a/0x4c0
   kmalloc_nolock_noprof+0x164/0x310
   [...]
   &lt;/NMI&gt;

Fix this by returning NULL early when invoked from NMI on a UP kernel.
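
A minimal userspace sketch of the early-return guard described above. The
names alloc_nolock_model, in_nmi and smp are hypothetical stand-ins for
kmalloc_nolock(), the NMI-context check and an SMP build check; this only
models the control flow and is not the actual patch.

```c
#include "stdlib.h"

/*
 * Hypothetical model: in_nmi and smp are plain flags here, not the
 * kernel's real context or configuration checks.
 */
static void *alloc_nolock_model(int in_nmi, int smp, size_t size)
{
	/* On a UP build a trylock always "succeeds", so NMI must bail out. */
	if (in_nmi) {
		if (!smp)
			return NULL;
	}
	/* Otherwise continue to the normal allocation path. */
	return malloc(size);
}
```

Callers of such an API already have to handle a NULL return, so refusing
the request in this one context should not add a new failure mode.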

Link: https://lore.kernel.org/linux-mm/ad_cqe51pvr1WaDg@hyeyoo
Cc: stable@vger.kernel.org
Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
Signed-off-by: Harry Yoo (Oracle) &lt;harry@kernel.org&gt;
Link: https://patch.msgid.link/20260427-nolock-api-fix-v2-2-a6b83a92d9a4@kernel.org
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>slub: fix data loss and overflow in krealloc()</title>
<updated>2026-05-07T04:13:58+00:00</updated>
<author>
<name>Marco Elver</name>
<email>elver@google.com</email>
</author>
<published>2026-04-16T13:25:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=550fa6b5aabb096554536ac1e3ec96b76cbb35fd'/>
<id>urn:sha1:550fa6b5aabb096554536ac1e3ec96b76cbb35fd</id>
<content type='text'>
commit 082a6d03a2d685a83a332666b500ad3966349588 upstream.

Commit 2cd8231796b5 ("mm/slub: allow to set node and align in
k[v]realloc") introduced the ability to force a reallocation if the
original object does not satisfy new alignment or NUMA node, even when
the object is being shrunk.

This introduced two bugs in the reallocation fallback path:

1. Data loss during NUMA migration: The jump to 'alloc_new' happens
   before 'ks' and 'orig_size' are initialized. As a result, the
   memcpy() in the 'alloc_new' block would copy 0 bytes into the new
   allocation.

2. Buffer overflow during shrinking: When shrinking an object while
   forcing a new alignment, 'new_size' is smaller than the old size.
   However, the memcpy() used the old size ('orig_size ?: ks'), leading
   to an out-of-bounds write.

The same overflow bug exists in the kvrealloc() fallback path, where the
old bucket size ksize(p) is copied into the new buffer without being
bounded by the new size.

A simple reproducer:

	// e.g. add to lkdtm as KREALLOC_SHRINK_OVERFLOW
	while (1) {
		void *p = kmalloc(128, GFP_KERNEL);
		p = krealloc_node_align(p, 64, 256, GFP_KERNEL, NUMA_NO_NODE);
		kfree(p);
	}

demonstrates the issue:

  ==================================================================
  BUG: KFENCE: out-of-bounds write in memcpy_orig+0x68/0x130

  Out-of-bounds write at 0xffff8883ad757038 (120B right of kfence-#47):
   memcpy_orig+0x68/0x130
   krealloc_node_align_noprof+0x1c8/0x340
   lkdtm_KREALLOC_SHRINK_OVERFLOW+0x8c/0xc0 [lkdtm]
   lkdtm_do_action+0x3a/0x60 [lkdtm]
   ...

  kfence-#47: 0xffff8883ad756fc0-0xffff8883ad756fff, size=64, cache=kmalloc-64

  allocated by task 316 on cpu 7 at 97.680481s (0.021813s ago):
   krealloc_node_align_noprof+0x19c/0x340
   lkdtm_KREALLOC_SHRINK_OVERFLOW+0x8c/0xc0 [lkdtm]
   lkdtm_do_action+0x3a/0x60 [lkdtm]
   ...
  ==================================================================

Fix it by moving the old size calculation to the top of __do_krealloc()
and bounding all copy lengths by the new allocation size.
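
The bounding rule can be sketched in userspace as follows. The helper names
bounded_copy_len and realloc_forced_model are hypothetical, and malloc(),
memcpy() and free() stand in for the kernel allocator; this is an
illustration of the idea under those assumptions, not the code of
__do_krealloc().

```c
#include "stdlib.h"
#include "string.h"

/* Hypothetical helper: never copy more than the new allocation can hold. */
static size_t bounded_copy_len(size_t old_size, size_t new_size)
{
	return old_size > new_size ? new_size : old_size;
}

/*
 * Userspace model of a forced reallocation: the old size is known before
 * the new buffer is allocated, and the copy length is bounded by new_size.
 */
static void *realloc_forced_model(void *old, size_t old_size, size_t new_size)
{
	void *new_buf = malloc(new_size);

	if (!new_buf)
		return NULL;
	memcpy(new_buf, old, bounded_copy_len(old_size, new_size));
	free(old);
	return new_buf;
}
```

With the reproducer's sizes (128 bytes shrunk to 64), the bounded length is
64, so the copy stays within the new buffer.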

Fixes: 2cd8231796b5 ("mm/slub: allow to set node and align in k[v]realloc")
Cc: stable@vger.kernel.org
Reported-by: https://sashiko.dev/#/patchset/20260415143735.2974230-1-elver%40google.com
Signed-off-by: Marco Elver &lt;elver@google.com&gt;
Link: https://patch.msgid.link/20260416132837.3787694-1-elver@google.com
Reviewed-by: Harry Yoo (Oracle) &lt;harry@kernel.org&gt;
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>slab: fix memory leak when refill_sheaf() fails</title>
<updated>2026-03-11T16:55:26+00:00</updated>
<author>
<name>Qing Wang</name>
<email>wangqing7171@gmail.com</email>
</author>
<published>2026-03-11T09:36:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=464b1c115852fe025635ae2065e00caced184d92'/>
<id>urn:sha1:464b1c115852fe025635ae2065e00caced184d92</id>
<content type='text'>
When refill_sheaf() only partially fills a sheaf (e.g., it fills 5 objects
but needs to fill 10), it updates sheaf-&gt;size and returns -ENOMEM.
However, the callers (alloc_full_sheaf() and __pcs_replace_empty_main())
directly call free_empty_sheaf() on failure, which only does kfree(sheaf),
so the objects already stored in sheaf-&gt;objects[] are leaked.

Fix this by calling sheaf_flush_unused() before free_empty_sheaf() to
free the objects in sheaf-&gt;objects[]. Also add a WARN_ON() in
free_empty_sheaf() to catch any future cases where a non-empty sheaf is
being freed.
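
The cleanup order can be sketched in userspace like this. All names ending
in _model are hypothetical, SHEAF_CAPACITY is an assumed value, and free()
stands in for returning objects to the cache; the return value of
free_empty_sheaf_model() only mimics what the WARN_ON() would detect.

```c
#include "stdlib.h"

#define SHEAF_CAPACITY 10	/* assumed capacity, for illustration only */

struct sheaf_model {
	unsigned int size;		/* number of valid entries in objects[] */
	void *objects[SHEAF_CAPACITY];
};

/* Release every object still held by the sheaf. */
static void sheaf_flush_unused_model(struct sheaf_model *sheaf)
{
	while (sheaf->size > 0) {
		sheaf->size--;
		free(sheaf->objects[sheaf->size]);
	}
}

/* Free the sheaf itself; returns 1 if it was empty, 0 if objects leaked. */
static int free_empty_sheaf_model(struct sheaf_model *sheaf)
{
	int was_empty = (sheaf->size == 0);

	free(sheaf);
	return was_empty;
}

/* Failure path after a partial refill: flush first, then free. */
static int discard_failed_sheaf_model(struct sheaf_model *sheaf)
{
	sheaf_flush_unused_model(sheaf);
	return free_empty_sheaf_model(sheaf);
}
```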

Fixes: ed30c4adfc2b ("slab: add optimized sheaf refill from partial list")
Signed-off-by: Qing Wang &lt;wangqing7171@gmail.com&gt;
Link: https://patch.msgid.link/20260311093617.4155965-1-wangqing7171@gmail.com
Reviewed-by: Harry Yoo &lt;harry.yoo@oracle.com&gt;
Reviewed-by: Hao Li &lt;hao.li@linux.dev&gt;
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm/slab: fix an incorrect check in obj_exts_alloc_size()</title>
<updated>2026-03-10T10:02:54+00:00</updated>
<author>
<name>Harry Yoo</name>
<email>harry.yoo@oracle.com</email>
</author>
<published>2026-03-09T07:22:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8dafa9f5900c4855a65dbfee51e3bd00636deee1'/>
<id>urn:sha1:8dafa9f5900c4855a65dbfee51e3bd00636deee1</id>
<content type='text'>
obj_exts_alloc_size() prevents recursive allocation of slabobj_ext
array from the same cache, to avoid creating slabs that are never freed.

However, it mistakenly returns the original size when memory
allocation profiling is disabled. The assumption was that
memcg-triggered slabobj_ext allocation is always served from
KMALLOC_CGROUP type. But this is wrong [1]: when the caller specifies
both __GFP_RECLAIMABLE and __GFP_ACCOUNT with SLUB_TINY enabled, the
allocation is served from normal kmalloc. This is because kmalloc_type()
prioritizes __GFP_RECLAIMABLE over __GFP_ACCOUNT, and SLUB_TINY aliases
KMALLOC_RECLAIM with KMALLOC_NORMAL.

As a result, the recursion guard is bypassed and the problematic slabs
can be created. Fix this by removing the mem_alloc_profiling_enabled()
check entirely. The remaining is_kmalloc_normal() check is still
sufficient to detect whether the cache is of KMALLOC_NORMAL type and
avoid bumping the size if it's not.

Without SLUB_TINY, no functional change intended.
With SLUB_TINY, allocations with __GFP_ACCOUNT|__GFP_RECLAIMABLE
now allocate a larger array if the sizes are equal.

Reported-by: Zw Tang &lt;shicenci@gmail.com&gt;
Fixes: 280ea9c3154b ("mm/slab: avoid allocating slabobj_ext array from its own slab")
Closes: https://lore.kernel.org/linux-mm/CAPHJ_VKuMKSke8b11AZQw1PTSFN4n2C0gFxC6xGOG0ZLHgPmnA@mail.gmail.com [1]
Cc: stable@vger.kernel.org
Signed-off-by: Harry Yoo &lt;harry.yoo@oracle.com&gt;
Link: https://patch.msgid.link/20260309072219.22653-1-harry.yoo@oracle.com
Tested-by: Zw Tang &lt;shicenci@gmail.com&gt;
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm/slab: allow sheaf refill if blocking is not allowed</title>
<updated>2026-03-04T10:03:54+00:00</updated>
<author>
<name>Vlastimil Babka (SUSE)</name>
<email>vbabka@kernel.org</email>
</author>
<published>2026-03-02T09:55:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fb1091febd668398aa84c161b8d9a1834321e021'/>
<id>urn:sha1:fb1091febd668398aa84c161b8d9a1834321e021</id>
<content type='text'>
Ming Lei reported [1] a regression in the ublk null target benchmark due
to sheaves. The profile shows that the alloc_from_pcs() fastpath fails
and allocations fall back to ___slab_alloc(). It also shows the
allocations happen through mempool_alloc().

The strategy of mempool_alloc() is to call the underlying allocator
(here slab) without __GFP_DIRECT_RECLAIM first. This does not play well
with __pcs_replace_empty_main() checking for gfpflags_allow_blocking()
to decide if it should refill an empty sheaf or fallback to the
slowpath, so we end up falling back.

We could change the mempool strategy but there might be other paths
doing the same thing. So instead allow sheaf refill when blocking is not
allowed, changing the condition to gfpflags_allow_spinning(). The
original condition was unnecessarily restrictive.

Note this doesn't fully resolve the regression [1], as another component
of it is memoryless nodes, which will be addressed separately.

Reported-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Fixes: e47c897a2949 ("slab: add sheaves to most caches")
Link: https://lore.kernel.org/all/aZ0SbIqaIkwoW2mB@fedora/ [1]
Link: https://patch.msgid.link/20260302095536.34062-2-vbabka@kernel.org
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
</content>
</entry>
<entry>
<title>slab: distinguish lock and trylock for sheaf_flush_main()</title>
<updated>2026-03-02T09:04:22+00:00</updated>
<author>
<name>Vlastimil Babka</name>
<email>vbabka@suse.cz</email>
</author>
<published>2026-02-11T09:42:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=48647d3f9a644d1e81af6558102d43cdb260597b'/>
<id>urn:sha1:48647d3f9a644d1e81af6558102d43cdb260597b</id>
<content type='text'>
sheaf_flush_main() can be called from __pcs_replace_full_main(), where
it's fine if the trylock fails, and from pcs_flush_all(), where it's not
expected to fail. For some flush callers (when destroying the cache or
during memory hotremove) it would actually be a problem if it failed and
left the main sheaf unflushed. The flush callers can, however, safely use
local_lock() instead of trylock.

The trylock failure should not happen in practice on !PREEMPT_RT, but
can happen on PREEMPT_RT. The impact is limited in practice because when
a trylock fails in the kmem_cache_destroy() path, it means someone is
using the cache while destroying it, which is a bug on its own. The memory
hotremove path is unlikely to be employed in a production RT config, but
it's possible.

To fix this, split the function into sheaf_flush_main() (using
local_lock()) and sheaf_try_flush_main() (using local_trylock()) where
both call __sheaf_flush_main_batch() to flush a single batch of objects.
This will also allow lockdep to verify our context assumptions.

The problem was raised in an off-list question by Marcelo.

Fixes: 2d517aa09bbc ("slab: add opt-in caching layer of percpu sheaves")
Cc: stable@vger.kernel.org
Reported-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reviewed-by: Harry Yoo &lt;harry.yoo@oracle.com&gt;
Reviewed-by: Hao Li &lt;hao.li@linux.dev&gt;
Link: https://patch.msgid.link/20260211-b4-sheaf-flush-v1-1-4e7f492f0055@suse.cz
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm/slab: initialize slab-&gt;stride early to avoid memory ordering issues</title>
<updated>2026-02-27T15:22:57+00:00</updated>
<author>
<name>Harry Yoo</name>
<email>harry.yoo@oracle.com</email>
</author>
<published>2026-02-23T07:58:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e9217ca77dc35b4978db0fe901685ddb3f1e223a'/>
<id>urn:sha1:e9217ca77dc35b4978db0fe901685ddb3f1e223a</id>
<content type='text'>
When alloc_slab_obj_exts() is called later (instead of during slab
allocation and initialization), slab-&gt;stride and slab-&gt;obj_exts are
updated after the slab is already accessible by multiple CPUs.

The current implementation does not enforce memory ordering between
slab-&gt;stride and slab-&gt;obj_exts. For correctness, slab-&gt;stride must be
visible before slab-&gt;obj_exts. Otherwise, concurrent readers may observe
slab-&gt;obj_exts as non-zero while stride is still stale.

With stale slab-&gt;stride, slab_obj_ext() could return the wrong obj_ext.
This could cause two problems:

  - obj_cgroup_put() is called on the wrong objcg, leading to
    a use-after-free due to incorrect reference counting [1] by
    decrementing the reference count more than it was incremented.

  - refill_obj_stock() is called on the wrong objcg, leading to
    a page_counter overflow [2] by uncharging more memory than charged.

Fix this by unconditionally initializing slab-&gt;stride in
alloc_slab_obj_exts_early(), before the need_slab_obj_exts() check.
In the case of SLAB_OBJ_EXT_IN_OBJ, it is overridden in the function.

This ensures updates to slab-&gt;stride become visible before the slab
can be accessed by other CPUs via the per-node partial slab list
(protected by spinlock with acquire/release semantics).

Thanks to Shakeel Butt for pointing out this issue [3].

[vbabka@kernel.org: the bug reports [1] and [2] are not yet fully fixed,
 with investigation ongoing, but it is nevertheless a step in the right
 direction to only set stride once after allocating the slab and not
 change it later ]

Fixes: 7a8e71bc619d ("mm/slab: use stride to access slabobj_ext")
Reported-by: Venkat Rao Bagalkote &lt;venkat88@linux.ibm.com&gt;
Link: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1]
Link: https://lore.kernel.org/all/ddff7c7d-c0c3-4780-808f-9a83268bbf0c@linux.ibm.com [2]
Link: https://lore.kernel.org/linux-mm/aZu9G9mVIVzSm6Ft@hyeyoo [3]
Signed-off-by: Harry Yoo &lt;harry.yoo@oracle.com&gt;
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm/slab: mark alloc tags empty for sheaves allocated with __GFP_NO_OBJ_EXT</title>
<updated>2026-02-26T16:30:32+00:00</updated>
<author>
<name>Suren Baghdasaryan</name>
<email>surenb@google.com</email>
</author>
<published>2026-02-25T16:34:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f3ec502b6755a3bfb12c1c47025ef989ff9efc72'/>
<id>urn:sha1:f3ec502b6755a3bfb12c1c47025ef989ff9efc72</id>
<content type='text'>
alloc_empty_sheaf() allocates sheaves from SLAB_KMALLOC caches using
__GFP_NO_OBJ_EXT to avoid recursion, however it does not mark their
allocation tags empty before freeing, which results in a warning when
CONFIG_MEM_ALLOC_PROFILING_DEBUG is set. Fix this by marking allocation
tags for such sheaves as empty.

The problem was technically introduced in commit 4c0a17e28340 but only
becomes possible to hit with commit 913ffd3a1bf5.

Fixes: 4c0a17e28340 ("slab: prevent recursive kmalloc() in alloc_empty_sheaf()")
Fixes: 913ffd3a1bf5 ("slab: handle kmalloc sheaves bootstrap")
Reported-by: David Wang &lt;00107082@163.com&gt;
Closes: https://lore.kernel.org/all/20260223155128.3849-1-00107082@163.com/
Analyzed-by: Harry Yoo &lt;harry.yoo@oracle.com&gt;
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Reviewed-by: Harry Yoo &lt;harry.yoo@oracle.com&gt;
Tested-by: Harry Yoo &lt;harry.yoo@oracle.com&gt;
Tested-by: David Wang &lt;00107082@163.com&gt;
Link: https://patch.msgid.link/20260225163407.2218712-1-surenb@google.com
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm/slab: pass __GFP_NOWARN to refill_sheaf() if fallback is available</title>
<updated>2026-02-26T16:30:06+00:00</updated>
<author>
<name>Harry Yoo</name>
<email>harry.yoo@oracle.com</email>
</author>
<published>2026-02-23T13:33:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=021ca6b670bebebc409d43845efcfe8c11c1dd54'/>
<id>urn:sha1:021ca6b670bebebc409d43845efcfe8c11c1dd54</id>
<content type='text'>
When refill_sheaf() is called, failing to refill the sheaf doesn't
necessarily mean the allocation will fail because a fallback path
might be available and serve the allocation request.

Suppress spurious warnings by passing __GFP_NOWARN along with
__GFP_NOMEMALLOC whenever a fallback path is available.

When the caller is alloc_full_sheaf() or __pcs_replace_empty_main(),
the kernel always falls back to the slowpath (__slab_alloc_node()).
For __prefill_sheaf_pfmemalloc(), the fallback path is available
only when gfp_pfmemalloc_allowed() returns true.

Reported-and-tested-by: Chris Bainbridge &lt;chris.bainbridge@gmail.com&gt;
Closes: https://lore.kernel.org/linux-mm/aZt2-oS9lkmwT7Ch@debian.local
Fixes: 1ce20c28eafd ("slab: handle pfmemalloc slabs properly with sheaves")
Link: https://lore.kernel.org/linux-mm/aZwSreGj9-HHdD-j@hyeyoo
Signed-off-by: Harry Yoo &lt;harry.yoo@oracle.com&gt;
Link: https://patch.msgid.link/20260223133322.16705-1-harry.yoo@oracle.com
Tested-by: Mikhail Gavrilov &lt;mikhail.v.gavrilov@gmail.com&gt;
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
</content>
</entry>
<entry>
<title>Convert 'alloc_obj' family to use the new default GFP_KERNEL argument</title>
<updated>2026-02-22T01:09:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T00:37:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43'/>
<id>urn:sha1:bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43</id>
<content type='text'>
This was done entirely with mindless brute force, using

    git grep -l '\&lt;k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
