<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/mm/slub.c, branch linux-7.1.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-7.1.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-7.1.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-05-22T13:23:56+00:00</updated>
<entry>
<title>Merge tag 'slab-for-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab</title>
<updated>2026-05-22T13:23:56+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-05-22T13:23:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=46de4087d38ece8477ab4b6b7630c13c82c27f1a'/>
<id>urn:sha1:46de4087d38ece8477ab4b6b7630c13c82c27f1a</id>
<content type='text'>
Pull slab fix from Vlastimil Babka:

 - Stable fix for a missing cpus_read_lock in one of the cpu sheaves
   flushing paths (Qing Wang)

* tag 'slab-for-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/slub: hold cpus_read_lock around flush_rcu_sheaves_on_cache()
</content>
</entry>
<entry>
<title>mm/slub: hold cpus_read_lock around flush_rcu_sheaves_on_cache()</title>
<updated>2026-05-14T12:56:58+00:00</updated>
<author>
<name>Qing Wang</name>
<email>wangqing7171@gmail.com</email>
</author>
<published>2026-05-12T03:50:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=67ea9d353d0ba12bdbc9183ff568dead9e949b80'/>
<id>urn:sha1:67ea9d353d0ba12bdbc9183ff568dead9e949b80</id>
<content type='text'>
flush_rcu_sheaves_on_cache() calls queue_work_on() in a
for_each_online_cpu() loop, which requires the cpu to stay online.
But cpus_read_lock() is not held in kvfree_rcu_barrier_on_cache() and the
set of "online cpus" is subject to change.

There are two paths that call flush_rcu_sheaves_on_cache():

  // has cpus_read_lock()
  flush_all_rcu_sheaves()
    -&gt; flush_rcu_sheaves_on_cache()

  // no cpus_read_lock()
  kvfree_rcu_barrier_on_cache()
    -&gt; flush_rcu_sheaves_on_cache()

Fix this by holding cpus_read_lock() in kvfree_rcu_barrier_on_cache().

Why not move cpus_read_lock() from flush_all_rcu_sheaves() into
flush_rcu_sheaves_on_cache()? The reason is it would introduce a new lock
order (slab_mutex -&gt; cpu_hotplug_lock). The reverse order
(cpu_hotplug_lock -&gt; slab_mutex) is established by

- cpuhp_setup_state_nocalls(..., slub_cpu_setup, ...)
- kmem_cache_destroy()

The two orders together would form an AB-BA deadlock.

Finally, add lockdep_assert_cpus_held() in flush_rcu_sheaves_on_cache()
to catch the same problem in the future.

Fixes: 0f35040de593 ("mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction")
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Qing Wang &lt;wangqing7171@gmail.com&gt;
Link: https://patch.msgid.link/20260512035035.762317-1-wangqing7171@gmail.com
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm/slab: Add kvfree_atomic() helper</title>
<updated>2026-05-05T08:12:07+00:00</updated>
<author>
<name>Uladzislau Rezki (Sony)</name>
<email>urezki@gmail.com</email>
</author>
<published>2026-04-28T16:14:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dad0d91cc2c3e6b6fb285ccfe7ddf71525797198'/>
<id>urn:sha1:dad0d91cc2c3e6b6fb285ccfe7ddf71525797198</id>
<content type='text'>
kvmalloc() now supports non-sleeping GFP flags, including
the vmalloc fallback path. This means it may return vmalloc
memory even for GFP_ATOMIC and GFP_NOWAIT allocations.

Freeing such memory with kvfree() may then end up calling
vfree(), which is not safe for non-sleeping contexts.

Introduce kvfree_atomic() helper for such cases. It mirrors
kvfree(), but uses vfree_atomic() for vmalloced memory.

Signed-off-by: Uladzislau Rezki (Sony) &lt;urezki@gmail.com&gt;
Acked-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
Acked-by: Harry Yoo (Oracle) &lt;harry@kernel.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</content>
</entry>
<entry>
<title>mm/slab: return NULL early from kmalloc_nolock() in NMI on UP</title>
<updated>2026-04-27T07:14:36+00:00</updated>
<author>
<name>Harry Yoo (Oracle)</name>
<email>harry@kernel.org</email>
</author>
<published>2026-04-27T07:09:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5b31044e649e3e54c2caef135c09b371c2fbcd08'/>
<id>urn:sha1:5b31044e649e3e54c2caef135c09b371c2fbcd08</id>
<content type='text'>
On UP kernels (!CONFIG_SMP), spin_trylock() is a no-op that
unconditionally succeeds even when the lock is already held. As a
result, kmalloc_nolock() called from NMI context can re-enter the slab
allocator and acquire n-&gt;list_lock that the interrupted context is
already holding, corrupting slab state.

With CONFIG_DEBUG_SPINLOCK on UP, the following BUG is triggered with
the slub_kunit test module:

  BUG: spinlock trylock failure on UP on CPU#0, kunit_try_catch/243
  [...]
  Call Trace:
   &lt;NMI&gt;
   dump_stack_lvl+0x3f/0x60
   do_raw_spin_trylock+0x41/0x50
   _raw_spin_trylock+0x24/0x50
   get_from_partial_node+0x120/0x4d0
   ___slab_alloc+0x8a/0x4c0
   kmalloc_nolock_noprof+0x164/0x310
   [...]
   &lt;/NMI&gt;

Fix this by returning NULL early when invoked from NMI on a UP kernel.

Link: https://lore.kernel.org/linux-mm/ad_cqe51pvr1WaDg@hyeyoo
Cc: stable@vger.kernel.org
Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
Signed-off-by: Harry Yoo (Oracle) &lt;harry@kernel.org&gt;
Link: https://patch.msgid.link/20260427-nolock-api-fix-v2-2-a6b83a92d9a4@kernel.org
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
</content>
</entry>
<entry>
<title>slub: fix data loss and overflow in krealloc()</title>
<updated>2026-04-17T09:07:48+00:00</updated>
<author>
<name>Marco Elver</name>
<email>elver@google.com</email>
</author>
<published>2026-04-16T13:25:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=082a6d03a2d685a83a332666b500ad3966349588'/>
<id>urn:sha1:082a6d03a2d685a83a332666b500ad3966349588</id>
<content type='text'>
Commit 2cd8231796b5 ("mm/slub: allow to set node and align in
k[v]realloc") introduced the ability to force a reallocation if the
original object does not satisfy new alignment or NUMA node, even when
the object is being shrunk.

This introduced two bugs in the reallocation fallback path:

1. Data loss during NUMA migration: The jump to 'alloc_new' happens
   before 'ks' and 'orig_size' are initialized. As a result, the
   memcpy() in the 'alloc_new' block would copy 0 bytes into the new
   allocation.

2. Buffer overflow during shrinking: When shrinking an object while
   forcing a new alignment, 'new_size' is smaller than the old size.
   However, the memcpy() used the old size ('orig_size ?: ks'), leading
   to an out-of-bounds write.

The same overflow bug exists in the kvrealloc() fallback path, where the
old bucket size ksize(p) is copied into the new buffer without being
bounded by the new size.

A simple reproducer:

	// e.g. add to lkdtm as KREALLOC_SHRINK_OVERFLOW
	while (1) {
		void *p = kmalloc(128, GFP_KERNEL);
		p = krealloc_node_align(p, 64, 256, GFP_KERNEL, NUMA_NO_NODE);
		kfree(p);
	}

demonstrates the issue:

  ==================================================================
  BUG: KFENCE: out-of-bounds write in memcpy_orig+0x68/0x130

  Out-of-bounds write at 0xffff8883ad757038 (120B right of kfence-#47):
   memcpy_orig+0x68/0x130
   krealloc_node_align_noprof+0x1c8/0x340
   lkdtm_KREALLOC_SHRINK_OVERFLOW+0x8c/0xc0 [lkdtm]
   lkdtm_do_action+0x3a/0x60 [lkdtm]
   ...

  kfence-#47: 0xffff8883ad756fc0-0xffff8883ad756fff, size=64, cache=kmalloc-64

  allocated by task 316 on cpu 7 at 97.680481s (0.021813s ago):
   krealloc_node_align_noprof+0x19c/0x340
   lkdtm_KREALLOC_SHRINK_OVERFLOW+0x8c/0xc0 [lkdtm]
   lkdtm_do_action+0x3a/0x60 [lkdtm]
   ...
  ==================================================================

Fix it by moving the old size calculation to the top of __do_krealloc()
and bounding all copy lengths by the new allocation size.

Fixes: 2cd8231796b5 ("mm/slub: allow to set node and align in k[v]realloc")
Cc: stable@vger.kernel.org
Reported-by: https://sashiko.dev/#/patchset/20260415143735.2974230-1-elver%40google.com
Signed-off-by: Marco Elver &lt;elver@google.com&gt;
Link: https://patch.msgid.link/20260416132837.3787694-1-elver@google.com
Reviewed-by: Harry Yoo (Oracle) &lt;harry@kernel.org&gt;
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'slab/for-7.1/misc' into slab/for-next</title>
<updated>2026-04-13T11:23:36+00:00</updated>
<author>
<name>Vlastimil Babka (SUSE)</name>
<email>vbabka@kernel.org</email>
</author>
<published>2026-04-07T12:39:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=44e0ebe4accd67c67134cf3b805917153041a300'/>
<id>urn:sha1:44e0ebe4accd67c67134cf3b805917153041a300</id>
<content type='text'>
Merge misc slab changes that are not related to sheaves. Various
improvements for sysfs, debugging and testing.
</content>
</entry>
<entry>
<title>slub: clarify kmem_cache_refill_sheaf() comments</title>
<updated>2026-04-07T12:39:11+00:00</updated>
<author>
<name>Hao Li</name>
<email>hao.li@linux.dev</email>
</author>
<published>2026-04-07T11:59:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=51274836193a661a3f39e7f10629d5978a61bbfb'/>
<id>urn:sha1:51274836193a661a3f39e7f10629d5978a61bbfb</id>
<content type='text'>
In the in-place refill case, some objects may already have been added
before the function returns -ENOMEM.
Clarify this behavior and polish the rest of the comment for readability.

Acked-by: Harry Yoo (Oracle) &lt;harry@kernel.org&gt;
Signed-off-by: Hao Li &lt;hao.li@linux.dev&gt;
Link: https://patch.msgid.link/20260407120018.42692-1-hao.li@linux.dev
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
</content>
</entry>
<entry>
<title>slub: use N_NORMAL_MEMORY in can_free_to_pcs to handle remote frees</title>
<updated>2026-04-07T09:10:52+00:00</updated>
<author>
<name>Hao Li</name>
<email>hao.li@linux.dev</email>
</author>
<published>2026-04-03T07:37:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7f9bb84fdb5ee7621fcd6519cd14d3dc9aa75c5c'/>
<id>urn:sha1:7f9bb84fdb5ee7621fcd6519cd14d3dc9aa75c5c</id>
<content type='text'>
Memory hotplug now keeps N_NORMAL_MEMORY up to date correctly, so make
can_free_to_pcs() use it.

As a result, when freeing objects on memoryless nodes, or on nodes that
have memory but only in ZONE_MOVABLE, the objects can be freed to the
sheaf instead of going through the slow path.

Signed-off-by: Hao Li &lt;hao.li@linux.dev&gt;
Acked-by: Harry Yoo (Oracle) &lt;harry@kernel.org&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Link: https://patch.msgid.link/20260403073958.8722-1-hao.li@linux.dev
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
</content>
</entry>
<entry>
<title>slab: free remote objects to sheaves on memoryless nodes</title>
<updated>2026-03-19T12:22:49+00:00</updated>
<author>
<name>Vlastimil Babka (SUSE)</name>
<email>vbabka@kernel.org</email>
</author>
<published>2026-03-11T08:25:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e65d430111a5ba83598b03a4aca4799eb295eef1'/>
<id>urn:sha1:e65d430111a5ba83598b03a4aca4799eb295eef1</id>
<content type='text'>
On memoryless nodes we can now allocate from cpu sheaves and refill them
normally. But when a node is memoryless on a system without actual
CONFIG_HAVE_MEMORYLESS_NODES support, freeing always uses the slowpath
because all objects appear as remote. We could instead benefit from the
freeing fastpath, because the allocations can't obtain local objects
anyway if the node is memoryless.

Thus adapt the locality check when freeing, and move them to an inline
function can_free_to_pcs() for a single shared implementation.

On configurations with CONFIG_HAVE_MEMORYLESS_NODES=y continue using
numa_mem_id() so the percpu sheaves and barn on a memoryless node will
contain mostly objects from the closest memory node (returned by
numa_mem_id()). No change is thus intended for such configuration.

On systems with CONFIG_HAVE_MEMORYLESS_NODES=n use numa_node_id() (the
cpu's node) since numa_mem_id() just aliases it anyway. But if we are
freeing on a memoryless node, allow the freeing to use percpu sheaves
for objects from any node, since they are all remote anyway.

This way we avoid the slowpath and get more performant freeing. The
potential downside is that allocations will obtain objects with a larger
average distance. If we kept bypassing the sheaves on freeing, a refill
of sheaves from slabs would tend to get closer objects thanks to the
ordering of the zonelist. Architectures that allow de-facto memoryless
nodes without proper CONFIG_HAVE_MEMORYLESS_NODES support should perhaps
consider adding such support.

Link: https://patch.msgid.link/20260311-b4-slab-memoryless-barns-v1-3-70ab850be4ce@kernel.org
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
Reviewed-by: Harry Yoo &lt;harry.yoo@oracle.com&gt;
Reviewed-by: Hao Li &lt;hao.li@linux.dev&gt;
</content>
</entry>
<entry>
<title>slab: create barns for online memoryless nodes</title>
<updated>2026-03-19T12:22:44+00:00</updated>
<author>
<name>Vlastimil Babka (SUSE)</name>
<email>vbabka@kernel.org</email>
</author>
<published>2026-03-11T08:25:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7f693882f00963fd7808e333b86a87e0f9b9873b'/>
<id>urn:sha1:7f693882f00963fd7808e333b86a87e0f9b9873b</id>
<content type='text'>
Ming Lei has reported [1] a performance regression due to replacing cpu
(partial) slabs with sheaves. With slub stats enabled, a large amount of
slowpath allocations were observed. The affected system has 8 online
NUMA nodes but only 2 have memory.

For sheaves to work effectively on given cpu, its NUMA node has to have
struct node_barn allocated. Those are currently only allocated on nodes
with memory (N_MEMORY) where kmem_cache_node also exist as the goal is
to cache only node-local objects. But in order to have good performance
on a memoryless node, we need its barn to exist and use sheaves to cache
non-local objects (as no local objects can exist anyway).

Therefore change the implementation to allocate barns on all online
nodes, tracked in a new nodemask slab_barn_nodes. Also add a cpu hotplug
callback as that's when a memoryless node can become online.

Change both get_barn() and rcu_sheaf-&gt;node assignment to numa_node_id()
so it's returned to the barn of the local cpu's (potentially memoryless)
node, and not to the nearest node with memory anymore.

On systems with CONFIG_HAVE_MEMORYLESS_NODES=y (which are not the main
target of this change) barns did not exist on memoryless nodes, but
get_barn() using numa_mem_id() meant a barn was returned from the
nearest node with memory. This works, but the barn lock contention
increases with every such memoryless node. With this change, barn will
be allocated also on the memoryless node, reducing this contention in
exchange for increased memory consumption.

Reported-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Link: https://lore.kernel.org/all/aZ0SbIqaIkwoW2mB@fedora/ [1]
Link: https://patch.msgid.link/20260311-b4-slab-memoryless-barns-v1-2-70ab850be4ce@kernel.org
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
Reviewed-by: Harry Yoo &lt;harry.yoo@oracle.com&gt;
Reviewed-by: Hao Li &lt;hao.li@linux.dev&gt;
</content>
</entry>
</feed>
