<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/lib/debugobjects.c, branch v6.18.21</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-12T11:09:08+00:00</updated>
<entry>
<title>debugobject: Make it work with deferred page initialization - again</title>
<updated>2026-03-12T11:09:08+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@kernel.org</email>
</author>
<published>2026-02-07T13:27:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a473c09666f0c03de02158876fa6241fa866c9dd'/>
<id>urn:sha1:a473c09666f0c03de02158876fa6241fa866c9dd</id>
<content type='text'>
[ Upstream commit fd3634312a04f336dcbfb481060219f0cd320738 ]

debugobjects uses __GFP_HIGH for allocations as it might be invoked
within locked regions. That worked perfectly fine until v6.18. It still
works correctly when deferred page initialization is disabled and works
by chance when no page allocation is required before deferred page
initialization has completed.

Since v6.18, allocations without a reclaim flag cause new_slab() to end
up in alloc_frozen_pages_nolock_noprof(), which returns early when
deferred page initialization has not yet completed. As deferred page
initialization takes quite a while, the debugobject pool is depleted and
debugobjects are disabled.

This can be worked around when PREEMPT_COUNT is enabled as that allows
debugobjects to add __GFP_KSWAPD_RECLAIM to the GFP flags when the context
is preemptible. When PREEMPT_COUNT is disabled the context is unknown and
the reclaim bit can't be set because the caller might hold locks which
might deadlock in the allocator.

In preemptible context the reclaim bit is harmless and not a performance
issue as that's usually invoked from slow path initialization context.

That makes debugobjects depend on PREEMPT_COUNT || !DEFERRED_STRUCT_PAGE_INIT.
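
The flag selection and the resulting config dependency described above can
be sketched roughly as follows. This is an illustrative Python model, not
the kernel code: the bit values and the function names alloc_flags() and
debugobjects_allowed() are hypothetical stand-ins chosen for this sketch.

```python
# Placeholder bit values for illustration only; the real kernel values differ.
GFP_HIGH_BIT = 0x1            # stands in for __GFP_HIGH
KSWAPD_RECLAIM_BIT = 0x2      # stands in for __GFP_KSWAPD_RECLAIM


def alloc_flags(preempt_count_enabled, preemptible):
    """Pick allocation flags for a debugobjects pool refill (hypothetical helper)."""
    flags = GFP_HIGH_BIT
    # The reclaim bit is only safe when the context is known to be preemptible,
    # which requires PREEMPT_COUNT to be enabled.
    if preempt_count_enabled and preemptible:
        flags |= KSWAPD_RECLAIM_BIT
    return flags


def debugobjects_allowed(preempt_count, deferred_struct_page_init):
    """Model of the dependency PREEMPT_COUNT || !DEFERRED_STRUCT_PAGE_INIT."""
    return preempt_count or not deferred_struct_page_init
```

With PREEMPT_COUNT disabled the sketch never sets the reclaim bit, which is
why that configuration only remains usable without deferred page
initialization.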

Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
Signed-off-by: Thomas Gleixner &lt;tglx@kernel.org&gt;
Tested-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Link: https://patch.msgid.link/87pl6gznti.ffs@tglx
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>debugobjects: Track object usage to avoid premature freeing of objects</title>
<updated>2024-10-15T15:30:33+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2024-10-13T18:45:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ff8d523cc4520a5ce86cde0fd57c304e2b4f61b3'/>
<id>urn:sha1:ff8d523cc4520a5ce86cde0fd57c304e2b4f61b3</id>
<content type='text'>
The freelist is freed at a constant rate independent of the actual usage
requirements. That's bad in scenarios where usage comes in bursts. The end
of a burst puts the objects on the free list and freeing proceeds even when
the next burst, which requires objects again, has already started.

Keep track of the usage with an exponentially weighted moving average and
take that into account in the worker function which frees objects from the
free list.

This further reduces the kmem_cache allocation/free rate for a full kernel
compile:

            kmem_cache_alloc()  kmem_cache_free()
Baseline:   225k                173k
Usage:      170k                117k
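
As a rough illustration of the idea, an exponentially weighted moving
average of pool usage could gate how many objects the worker frees. The
sketch below is a simplified Python model; the names ewma_update() and
objects_to_free() and the weight of 0.125 are assumptions made for this
example and are not taken from the kernel implementation.

```python
def ewma_update(avg, sample, weight=0.125):
    """Blend a new usage sample into the running average."""
    return avg + weight * (sample - avg)


def objects_to_free(free_count, avg_usage):
    """Free only the objects that exceed the recent average demand."""
    return max(0, free_count - int(avg_usage))


avg = 0.0
for used in [100, 100, 0, 0, 100]:  # bursty usage samples
    avg = ewma_update(avg, used)
```

Because the average decays gradually, a short idle gap between two bursts
does not immediately release the objects that the next burst needs.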

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Zhen Lei &lt;thunder.leizhen@huawei.com&gt;
Link: https://lore.kernel.org/all/87bjznhme2.ffs@tglx

</content>
</entry>
<entry>
<title>debugobjects: Refill per CPU pool more aggressively</title>
<updated>2024-10-15T15:30:33+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2024-10-07T16:50:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=13f9ca723900ae3ae8e0a1e76ba86e7786e60645'/>
<id>urn:sha1:13f9ca723900ae3ae8e0a1e76ba86e7786e60645</id>
<content type='text'>
Right now the per CPU pools are only refilled when they become
empty. That's suboptimal, especially when there are still objects which
have not been freed yet in the to-free list.

Check whether an allocation from the per CPU pool emptied a batch and try
to allocate from the free pool if that still has objects available.

            kmem_cache_alloc()  kmem_cache_free()
Baseline:   295k                245k
Refill:     225k                173k
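
The refill check can be pictured with the following Python sketch. It
models the pools as plain lists; the batch size of 4 and the name
alloc_from_pcpu() are hypothetical, and the sketch assumes the per CPU
pool is not empty when it is called.

```python
BATCH = 4  # hypothetical batch size for this sketch


def alloc_from_pcpu(pcpu, to_free):
    """Take one object and top up the per CPU pool when a batch ran out."""
    obj = pcpu.pop()
    # The allocation just emptied a batch: pull a full batch from the
    # to-free pool if it still holds one, instead of waiting until the
    # per CPU pool is completely empty.
    if len(pcpu) % BATCH == 0 and len(to_free) >= BATCH:
        pcpu.extend(to_free[-BATCH:])
        del to_free[-BATCH:]
    return obj
```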

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Zhen Lei &lt;thunder.leizhen@huawei.com&gt;
Link: https://lore.kernel.org/all/20241007164914.439053085@linutronix.de

</content>
</entry>
<entry>
<title>debugobjects: Double the per CPU slots</title>
<updated>2024-10-15T15:30:33+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2024-10-07T16:50:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a201a96b9682e5b42ed93108c4aeb6135c909661'/>
<id>urn:sha1:a201a96b9682e5b42ed93108c4aeb6135c909661</id>
<content type='text'>
In situations where objects are rapidly allocated from the pool and handed
back, the size of the per CPU pool turns out to be too small.

Double the size of the per CPU pool.

This reduces the kmem cache allocation and free operations during a kernel compile:

             alloc          free
Baseline:    380k           330k
Double size: 295k           245k

Especially the reduction of allocations is important because that happens
in the hot path when objects are initialized.

The maximum increase in per CPU pool memory consumption is about 2.5K per
online CPU, which is acceptable.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Zhen Lei &lt;thunder.leizhen@huawei.com&gt;
Link: https://lore.kernel.org/all/20241007164914.378676302@linutronix.de

</content>
</entry>
<entry>
<title>debugobjects: Move pool statistics into global_pool struct</title>
<updated>2024-10-15T15:30:33+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2024-10-07T16:50:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2638345d22529cdc964d20a617bfd32d87f27e0f'/>
<id>urn:sha1:2638345d22529cdc964d20a617bfd32d87f27e0f</id>
<content type='text'>
Keep it along with the pool as that's a hot cache line anyway and it makes
the code more comprehensible.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Zhen Lei &lt;thunder.leizhen@huawei.com&gt;
Link: https://lore.kernel.org/all/20241007164914.318776207@linutronix.de

</content>
</entry>
<entry>
<title>debugobjects: Implement batch processing</title>
<updated>2024-10-15T15:30:33+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2024-10-07T16:50:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f57ebb92ba3e09a7e1082f147d6e1456d702d4b2'/>
<id>urn:sha1:f57ebb92ba3e09a7e1082f147d6e1456d702d4b2</id>
<content type='text'>
Adding and removing single objects in a loop is bad in terms of lock
contention and cache line accesses.

To implement batching, record the last object in a batch in the object
itself. This is trivially possible as hlists are strictly stacks. At a batch
boundary, when the first object is added to the list, the object stores a
pointer to itself in debug_obj::batch_last. When the next object is added
to the list, the batch_last pointer is retrieved from the first object
in the list and stored in the object which is about to be added.

That means for batch processing the first object always has a pointer to
the last object in a batch, which allows moving batches in a cache line
efficient way and reduces the lock hold time.
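
The pointer handling described above can be sketched in Python as follows.
This is a simplified model and not the kernel code: the classes Obj and
Pool, the method names and the batch size of 4 are assumptions made for
this example, and both pools are assumed to hold only whole batches when a
batch is moved.

```python
BATCH_SIZE = 4  # hypothetical batch size for this sketch


class Obj:
    def __init__(self, name):
        self.name = name
        self.next = None        # next node in the stack (older entry)
        self.batch_last = None  # last (oldest) object of this object's batch


class Pool:
    def __init__(self):
        self.head = None
        self.count = 0

    def push(self, obj):
        if self.count % BATCH_SIZE == 0:
            obj.batch_last = obj                   # batch boundary
        else:
            obj.batch_last = self.head.batch_last  # inherit from first object
        obj.next = self.head
        self.head = obj
        self.count += 1

    def move_batch_to(self, other):
        """Splice the top batch onto another pool without walking the list."""
        if self.count == 0 or self.count % BATCH_SIZE != 0:
            raise ValueError("source pool must hold whole batches")
        if other.count % BATCH_SIZE != 0:
            raise ValueError("destination pool must hold whole batches")
        first = self.head
        last = first.batch_last
        self.head = last.next
        last.next = other.head
        other.head = first
        self.count -= BATCH_SIZE
        other.count += BATCH_SIZE
```

Moving a batch touches only the first and the last object of the batch,
regardless of the batch size.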

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Zhen Lei &lt;thunder.leizhen@huawei.com&gt;
Link: https://lore.kernel.org/all/20241007164914.258995000@linutronix.de

</content>
</entry>
<entry>
<title>debugobjects: Prepare kmem_cache allocations for batching</title>
<updated>2024-10-15T15:30:32+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2024-10-07T16:50:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=aebbfe0779b271c099cc80c5e2995c2087b28dcf'/>
<id>urn:sha1:aebbfe0779b271c099cc80c5e2995c2087b28dcf</id>
<content type='text'>
Allocate a batch and then push it into the pool. Utilize the
debug_obj::last_node pointer for keeping track of the batch boundary.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/all/20241007164914.198647184@linutronix.de

</content>
</entry>
<entry>
<title>debugobjects: Prepare for batching</title>
<updated>2024-10-15T15:30:32+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2024-10-07T16:50:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=74fe1ad4132234f04fcc75e16600449496a67b5b'/>
<id>urn:sha1:74fe1ad4132234f04fcc75e16600449496a67b5b</id>
<content type='text'>
Move the debug_obj::object pointer into a union and add a pointer to the
last node in a batch. That allows batch processing to be implemented
efficiently by utilizing the stack property of hlist:

When the first object of a batch is added to the list, then the batch
pointer is set to the hlist node of the object itself. Any subsequent add
retrieves the pointer to the last node from the first object in the list
and uses that for storing the last node pointer in the newly added object.

Add the pointer to the data structure and ensure that all relevant pool
sizes are strictly batch sized. The actual batching implementation follows
in subsequent changes.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Zhen Lei &lt;thunder.leizhen@huawei.com&gt;
Link: https://lore.kernel.org/all/20241007164914.139204961@linutronix.de

</content>
</entry>
<entry>
<title>debugobjects: Use static key for boot pool selection</title>
<updated>2024-10-15T15:30:32+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2024-10-07T16:50:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=14077b9e583bbafc9a02734beab99c37bff68644'/>
<id>urn:sha1:14077b9e583bbafc9a02734beab99c37bff68644</id>
<content type='text'>
Get rid of the conditional in the hot path.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Zhen Lei &lt;thunder.leizhen@huawei.com&gt;
Link: https://lore.kernel.org/all/20241007164914.077247071@linutronix.de

</content>
</entry>
<entry>
<title>debugobjects: Rework free_object_work()</title>
<updated>2024-10-15T15:30:32+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2024-10-07T16:50:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9ce99c6d7bfbca71f1e5fa34045ea48cb768f54a'/>
<id>urn:sha1:9ce99c6d7bfbca71f1e5fa34045ea48cb768f54a</id>
<content type='text'>
Convert it to batch processing with intermediate helper functions. This
reduces the final changes for batch processing.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Zhen Lei &lt;thunder.leizhen@huawei.com&gt;
Link: https://lore.kernel.org/all/20241007164914.015906394@linutronix.de

</content>
</entry>
</feed>
