<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/mm/memcontrol.c, branch v7.2-rc1</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.2-rc1</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.2-rc1'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-06-22T15:28:48+00:00</updated>
<entry>
<title>Merge tag 'slab-for-7.2-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab</title>
<updated>2026-06-22T15:28:48+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-06-22T15:28:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=335c347686e76df9d2c7d7f61b5ea627a4c5cb4c'/>
<id>urn:sha1:335c347686e76df9d2c7d7f61b5ea627a4c5cb4c</id>
<content type='text'>
Pull more slab updates from Vlastimil Babka:

 - Introduce and wire up a new alloc_flags parameter for modifying
   slab-specific behavior without adding or reusing gfp flags. Also
   introduce slab_alloc_context to keep function parameter bloat in
   check. Both are similar to what the page allocator does.
   kmalloc_flags() exposes alloc_flags for mm-internal users.

     - SLAB_ALLOC_NOLOCK flag is used to implement kmalloc_nolock()
       behavior without relying on lack of __GFP_RECLAIM, which caused
       false positives with workarounds like fd3634312a04 ("debugobject:
       Make it work with deferred page initialization - again").

     - SLAB_ALLOC_NO_RECURSE replaces __GFP_NO_OBJ_EXT, which could have
       been removed, but pending memory allocation profiling changes in
       mm tree have grown a new user - there is however a work ongoing
       to replace that too, so __GFP_NO_OBJ_EXT should eventually be
       removed. (Vlastimil Babka)

 - Add kmem_buckets_alloc_track_caller() with a user to be added in the
   net tree (Pedro Falcato)

 - Fixes for kernel-doc and slabinfo (Randy Dunlap, Yichong Chen)

* tag 'slab-for-7.2-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  tools/mm/slabinfo: fix total_objects attribute name
  slab: recognize @GFP parameter as optional in kernel-doc
  mm/slab: add a node-track-caller variant for kmem buckets allocation
  mm/slab: replace __GFP_NO_OBJ_EXT with SLAB_ALLOC_NO_RECURSE for sheaves
  mm/slab: remove __GFP_NO_OBJ_EXT usage from alloc_slab_obj_exts()
  mm/slab: introduce kmalloc_flags()
  mm/slab: allow __GFP_NOMEMALLOC and __GFP_NOWARN for kmalloc_nolock()
  mm/slab: pass slab_alloc_context to __do_kmalloc_node()
  mm/slab: allow kmem_cache_alloc_bulk() with any gfp flags
  mm/slab: replace slab_alloc_node() parameters with slab_alloc_context
  mm/slab: pass alloc_flags through slab_post_alloc_hook() chain
  mm/slab: pass alloc_flags to new slab allocation
  mm/slab: add alloc_flags to slab_alloc_context
  mm/slab: replace struct partial_context with slab_alloc_context
  mm/slab: introduce alloc_flags and SLAB_ALLOC_NOLOCK
  mm/slab: introduce slab_alloc_context
  mm/slab: stop inlining __slab_alloc_node()
  mm/slab: do not init any kfence objects on allocation
</content>
</entry>
<entry>
<title>mm/slab: pass alloc_flags through slab_post_alloc_hook() chain</title>
<updated>2026-06-15T11:23:19+00:00</updated>
<author>
<name>Vlastimil Babka (SUSE)</name>
<email>vbabka@kernel.org</email>
</author>
<published>2026-06-10T15:40:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ef7f97aa2ce8d59eda0faa9bb40f6263f0d80d5f'/>
<id>urn:sha1:ef7f97aa2ce8d59eda0faa9bb40f6263f0d80d5f</id>
<content type='text'>
Convert the whole following call stack to pass either slab_alloc_context
(thus including alloc_flags) or just alloc_flags as necessary:

slab_post_alloc_hook()
  alloc_tagging_slab_alloc_hook()
    __alloc_tagging_slab_alloc_hook()
      prepare_slab_obj_exts_hook()
        alloc_slab_obj_exts()
  memcg_slab_post_alloc_hook()
    __memcg_slab_post_alloc_hook()
      alloc_slab_obj_exts()

Converting all these at once avoids unnecessary churn and is mostly
mechanical.

This ultimately allows to decide if spinning is allowed using
alloc_flags in alloc_slab_obj_exts(), as well as slab_post_alloc_hook().
Aside from alloc_from_pcs_bulk() (to be handled next) there is nothing
else in slab itself relying on gfpflags_allow_spinning() which can
be false even if not called from kmalloc_nolock().

A followup change will also use the alloc_flags availability in the call
stack above to remove the __GFP_NO_OBJ_EXT flag.

For alloc_slab_obj_exts(), also replace the suboptimal "bool new_slab"
parameter with a SLAB_ALLOC_NEW_SLAB flag with identical functionality.

To further reduce the number of parameters of slab_post_alloc_hook(),
also make 'struct list_lru *lru' (which is NULL for most callers) a new
field of slab_alloc_context.

Link: https://patch.msgid.link/20260610-slab_alloc_flags-v2-9-7190909db118@kernel.org
Reviewed-by: Harry Yoo (Oracle) &lt;harry@kernel.org&gt;
Reviewed-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Reviewed-by: Hao Li &lt;hao.li@linux.dev&gt;
Signed-off-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm: switch deferred split shrinker to list_lru</title>
<updated>2026-06-09T01:21:25+00:00</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2026-05-27T20:45:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fafaeceb89a5e2e856ff04c2cacb6cae4a2ecb67'/>
<id>urn:sha1:fafaeceb89a5e2e856ff04c2cacb6cae4a2ecb67</id>
<content type='text'>
The deferred split queue handles cgroups in a suboptimal fashion.  The
queue is per-NUMA node or per-cgroup, not the intersection.  That means on
a cgrouped system, a node-restricted allocation entering reclaim can end
up splitting large pages on other nodes:

        alloc/unmap
          deferred_split_folio()
            list_add_tail(memcg-&gt;split_queue)
            set_shrinker_bit(memcg, node, deferred_shrinker_id)

        for_each_zone_zonelist_nodemask(restricted_nodes)
          mem_cgroup_iter()
            shrink_slab(node, memcg)
              shrink_slab_memcg(node, memcg)
                if test_shrinker_bit(memcg, node, deferred_shrinker_id)
                  deferred_split_scan()
                    walks memcg-&gt;split_queue

The shrinker bit adds an imperfect guard rail.  As soon as the cgroup has
a single large page on the node of interest, all large pages owned by that
memcg, including those on other nodes, will be split.

list_lru properly sets up per-node, per-cgroup lists.  As a bonus, it
streamlines a lot of the list operations and reclaim walks.  It's used
widely by other major shrinkers already.  Convert the deferred split queue
as well.

The list_lru per-memcg heads are instantiated on demand when the first
object of interest is allocated for a cgroup, by calling
folio_memcg_alloc_deferred().  Add calls to where splittable pages are
created: anon faults, swapin faults, khugepaged collapse.

These calls create all possible node heads for the cgroup at once, so the
migration code (between nodes) doesn't need any special care.

[akpm@linux-foundation.org: fix build with CONFIG_TRANSPARENT_HUGEPAGE=n]
  Link: https://lore.kernel.org/202605281620.lc3rtkBm-lkp@intel.com
[hannes@cmpxchg.org: fix cgroup.memory=nokmem handling]
  Link: https://lore.kernel.org/ah9PGv12mqai84ES@cmpxchg.org
Link: https://lore.kernel.org/20260527204757.2544958-10-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reported-by: Mikhail Zaslonko &lt;zaslonko@linux.ibm.com&gt;
Tested-by: Mikhail Zaslonko &lt;zaslonko@linux.ibm.com&gt;
Acked-by: Shakeel Butt &lt;shakeel.butt@linux.dev&gt;
Reviewed-by: Lorenzo Stoakes (Oracle) &lt;ljs@kernel.org&gt;
Acked-by: Usama Arif &lt;usama.arif@linux.dev&gt;
Reviewed-by: Kairui Song &lt;kasong@tencent.com&gt;
Cc: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Cc: Barry Song &lt;baohua@kernel.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: David Hildenbrand (Arm) &lt;david@kernel.org&gt;
Cc: Dev Jain &lt;dev.jain@arm.com&gt;
Cc: Lance Yang &lt;lance.yang@linux.dev&gt;
Cc: Liam R. Howlett &lt;liam@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Nico Pache &lt;npache@redhat.com&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Vlastimil Babka &lt;vbabka@kernel.org&gt;
Cc: Zi Yan &lt;ziy@nvidia.com&gt;
Cc: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memcg: multi objcg charge support</title>
<updated>2026-06-04T21:45:07+00:00</updated>
<author>
<name>Shakeel Butt</name>
<email>shakeel.butt@linux.dev</email>
</author>
<published>2026-05-26T03:39:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=29a1ea41456b79d657e5f5deced1239477d03af1'/>
<id>urn:sha1:29a1ea41456b79d657e5f5deced1239477d03af1</id>
<content type='text'>
Commit 01b9da291c49 ("mm: memcontrol: convert objcg to be per-memcg
per-node type") split a memcg's single obj_cgroup into one per NUMA node
so that reparenting LRU folios can take per-node lru locks.  As a side
effect, the per-CPU obj_stock_pcp -- which caches exactly one cached_objcg
-- thrashes on workloads where threads of the same memcg run on different
NUMA nodes.  The kernel test robot reported a 67.7% regression on
stress-ng.switch.ops_per_sec from this pattern.

Mirror the multi-slot pattern already used by memcg_stock_pcp: turn
nr_bytes and cached_objcg into NR_OBJ_STOCK-element arrays, scan all slots
on consume/refill/account, prefer empty slots when inserting, and evict a
slot round-robin only when full.  With multiple slots a CPU can hold the
per-node objcg variants of one memcg plus a few siblings without ever
forcing a drain.

A single int8_t index records which slot the cached slab stats belong to;
the stats are flushed on slot or pgdat change.  With NR_OBJ_STOCK = 5 the
layout (verified with pahole) is:

  offset 0  : lock(1) + index(1) + node_id(2) + slab stats(4) = 8B
  offset 8  : nr_bytes[5]                                     = 10B
  offset 18 : padding                                         = 6B
  offset 24 : cached[5]                                       = 40B
  offset 64 : (line 2) work_struct + flags (cold)

so consume_obj_stock, refill_obj_stock and the slab account path each
touch exactly one 64-byte cache line on non-debug 64-bit builds.

Link: https://lore.kernel.org/20260526033931.1760588-5-shakeel.butt@linux.dev
Signed-off-by: Shakeel Butt &lt;shakeel.butt@linux.dev&gt;
Reported-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Closes: https://lore.kernel.org/oe-lkp/202605121641.b6a60cb0-lkp@intel.com
Fixes: 01b9da291c49 ("mm: memcontrol: convert objcg to be per-memcg per-node type")
Tested-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Reviewed-by: Harry Yoo (Oracle) &lt;harry@kernel.org&gt;
Cc: Alexandre Ghiti &lt;alex@ghiti.fr&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joshua Hahn &lt;joshua.hahnjy@gmail.com&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Qi Zheng &lt;qi.zheng@linux.dev&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memcg: int16_t for cached slab stats</title>
<updated>2026-06-04T21:45:07+00:00</updated>
<author>
<name>Shakeel Butt</name>
<email>shakeel.butt@linux.dev</email>
</author>
<published>2026-05-26T03:39:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7a09fb91c285ba1b253f7c72f86cf37b373afb10'/>
<id>urn:sha1:7a09fb91c285ba1b253f7c72f86cf37b373afb10</id>
<content type='text'>
Currently struct obj_stock_pcp stores cached slab stats in 'int' which is
4 bytes per counter on 64-bit machines.  Switch them to int16_t to shrink
the cached metadata.

The existing PAGE_SIZE flush in __account_obj_stock() bounds *bytes at
PAGE_SIZE on 4KiB and 16KiB page archs, well within int16_t.  On 64KiB
pages PAGE_SIZE is well above S16_MAX so that flush never fires, and a
sufficiently long run of accumulations would overflow the cache.  Add an
explicit S16_MAX guard before each add: when the next add would push
abs(*bytes) past S16_MAX, fold the cached value into @nr and flush
directly via mod_objcg_mlstate() before the accumulation.

Link: https://lore.kernel.org/20260526033931.1760588-4-shakeel.butt@linux.dev
Fixes: 01b9da291c49 ("mm: memcontrol: convert objcg to be per-memcg per-node type")
Signed-off-by: Shakeel Butt &lt;shakeel.butt@linux.dev&gt;
Tested-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Reviewed-by: Harry Yoo (Oracle) &lt;harry@kernel.org&gt;
Acked-by: Qi Zheng &lt;qi.zheng@linux.dev&gt;
Acked-by: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Alexandre Ghiti &lt;alex@ghiti.fr&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joshua Hahn &lt;joshua.hahnjy@gmail.com&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memcg: uint16_t for nr_bytes in obj_stock_pcp</title>
<updated>2026-06-04T21:45:07+00:00</updated>
<author>
<name>Shakeel Butt</name>
<email>shakeel.butt@linux.dev</email>
</author>
<published>2026-05-26T03:39:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=37a7f91e44f41f1b4cd60d1f89a4de7cf871d158'/>
<id>urn:sha1:37a7f91e44f41f1b4cd60d1f89a4de7cf871d158</id>
<content type='text'>
Currently struct obj_stock_pcp stores nr_bytes in an 'unsigned int' which
is 4 bytes on 64-bit machines.  Switch the field to uint16_t to shrink the
per-CPU cache.

The kernel supports PAGE_SIZE_4KB, _8KB, _16KB, _32KB, _64KB and _256KB
(see HAVE_PAGE_SIZE_* in arch/Kconfig).  After the PAGE_SIZE-aligned flush
in __refill_obj_stock(), the sub-page remainder fits in uint16_t up
through 64KiB pages where PAGE_SIZE - 1 == U16_MAX, but on 256KiB pages
PAGE_SIZE - 1 == 0x3FFFF exceeds U16_MAX.  The accumulator also needs to
stay within uint16_t between page-aligned flushes on 64KiB pages where
PAGE_SIZE itself is U16_MAX + 1.

Accumulate the new total in an 'unsigned int' local, then on PAGE_SHIFT &lt;=
16 flush whenever the accumulator would hit U16_MAX; together with the
existing allow_uncharge flush at PAGE_SIZE this keeps the uint16_t safe.

On configs with PAGE_SHIFT &gt; 16 (PAGE_SIZE_256KB on hexagon and powerpc
44x, both 32-bit), uint16_t cannot represent the sub-page remainder. 
Define obj_stock_bytes_t as 'unsigned int' on those archs so nr_bytes can
hold the full remainder and the normal page-boundary flush in
__refill_obj_stock() and the page extraction in drain_obj_stock() both
work correctly.

The single-cache-line layout target only applies to PAGE_SHIFT &lt;= 16;
those archs are 32-bit embedded and not the optimization target.

Link: https://lore.kernel.org/20260526033931.1760588-3-shakeel.butt@linux.dev
Fixes: 01b9da291c49 ("mm: memcontrol: convert objcg to be per-memcg per-node type")
Signed-off-by: Shakeel Butt &lt;shakeel.butt@linux.dev&gt;
Tested-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Reviewed-by: Harry Yoo (Oracle) &lt;harry@kernel.org&gt;
Acked-by: Qi Zheng &lt;qi.zheng@linux.dev&gt;
Acked-by: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Alexandre Ghiti &lt;alex@ghiti.fr&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joshua Hahn &lt;joshua.hahnjy@gmail.com&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memcg: store node_id instead of pglist_data pointer</title>
<updated>2026-06-04T21:45:06+00:00</updated>
<author>
<name>Shakeel Butt</name>
<email>shakeel.butt@linux.dev</email>
</author>
<published>2026-05-26T03:39:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e0d4e7405f267ae31ffafd5673ce14d0d9e4cbe0'/>
<id>urn:sha1:e0d4e7405f267ae31ffafd5673ce14d0d9e4cbe0</id>
<content type='text'>
Patch series "memcg: shrink obj_stock_pcp and cache multiple objcgs", v3.

Commit 01b9da291c49 ("mm: memcontrol: convert objcg to be per-memcg
per-node type") split a memcg's single obj_cgroup into one per NUMA node
so that reparenting LRU folios can take per-node lru locks.  As a side
effect, the per-CPU obj_stock_pcp -- which caches a single cached_objcg
pointer -- thrashes on workloads where threads of the same memcg run on
different NUMA nodes.  The kernel test robot reported a 67.7% regression
on stress-ng.switch.ops_per_sec from this pattern.

Commit d0211878ce06 ("memcg: cache obj_stock by memcg, not by objcg
pointer") landed as a temporary fix by treating sibling per-node objcgs as
equivalent for the cache lookup, intended to be reverted once per-node
kmem accounting is introduced.  This series takes a more general approach:
cache multiple objcgs per CPU using the multi-slot pattern memcg_stock_pcp
already uses, so the per-node objcg variants of one memcg can all coexist
in the stock without ever forcing a drain.  The temporary fix can then be
reverted.

To avoid increasing the per-CPU cache footprint, the first three patches
shrink the existing single-slot obj_stock_pcp fields.  The final patch
converts cached_objcg and nr_bytes into NR_OBJ_STOCK=5 slot arrays and
reorders the struct so the entire consume/refill/account hot path fits
within a single 64-byte cache line on non-debug 64-bit builds (verified
with pahole).


This patch (of 4):

The struct obj_stock_pcp stores a pointer to pglist_data for the slab
stats cached on the cpu.  On 64-bit machines, this costs 8 bytes.  The
pointer is not strictly required: NODE_DATA() can recover it from the node
id.  Replace cached_pgdat with int16_t node_id and use NUMA_NO_NODE as the
"no stats cached" sentinel.

At the moment all the archs limit MAX_NUMNODES to 1024 so int16_t is
plenty; a BUILD_BUG_ON() makes sure we notice if that ever changes.

Link: https://lore.kernel.org/20260526033931.1760588-1-shakeel.butt@linux.dev
Link: https://lore.kernel.org/20260526033931.1760588-2-shakeel.butt@linux.dev
Fixes: 01b9da291c49 ("mm: memcontrol: convert objcg to be per-memcg per-node type")
Signed-off-by: Shakeel Butt &lt;shakeel.butt@linux.dev&gt;
Tested-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Acked-by: Muchun Song &lt;muchun.song@linux.dev&gt;
Reviewed-by: Harry Yoo (Oracle) &lt;harry@kernel.org&gt;
Acked-by: Qi Zheng &lt;qi.zheng@linux.dev&gt;
Cc: Alexandre Ghiti &lt;alex@ghiti.fr&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joshua Hahn &lt;joshua.hahnjy@gmail.com&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memcg: remove no longer used swap cgroup array</title>
<updated>2026-06-02T22:22:23+00:00</updated>
<author>
<name>Kairui Song</name>
<email>kasong@tencent.com</email>
</author>
<published>2026-05-17T15:39:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4e8e1c498de9f97628207d1ef84506058b06bb51'/>
<id>urn:sha1:4e8e1c498de9f97628207d1ef84506058b06bb51</id>
<content type='text'>
Now all swap cgroup records are stored in the swap cluster directly, the
static array is no longer needed.

Link: https://lore.kernel.org/20260517-swap-table-p4-v5-11-88ae43e064c7@tencent.com
Signed-off-by: Kairui Song &lt;kasong@tencent.com&gt;
Acked-by: Chris Li &lt;chrisl@kernel.org&gt;
Cc: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Barry Song &lt;baohua@kernel.org&gt;
Cc: Chengming Zhou &lt;chengming.zhou@linux.dev&gt;
Cc: David Hildenbrand &lt;david@kernel.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Kemeng Shi &lt;shikemeng@huaweicloud.com&gt;
Cc: Lorenzo Stoakes &lt;ljs@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Nhat Pham &lt;nphamcs@gmail.com&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Shakeel Butt &lt;shakeel.butt@linux.dev&gt;
Cc: Youngjun Park &lt;youngjun.park@lge.com&gt;
Cc: Zi Yan &lt;ziy@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memcg, swap: store cgroup id in cluster table directly</title>
<updated>2026-06-02T22:22:23+00:00</updated>
<author>
<name>Kairui Song</name>
<email>kasong@tencent.com</email>
</author>
<published>2026-05-17T15:39:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b197d41462c2076bc88c79fead7f400e48881c19'/>
<id>urn:sha1:b197d41462c2076bc88c79fead7f400e48881c19</id>
<content type='text'>
Drop the usage of the swap_cgroup_ctrl, and use the dynamic cluster table
instead.

The per-cluster memcg table is 1024 / 512 bytes on most archs, and does
not need RCU protection: the cgroup data is only read and written under
the cluster lock.  That keeps things simple, lets the allocation use plain
kmalloc with immediate kfree (no deferred free), and keeps fragmentation
acceptable.

[akpm@linux-foundation.org: memcgv1: don't compile swap functions when CONFIG_SWAP=n]
  Link: https://lore.kernel.org/202605281711.bSeZlErK-lkp@intel.com
[akpm@linux-foundation.org: fix CONFIG_SWAP=n build]
Link: https://lore.kernel.org/20260517-swap-table-p4-v5-10-88ae43e064c7@tencent.com
Signed-off-by: Kairui Song &lt;kasong@tencent.com&gt;
Acked-by: Chris Li &lt;chrisl@kernel.org&gt;
Cc: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Barry Song &lt;baohua@kernel.org&gt;
Cc: Chengming Zhou &lt;chengming.zhou@linux.dev&gt;
Cc: David Hildenbrand &lt;david@kernel.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Kemeng Shi &lt;shikemeng@huaweicloud.com&gt;
Cc: Lorenzo Stoakes &lt;ljs@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Nhat Pham &lt;nphamcs@gmail.com&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Shakeel Butt &lt;shakeel.butt@linux.dev&gt;
Cc: Youngjun Park &lt;youngjun.park@lge.com&gt;
Cc: Zi Yan &lt;ziy@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, swap: delay and unify memcg lookup and charging for swapin</title>
<updated>2026-06-02T22:22:22+00:00</updated>
<author>
<name>Kairui Song</name>
<email>kasong@tencent.com</email>
</author>
<published>2026-05-17T15:39:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bc34e87a51d9e51d398ef6d8c2c35cf1a4ff38b9'/>
<id>urn:sha1:bc34e87a51d9e51d398ef6d8c2c35cf1a4ff38b9</id>
<content type='text'>
Instead of checking the cgroup private ID during page table walk in
swap_pte_batch(), move the memcg lookup into __swap_cache_add_check()
under the cluster lock.

The first pre-alloc check is speculative and skips the memcg check since
the post-alloc stable check ensures all slots covered by the folio belong
to the same memcg.  It is very rare for contiguous and aligned entries
across a contiguous region of a page table of the same process or shmem
mapping to belong to different memcgs.

This also prepares for recording the memcg info in the cluster's table. 
Also make the order check and fallback more compact.

There should be no user-observable behavior change.

Link: https://lore.kernel.org/20260517-swap-table-p4-v5-8-88ae43e064c7@tencent.com
Signed-off-by: Kairui Song &lt;kasong@tencent.com&gt;
Acked-by: Chris Li &lt;chrisl@kernel.org&gt;
Cc: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Barry Song &lt;baohua@kernel.org&gt;
Cc: Chengming Zhou &lt;chengming.zhou@linux.dev&gt;
Cc: David Hildenbrand &lt;david@kernel.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Kemeng Shi &lt;shikemeng@huaweicloud.com&gt;
Cc: Lorenzo Stoakes &lt;ljs@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Nhat Pham &lt;nphamcs@gmail.com&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Shakeel Butt &lt;shakeel.butt@linux.dev&gt;
Cc: Youngjun Park &lt;youngjun.park@lge.com&gt;
Cc: Zi Yan &lt;ziy@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
