<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/mm/shrinker.c, branch v7.0-rc7</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0-rc7</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0-rc7'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-02-22T01:09:51+00:00</updated>
<entry>
<title>Convert 'alloc_obj' family to use the new default GFP_KERNEL argument</title>
<updated>2026-02-22T01:09:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T00:37:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43'/>
<id>urn:sha1:bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43</id>
<content type='text'>
This was done entirely with mindless brute force, using

    git grep -l '\&lt;k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>treewide: Replace kmalloc with kmalloc_obj for non-scalar types</title>
<updated>2026-02-21T09:02:28+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-21T07:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=69050f8d6d075dc01af7a5f2f550a8067510366f'/>
<id>urn:sha1:69050f8d6d075dc01af7a5f2f550a8067510366f</id>
<content type='text'>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm: shrinker: avoid memleak in alloc_shrinker_info</title>
<updated>2024-11-01T03:27:04+00:00</updated>
<author>
<name>Chen Ridong</name>
<email>chenridong@huawei.com</email>
</author>
<published>2024-10-25T06:09:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=15e8156713cc38031642fafc8baf7d53f19f2e83'/>
<id>urn:sha1:15e8156713cc38031642fafc8baf7d53f19f2e83</id>
<content type='text'>
A memleak was found as below:

unreferenced object 0xffff8881010d2a80 (size 32):
  comm "mkdir", pid 1559, jiffies 4294932666
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  @...............
  backtrace (crc 2e7ef6fa):
    [&lt;ffffffff81372754&gt;] __kmalloc_node_noprof+0x394/0x470
    [&lt;ffffffff813024ab&gt;] alloc_shrinker_info+0x7b/0x1a0
    [&lt;ffffffff813b526a&gt;] mem_cgroup_css_online+0x11a/0x3b0
    [&lt;ffffffff81198dd9&gt;] online_css+0x29/0xa0
    [&lt;ffffffff811a243d&gt;] cgroup_apply_control_enable+0x20d/0x360
    [&lt;ffffffff811a5728&gt;] cgroup_mkdir+0x168/0x5f0
    [&lt;ffffffff8148543e&gt;] kernfs_iop_mkdir+0x5e/0x90
    [&lt;ffffffff813dbb24&gt;] vfs_mkdir+0x144/0x220
    [&lt;ffffffff813e1c97&gt;] do_mkdirat+0x87/0x130
    [&lt;ffffffff813e1de9&gt;] __x64_sys_mkdir+0x49/0x70
    [&lt;ffffffff81f8c928&gt;] do_syscall_64+0x68/0x140
    [&lt;ffffffff8200012f&gt;] entry_SYSCALL_64_after_hwframe+0x76/0x7e

alloc_shrinker_info(), when shrinker_unit_alloc() returns an errer, the
info won't be freed.  Just fix it.

Link: https://lkml.kernel.org/r/20241025060942.1049263-1-chenridong@huaweicloud.com
Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}")
Signed-off-by: Chen Ridong &lt;chenridong@huawei.com&gt;
Acked-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Acked-by: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Reviewed-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Cc: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Wang Weiyang &lt;wangweiyang2@huawei.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: shrinker: use kvzalloc_node() from expand_one_shrinker_info()</title>
<updated>2024-01-05T17:58:32+00:00</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@I-love.SAKURA.ne.jp</email>
</author>
<published>2024-01-03T01:52:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7fba9420b726561966e1671004df60a08b39beb3'/>
<id>urn:sha1:7fba9420b726561966e1671004df60a08b39beb3</id>
<content type='text'>
syzbot is reporting uninit-value at shrinker_alloc(), for commit
307bececcd12 ("mm: shrinker: add a secondary array for
shrinker_info::{map, nr_deferred}") which assumed that the -&gt;unit was
allocated with __GFP_ZERO forgot to replace kvmalloc_node() in
expand_one_shrinker_info() with kvzalloc_node().

Link: https://lkml.kernel.org/r/9226cc0a-10e0-4489-80c5-58c3b5b4359c@I-love.SAKURA.ne.jp
Reported-by: syzbot &lt;syzbot+1e0ed05798af62917464@syzkaller.appspotmail.com&gt;
Closes: https://syzkaller.appspot.com/bug?extid=1e0ed05798af62917464
Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}")
Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Acked-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: shrinker: convert shrinker_rwsem to mutex</title>
<updated>2023-10-04T17:32:26+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2023-09-11T09:44:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8a0e8bb112af28362514624fc46e20426b4bdf1f'/>
<id>urn:sha1:8a0e8bb112af28362514624fc46e20426b4bdf1f</id>
<content type='text'>
Now there are no readers of shrinker_rwsem, so we can simply replace it
with mutex lock.

[akpm@linux-foundation.org: update the fix to alloc_shrinker_info()]
Link: https://lkml.kernel.org/r/20230911094444.68966-46-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Reviewed-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Abhinav Kumar &lt;quic_abhinavk@quicinc.com&gt;
Cc: Alasdair Kergon &lt;agk@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alyssa Rosenzweig &lt;alyssa.rosenzweig@collabora.com&gt;
Cc: Andreas Dilger &lt;adilger.kernel@dilger.ca&gt;
Cc: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
Cc: Anna Schumaker &lt;anna@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Bob Peterson &lt;rpeterso@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Carlos Llamas &lt;cmllamas@google.com&gt;
Cc: Chandan Babu R &lt;chandan.babu@oracle.com&gt;
Cc: Chao Yu &lt;chao@kernel.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Christian Koenig &lt;christian.koenig@amd.com&gt;
Cc: Chuck Lever &lt;cel@kernel.org&gt;
Cc: Coly Li &lt;colyli@suse.de&gt;
Cc: Dai Ngo &lt;Dai.Ngo@oracle.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: "Darrick J. Wong" &lt;djwong@kernel.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Dmitry Baryshkov &lt;dmitry.baryshkov@linaro.org&gt;
Cc: Gao Xiang &lt;hsiangkao@linux.alibaba.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Huang Rui &lt;ray.huang@amd.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Cc: Jani Nikula &lt;jani.nikula@linux.intel.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Jeffle Xu &lt;jefflexu@linux.alibaba.com&gt;
Cc: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Cc: Joonas Lahtinen &lt;joonas.lahtinen@linux.intel.com&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Kirill Tkhai &lt;tkhai@ya.ru&gt;
Cc: Marijn Suijten &lt;marijn.suijten@somainline.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Nadav Amit &lt;namit@vmware.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Oleksandr Tyshchenko &lt;oleksandr_tyshchenko@epam.com&gt;
Cc: Olga Kornievskaia &lt;kolga@netapp.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rob Clark &lt;robdclark@gmail.com&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Sean Paul &lt;sean@poorly.run&gt;
Cc: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Steven Price &lt;steven.price@arm.com&gt;
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tomeu Vizoso &lt;tomeu.vizoso@collabora.com&gt;
Cc: Tom Talpey &lt;tom@talpey.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Cc: Tvrtko Ursulin &lt;tvrtko.ursulin@linux.intel.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Cc: Yue Hu &lt;huyue2@coolpad.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: shrinker: hold write lock to reparent shrinker nr_deferred</title>
<updated>2023-10-04T17:32:26+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2023-09-11T09:44:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=604b8b655020816b66ae474681d1100c5c0ba8c4'/>
<id>urn:sha1:604b8b655020816b66ae474681d1100c5c0ba8c4</id>
<content type='text'>
For now, reparent_shrinker_deferred() is the only holder of read lock of
shrinker_rwsem. And it already holds the global cgroup_mutex, so it will
not be called in parallel.

Therefore, in order to convert shrinker_rwsem to shrinker_mutex later,
here we change to hold the write lock of shrinker_rwsem to reparent.

Link: https://lkml.kernel.org/r/20230911094444.68966-45-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Reviewed-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Abhinav Kumar &lt;quic_abhinavk@quicinc.com&gt;
Cc: Alasdair Kergon &lt;agk@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alyssa Rosenzweig &lt;alyssa.rosenzweig@collabora.com&gt;
Cc: Andreas Dilger &lt;adilger.kernel@dilger.ca&gt;
Cc: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
Cc: Anna Schumaker &lt;anna@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Bob Peterson &lt;rpeterso@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Carlos Llamas &lt;cmllamas@google.com&gt;
Cc: Chandan Babu R &lt;chandan.babu@oracle.com&gt;
Cc: Chao Yu &lt;chao@kernel.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Christian Koenig &lt;christian.koenig@amd.com&gt;
Cc: Chuck Lever &lt;cel@kernel.org&gt;
Cc: Coly Li &lt;colyli@suse.de&gt;
Cc: Dai Ngo &lt;Dai.Ngo@oracle.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: "Darrick J. Wong" &lt;djwong@kernel.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Dmitry Baryshkov &lt;dmitry.baryshkov@linaro.org&gt;
Cc: Gao Xiang &lt;hsiangkao@linux.alibaba.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Huang Rui &lt;ray.huang@amd.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Cc: Jani Nikula &lt;jani.nikula@linux.intel.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Jeffle Xu &lt;jefflexu@linux.alibaba.com&gt;
Cc: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Cc: Joonas Lahtinen &lt;joonas.lahtinen@linux.intel.com&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Kirill Tkhai &lt;tkhai@ya.ru&gt;
Cc: Marijn Suijten &lt;marijn.suijten@somainline.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Nadav Amit &lt;namit@vmware.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Oleksandr Tyshchenko &lt;oleksandr_tyshchenko@epam.com&gt;
Cc: Olga Kornievskaia &lt;kolga@netapp.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rob Clark &lt;robdclark@gmail.com&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Sean Paul &lt;sean@poorly.run&gt;
Cc: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Steven Price &lt;steven.price@arm.com&gt;
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tomeu Vizoso &lt;tomeu.vizoso@collabora.com&gt;
Cc: Tom Talpey &lt;tom@talpey.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Cc: Tvrtko Ursulin &lt;tvrtko.ursulin@linux.intel.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Cc: Yue Hu &lt;huyue2@coolpad.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: shrinker: make memcg slab shrink lockless</title>
<updated>2023-10-04T17:32:26+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2023-09-11T09:44:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=50d09da8e1199092a128eebd8a986d1fb0335fe4'/>
<id>urn:sha1:50d09da8e1199092a128eebd8a986d1fb0335fe4</id>
<content type='text'>
Like global slab shrink, this commit also uses refcount+RCU method to make
memcg slab shrink lockless.

Use the following script to do slab shrink stress test:

```

DIR="/root/shrinker/memcg/mnt"

do_create()
{
    mkdir -p /sys/fs/cgroup/memory/test
    echo 4G &gt; /sys/fs/cgroup/memory/test/memory.limit_in_bytes
    for i in `seq 0 $1`;
    do
        mkdir -p /sys/fs/cgroup/memory/test/$i;
        echo $$ &gt; /sys/fs/cgroup/memory/test/$i/cgroup.procs;
        mkdir -p $DIR/$i;
    done
}

do_mount()
{
    for i in `seq $1 $2`;
    do
        mount -t tmpfs $i $DIR/$i;
    done
}

do_touch()
{
    for i in `seq $1 $2`;
    do
        echo $$ &gt; /sys/fs/cgroup/memory/test/$i/cgroup.procs;
        dd if=/dev/zero of=$DIR/$i/file$i bs=1M count=1 &amp;
    done
}

case "$1" in
  touch)
    do_touch $2 $3
    ;;
  test)
    do_create 4000
    do_mount 0 4000
    do_touch 0 3000
    ;;
  *)
    exit 1
    ;;
esac
```

Save the above script, then run test and touch commands. Then we can use
the following perf command to view hotspots:

perf top -U -F 999

1) Before applying this patchset:

  33.15%  [kernel]          [k] down_read_trylock
  25.38%  [kernel]          [k] shrink_slab
  21.75%  [kernel]          [k] up_read
   4.45%  [kernel]          [k] _find_next_bit
   2.27%  [kernel]          [k] do_shrink_slab
   1.80%  [kernel]          [k] intel_idle_irq
   1.79%  [kernel]          [k] shrink_lruvec
   0.67%  [kernel]          [k] xas_descend
   0.41%  [kernel]          [k] mem_cgroup_iter
   0.40%  [kernel]          [k] shrink_node
   0.38%  [kernel]          [k] list_lru_count_one

2) After applying this patchset:

  64.56%  [kernel]          [k] shrink_slab
  12.18%  [kernel]          [k] do_shrink_slab
   3.30%  [kernel]          [k] __rcu_read_unlock
   2.61%  [kernel]          [k] shrink_lruvec
   2.49%  [kernel]          [k] __rcu_read_lock
   1.93%  [kernel]          [k] intel_idle_irq
   0.89%  [kernel]          [k] shrink_node
   0.81%  [kernel]          [k] mem_cgroup_iter
   0.77%  [kernel]          [k] mem_cgroup_calculate_protection
   0.66%  [kernel]          [k] list_lru_count_one

We can see that the first perf hotspot becomes shrink_slab, which is what
we expect.

Link: https://lkml.kernel.org/r/20230911094444.68966-44-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Cc: Abhinav Kumar &lt;quic_abhinavk@quicinc.com&gt;
Cc: Alasdair Kergon &lt;agk@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alyssa Rosenzweig &lt;alyssa.rosenzweig@collabora.com&gt;
Cc: Andreas Dilger &lt;adilger.kernel@dilger.ca&gt;
Cc: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
Cc: Anna Schumaker &lt;anna@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Bob Peterson &lt;rpeterso@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Carlos Llamas &lt;cmllamas@google.com&gt;
Cc: Chandan Babu R &lt;chandan.babu@oracle.com&gt;
Cc: Chao Yu &lt;chao@kernel.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Christian Koenig &lt;christian.koenig@amd.com&gt;
Cc: Chuck Lever &lt;cel@kernel.org&gt;
Cc: Coly Li &lt;colyli@suse.de&gt;
Cc: Dai Ngo &lt;Dai.Ngo@oracle.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: "Darrick J. Wong" &lt;djwong@kernel.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Dmitry Baryshkov &lt;dmitry.baryshkov@linaro.org&gt;
Cc: Gao Xiang &lt;hsiangkao@linux.alibaba.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Huang Rui &lt;ray.huang@amd.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Cc: Jani Nikula &lt;jani.nikula@linux.intel.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Jeffle Xu &lt;jefflexu@linux.alibaba.com&gt;
Cc: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Cc: Joonas Lahtinen &lt;joonas.lahtinen@linux.intel.com&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Kirill Tkhai &lt;tkhai@ya.ru&gt;
Cc: Marijn Suijten &lt;marijn.suijten@somainline.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Nadav Amit &lt;namit@vmware.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Oleksandr Tyshchenko &lt;oleksandr_tyshchenko@epam.com&gt;
Cc: Olga Kornievskaia &lt;kolga@netapp.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rob Clark &lt;robdclark@gmail.com&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Sean Paul &lt;sean@poorly.run&gt;
Cc: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Steven Price &lt;steven.price@arm.com&gt;
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tomeu Vizoso &lt;tomeu.vizoso@collabora.com&gt;
Cc: Tom Talpey &lt;tom@talpey.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Cc: Tvrtko Ursulin &lt;tvrtko.ursulin@linux.intel.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Cc: Yue Hu &lt;huyue2@coolpad.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: shrinker: make global slab shrink lockless</title>
<updated>2023-10-04T17:32:26+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2023-09-11T09:44:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ca1d36b823944f24b5755311e95883fb5fdb807b'/>
<id>urn:sha1:ca1d36b823944f24b5755311e95883fb5fdb807b</id>
<content type='text'>
The shrinker_rwsem is a global read-write lock in shrinkers subsystem,
which protects most operations such as slab shrink, registration and
unregistration of shrinkers, etc. This can easily cause problems in the
following cases.

1) When the memory pressure is high and there are many filesystems
   mounted or unmounted at the same time, slab shrink will be affected
   (down_read_trylock() failed).

   Such as the real workload mentioned by Kirill Tkhai:

   ```
   One of the real workloads from my experience is start
   of an overcommitted node containing many starting
   containers after node crash (or many resuming containers
   after reboot for kernel update). In these cases memory
   pressure is huge, and the node goes round in long reclaim.
   ```

2) If a shrinker is blocked (such as the case mentioned
   in [1]) and a writer comes in (such as mount a fs),
   then this writer will be blocked and cause all
   subsequent shrinker-related operations to be blocked.

Even if there is no competitor when shrinking slab, there may still be a
problem. The down_read_trylock() may become a perf hotspot with frequent
calls to shrink_slab(). Because of the poor multicore scalability of
atomic operations, this can lead to a significant drop in IPC
(instructions per cycle).

We used to implement the lockless slab shrink with SRCU [2], but then
kernel test robot reported -88.8% regression in
stress-ng.ramfs.ops_per_sec test case [3], so we reverted it [4].

This commit uses the refcount+RCU method [5] proposed by Dave Chinner
to re-implement the lockless global slab shrink. The memcg slab shrink is
handled in the subsequent patch.

For now, all shrinker instances are converted to dynamically allocated and
will be freed by call_rcu(). So we can use rcu_read_{lock,unlock}() to
ensure that the shrinker instance is valid.

And the shrinker instance will not be run again after unregistration. So
the structure that records the pointer of shrinker instance can be safely
freed without waiting for the RCU read-side critical section.

In this way, while we implement the lockless slab shrink, we don't need to
be blocked in unregister_shrinker().

The following are the test results:

stress-ng --timeout 60 --times --verify --metrics-brief --ramfs 9 &amp;

1) Before applying this patchset:

setting to a 60 second run per stressor
dispatching hogs: 9 ramfs
stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
                          (secs)    (secs)    (secs)   (real time) (usr+sys time)
ramfs            473062     60.00      8.00    279.13      7884.12        1647.59
for a 60.01s run time:
   1440.34s available CPU time
      7.99s user time   (  0.55%)
    279.13s system time ( 19.38%)
    287.12s total time  ( 19.93%)
load average: 7.12 2.99 1.15
successful run completed in 60.01s (1 min, 0.01 secs)

2) After applying this patchset:

setting to a 60 second run per stressor
dispatching hogs: 9 ramfs
stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
                          (secs)    (secs)    (secs)   (real time) (usr+sys time)
ramfs            477165     60.00      8.13    281.34      7952.55        1648.40
for a 60.01s run time:
   1440.33s available CPU time
      8.12s user time   (  0.56%)
    281.34s system time ( 19.53%)
    289.46s total time  ( 20.10%)
load average: 6.98 3.03 1.19
successful run completed in 60.01s (1 min, 0.01 secs)

We can see that the ops/s has hardly changed.

[1]. https://lore.kernel.org/lkml/20191129214541.3110-1-ptikhomirov@virtuozzo.com/
[2]. https://lore.kernel.org/lkml/20230313112819.38938-1-zhengqi.arch@bytedance.com/
[3]. https://lore.kernel.org/lkml/202305230837.db2c233f-yujie.liu@intel.com/
[4]. https://lore.kernel.org/all/20230609081518.3039120-1-qi.zheng@linux.dev/
[5]. https://lore.kernel.org/lkml/ZIJhou1d55d4H1s0@dread.disaster.area/

Link: https://lkml.kernel.org/r/20230911094444.68966-43-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Cc: Abhinav Kumar &lt;quic_abhinavk@quicinc.com&gt;
Cc: Alasdair Kergon &lt;agk@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alyssa Rosenzweig &lt;alyssa.rosenzweig@collabora.com&gt;
Cc: Andreas Dilger &lt;adilger.kernel@dilger.ca&gt;
Cc: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
Cc: Anna Schumaker &lt;anna@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Bob Peterson &lt;rpeterso@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Carlos Llamas &lt;cmllamas@google.com&gt;
Cc: Chandan Babu R &lt;chandan.babu@oracle.com&gt;
Cc: Chao Yu &lt;chao@kernel.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Christian Koenig &lt;christian.koenig@amd.com&gt;
Cc: Chuck Lever &lt;cel@kernel.org&gt;
Cc: Coly Li &lt;colyli@suse.de&gt;
Cc: Dai Ngo &lt;Dai.Ngo@oracle.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: "Darrick J. Wong" &lt;djwong@kernel.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Dmitry Baryshkov &lt;dmitry.baryshkov@linaro.org&gt;
Cc: Gao Xiang &lt;hsiangkao@linux.alibaba.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Huang Rui &lt;ray.huang@amd.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Cc: Jani Nikula &lt;jani.nikula@linux.intel.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Jeffle Xu &lt;jefflexu@linux.alibaba.com&gt;
Cc: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Cc: Joonas Lahtinen &lt;joonas.lahtinen@linux.intel.com&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Kirill Tkhai &lt;tkhai@ya.ru&gt;
Cc: Marijn Suijten &lt;marijn.suijten@somainline.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Nadav Amit &lt;namit@vmware.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Oleksandr Tyshchenko &lt;oleksandr_tyshchenko@epam.com&gt;
Cc: Olga Kornievskaia &lt;kolga@netapp.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rob Clark &lt;robdclark@gmail.com&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Sean Paul &lt;sean@poorly.run&gt;
Cc: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Steven Price &lt;steven.price@arm.com&gt;
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tomeu Vizoso &lt;tomeu.vizoso@collabora.com&gt;
Cc: Tom Talpey &lt;tom@talpey.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Cc: Tvrtko Ursulin &lt;tvrtko.ursulin@linux.intel.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Cc: Yue Hu &lt;huyue2@coolpad.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: shrinker: rename {prealloc|unregister}_memcg_shrinker() to shrinker_memcg_{alloc|remove}()</title>
<updated>2023-10-04T17:32:26+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2023-09-11T09:44:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=48a7a0996a0089f4161ad437a810a10596e8d278'/>
<id>urn:sha1:48a7a0996a0089f4161ad437a810a10596e8d278</id>
<content type='text'>
With the new shrinker APIs, there is no action such as prealloc, so rename
{prealloc|unregister}_memcg_shrinker() to shrinker_memcg_{alloc|remove}(),
which corresponds to the idr_{alloc|remove}() inside the function.

Link: https://lkml.kernel.org/r/20230911094444.68966-42-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Reviewed-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Abhinav Kumar &lt;quic_abhinavk@quicinc.com&gt;
Cc: Alasdair Kergon &lt;agk@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alyssa Rosenzweig &lt;alyssa.rosenzweig@collabora.com&gt;
Cc: Andreas Dilger &lt;adilger.kernel@dilger.ca&gt;
Cc: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
Cc: Anna Schumaker &lt;anna@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Bob Peterson &lt;rpeterso@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Carlos Llamas &lt;cmllamas@google.com&gt;
Cc: Chandan Babu R &lt;chandan.babu@oracle.com&gt;
Cc: Chao Yu &lt;chao@kernel.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Christian Koenig &lt;christian.koenig@amd.com&gt;
Cc: Chuck Lever &lt;cel@kernel.org&gt;
Cc: Coly Li &lt;colyli@suse.de&gt;
Cc: Dai Ngo &lt;Dai.Ngo@oracle.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: "Darrick J. Wong" &lt;djwong@kernel.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Dmitry Baryshkov &lt;dmitry.baryshkov@linaro.org&gt;
Cc: Gao Xiang &lt;hsiangkao@linux.alibaba.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Huang Rui &lt;ray.huang@amd.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Cc: Jani Nikula &lt;jani.nikula@linux.intel.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Jeffle Xu &lt;jefflexu@linux.alibaba.com&gt;
Cc: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Cc: Joonas Lahtinen &lt;joonas.lahtinen@linux.intel.com&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Kirill Tkhai &lt;tkhai@ya.ru&gt;
Cc: Marijn Suijten &lt;marijn.suijten@somainline.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Nadav Amit &lt;namit@vmware.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Oleksandr Tyshchenko &lt;oleksandr_tyshchenko@epam.com&gt;
Cc: Olga Kornievskaia &lt;kolga@netapp.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rob Clark &lt;robdclark@gmail.com&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Sean Paul &lt;sean@poorly.run&gt;
Cc: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Steven Price &lt;steven.price@arm.com&gt;
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tomeu Vizoso &lt;tomeu.vizoso@collabora.com&gt;
Cc: Tom Talpey &lt;tom@talpey.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Cc: Tvrtko Ursulin &lt;tvrtko.ursulin@linux.intel.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Cc: Yue Hu &lt;huyue2@coolpad.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}</title>
<updated>2023-10-04T17:32:26+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2023-09-11T09:44:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=307bececcd1205bcb67a3c0d53a69db237ccc9d4'/>
<id>urn:sha1:307bececcd1205bcb67a3c0d53a69db237ccc9d4</id>
<content type='text'>
Currently, we maintain two linear arrays per node per memcg, which are
shrinker_info::map and shrinker_info::nr_deferred. And we need to resize
them when the shrinker_nr_max is exceeded, that is, allocate a new array,
and then copy the old array to the new array, and finally free the old
array by RCU.

For shrinker_info::map, we do set_bit() under the RCU lock, so we may set
the value into the old map which is about to be freed. This may cause the
value set to be lost. The current solution is not to copy the old map when
resizing, but to set all the corresponding bits in the new map to 1. This
solves the data loss problem, but bring the overhead of more pointless
loops while doing memcg slab shrink.

For shrinker_info::nr_deferred, we will only modify it under the read lock
of shrinker_rwsem, so it will not run concurrently with the resizing. But
after we make memcg slab shrink lockless, there will be the same data loss
problem as shrinker_info::map, and we can't work around it like the map.

For such resizable arrays, the most straightforward idea is to change it
to xarray, like we did for list_lru [1]. We need to do xa_store() in the
list_lru_add()--&gt;set_shrinker_bit(), but this will cause memory
allocation, and the list_lru_add() doesn't accept failure. A possible
solution is to pre-allocate, but the location of pre-allocation is not
well determined (such as deferred_split_shrinker case).

Therefore, this commit chooses to introduce the following secondary array
for shrinker_info::{map, nr_deferred}:

+---------------+--------+--------+-----+
| shrinker_info | unit 0 | unit 1 | ... | (secondary array)
+---------------+--------+--------+-----+
                     |
                     v
                +---------------+-----+
                | nr_deferred[] | map | (leaf array)
                +---------------+-----+
                (shrinker_info_unit)

The leaf array is never freed unless the memcg is destroyed. The secondary
array will be resized every time the shrinker id exceeds shrinker_nr_max.
So the shrinker_info_unit can be indexed from both the old and the new
shrinker_info-&gt;unit[x]. Then even if we get the old secondary array under
the RCU lock, the found map and nr_deferred are also true, so the updated
nr_deferred and map will not be lost.

[1]. https://lore.kernel.org/all/20220228122126.37293-13-songmuchun@bytedance.com/

[zhengqi.arch@bytedance.com: unlock the &amp;shrinker_rwsem before the call to free_shrinker_info()]
  Link: https://lkml.kernel.org/r/20230928141517.12164-1-zhengqi.arch@bytedance.com
Link: https://lkml.kernel.org/r/20230911094444.68966-41-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Reviewed-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Abhinav Kumar &lt;quic_abhinavk@quicinc.com&gt;
Cc: Alasdair Kergon &lt;agk@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alyssa Rosenzweig &lt;alyssa.rosenzweig@collabora.com&gt;
Cc: Andreas Dilger &lt;adilger.kernel@dilger.ca&gt;
Cc: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
Cc: Anna Schumaker &lt;anna@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Bob Peterson &lt;rpeterso@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Carlos Llamas &lt;cmllamas@google.com&gt;
Cc: Chandan Babu R &lt;chandan.babu@oracle.com&gt;
Cc: Chao Yu &lt;chao@kernel.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Christian Koenig &lt;christian.koenig@amd.com&gt;
Cc: Chuck Lever &lt;cel@kernel.org&gt;
Cc: Coly Li &lt;colyli@suse.de&gt;
Cc: Dai Ngo &lt;Dai.Ngo@oracle.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: "Darrick J. Wong" &lt;djwong@kernel.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Dmitry Baryshkov &lt;dmitry.baryshkov@linaro.org&gt;
Cc: Gao Xiang &lt;hsiangkao@linux.alibaba.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Huang Rui &lt;ray.huang@amd.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Cc: Jani Nikula &lt;jani.nikula@linux.intel.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Jeffle Xu &lt;jefflexu@linux.alibaba.com&gt;
Cc: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Cc: Joonas Lahtinen &lt;joonas.lahtinen@linux.intel.com&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Kirill Tkhai &lt;tkhai@ya.ru&gt;
Cc: Marijn Suijten &lt;marijn.suijten@somainline.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Nadav Amit &lt;namit@vmware.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Oleksandr Tyshchenko &lt;oleksandr_tyshchenko@epam.com&gt;
Cc: Olga Kornievskaia &lt;kolga@netapp.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rob Clark &lt;robdclark@gmail.com&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Sean Paul &lt;sean@poorly.run&gt;
Cc: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Steven Price &lt;steven.price@arm.com&gt;
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tomeu Vizoso &lt;tomeu.vizoso@collabora.com&gt;
Cc: Tom Talpey &lt;tom@talpey.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Cc: Tvrtko Ursulin &lt;tvrtko.ursulin@linux.intel.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Cc: Yue Hu &lt;huyue2@coolpad.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
