<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/shrinker.h, branch v6.19.11</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2023-10-18T21:34:18+00:00</updated>
<entry>
<title>mm: add printf attribute to shrinker_debugfs_name_alloc</title>
<updated>2023-10-18T21:34:18+00:00</updated>
<author>
<name>Lucy Mielke</name>
<email>lucymielke@icloud.com</email>
</author>
<published>2023-10-06T20:30:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f04eba134e598d4a5af2eeecff0562df5042cbfc'/>
<id>urn:sha1:f04eba134e598d4a5af2eeecff0562df5042cbfc</id>
<content type='text'>
This fixes a compiler warning when compiling an allyesconfig with W=1:

mm/internal.h:1235:9: error: function might be a candidate for `gnu_printf'
format attribute [-Werror=suggest-attribute=format]

[akpm@linux-foundation.org: fix shrinker_alloc() as welll per Qi Zheng]
  Link: https://lkml.kernel.org/r/822387b7-4895-4e64-5806-0f56b5d6c447@bytedance.com
Link: https://lkml.kernel.org/r/ZSBue-3kM6gI6jCr@mainframe
Fixes: c42d50aefd17 ("mm: shrinker: add infrastructure for dynamically allocating shrinker")
Signed-off-by: Lucy Mielke &lt;lucymielke@icloud.com&gt;
Cc: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: shrinker: make global slab shrink lockless</title>
<updated>2023-10-04T17:32:26+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2023-09-11T09:44:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ca1d36b823944f24b5755311e95883fb5fdb807b'/>
<id>urn:sha1:ca1d36b823944f24b5755311e95883fb5fdb807b</id>
<content type='text'>
The shrinker_rwsem is a global read-write lock in shrinkers subsystem,
which protects most operations such as slab shrink, registration and
unregistration of shrinkers, etc. This can easily cause problems in the
following cases.

1) When the memory pressure is high and there are many filesystems
   mounted or unmounted at the same time, slab shrink will be affected
   (down_read_trylock() failed).

   Such as the real workload mentioned by Kirill Tkhai:

   ```
   One of the real workloads from my experience is start
   of an overcommitted node containing many starting
   containers after node crash (or many resuming containers
   after reboot for kernel update). In these cases memory
   pressure is huge, and the node goes round in long reclaim.
   ```

2) If a shrinker is blocked (such as the case mentioned
   in [1]) and a writer comes in (such as mount a fs),
   then this writer will be blocked and cause all
   subsequent shrinker-related operations to be blocked.

Even if there is no competitor when shrinking slab, there may still be a
problem. The down_read_trylock() may become a perf hotspot with frequent
calls to shrink_slab(). Because of the poor multicore scalability of
atomic operations, this can lead to a significant drop in IPC
(instructions per cycle).

We used to implement the lockless slab shrink with SRCU [2], but then
kernel test robot reported -88.8% regression in
stress-ng.ramfs.ops_per_sec test case [3], so we reverted it [4].

This commit uses the refcount+RCU method [5] proposed by Dave Chinner
to re-implement the lockless global slab shrink. The memcg slab shrink is
handled in the subsequent patch.

For now, all shrinker instances are converted to dynamically allocated and
will be freed by call_rcu(). So we can use rcu_read_{lock,unlock}() to
ensure that the shrinker instance is valid.

And the shrinker instance will not be run again after unregistration. So
the structure that records the pointer of shrinker instance can be safely
freed without waiting for the RCU read-side critical section.

In this way, while we implement the lockless slab shrink, we don't need to
be blocked in unregister_shrinker().

The following are the test results:

stress-ng --timeout 60 --times --verify --metrics-brief --ramfs 9 &amp;

1) Before applying this patchset:

setting to a 60 second run per stressor
dispatching hogs: 9 ramfs
stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
                          (secs)    (secs)    (secs)   (real time) (usr+sys time)
ramfs            473062     60.00      8.00    279.13      7884.12        1647.59
for a 60.01s run time:
   1440.34s available CPU time
      7.99s user time   (  0.55%)
    279.13s system time ( 19.38%)
    287.12s total time  ( 19.93%)
load average: 7.12 2.99 1.15
successful run completed in 60.01s (1 min, 0.01 secs)

2) After applying this patchset:

setting to a 60 second run per stressor
dispatching hogs: 9 ramfs
stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
                          (secs)    (secs)    (secs)   (real time) (usr+sys time)
ramfs            477165     60.00      8.13    281.34      7952.55        1648.40
for a 60.01s run time:
   1440.33s available CPU time
      8.12s user time   (  0.56%)
    281.34s system time ( 19.53%)
    289.46s total time  ( 20.10%)
load average: 6.98 3.03 1.19
successful run completed in 60.01s (1 min, 0.01 secs)

We can see that the ops/s has hardly changed.

[1]. https://lore.kernel.org/lkml/20191129214541.3110-1-ptikhomirov@virtuozzo.com/
[2]. https://lore.kernel.org/lkml/20230313112819.38938-1-zhengqi.arch@bytedance.com/
[3]. https://lore.kernel.org/lkml/202305230837.db2c233f-yujie.liu@intel.com/
[4]. https://lore.kernel.org/all/20230609081518.3039120-1-qi.zheng@linux.dev/
[5]. https://lore.kernel.org/lkml/ZIJhou1d55d4H1s0@dread.disaster.area/

Link: https://lkml.kernel.org/r/20230911094444.68966-43-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Cc: Abhinav Kumar &lt;quic_abhinavk@quicinc.com&gt;
Cc: Alasdair Kergon &lt;agk@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alyssa Rosenzweig &lt;alyssa.rosenzweig@collabora.com&gt;
Cc: Andreas Dilger &lt;adilger.kernel@dilger.ca&gt;
Cc: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
Cc: Anna Schumaker &lt;anna@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Bob Peterson &lt;rpeterso@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Carlos Llamas &lt;cmllamas@google.com&gt;
Cc: Chandan Babu R &lt;chandan.babu@oracle.com&gt;
Cc: Chao Yu &lt;chao@kernel.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Christian Koenig &lt;christian.koenig@amd.com&gt;
Cc: Chuck Lever &lt;cel@kernel.org&gt;
Cc: Coly Li &lt;colyli@suse.de&gt;
Cc: Dai Ngo &lt;Dai.Ngo@oracle.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: "Darrick J. Wong" &lt;djwong@kernel.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Dmitry Baryshkov &lt;dmitry.baryshkov@linaro.org&gt;
Cc: Gao Xiang &lt;hsiangkao@linux.alibaba.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Huang Rui &lt;ray.huang@amd.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Cc: Jani Nikula &lt;jani.nikula@linux.intel.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Jeffle Xu &lt;jefflexu@linux.alibaba.com&gt;
Cc: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Cc: Joonas Lahtinen &lt;joonas.lahtinen@linux.intel.com&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Kirill Tkhai &lt;tkhai@ya.ru&gt;
Cc: Marijn Suijten &lt;marijn.suijten@somainline.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Nadav Amit &lt;namit@vmware.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Oleksandr Tyshchenko &lt;oleksandr_tyshchenko@epam.com&gt;
Cc: Olga Kornievskaia &lt;kolga@netapp.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rob Clark &lt;robdclark@gmail.com&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Sean Paul &lt;sean@poorly.run&gt;
Cc: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Steven Price &lt;steven.price@arm.com&gt;
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tomeu Vizoso &lt;tomeu.vizoso@collabora.com&gt;
Cc: Tom Talpey &lt;tom@talpey.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Cc: Tvrtko Ursulin &lt;tvrtko.ursulin@linux.intel.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Cc: Yue Hu &lt;huyue2@coolpad.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}</title>
<updated>2023-10-04T17:32:26+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2023-09-11T09:44:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=307bececcd1205bcb67a3c0d53a69db237ccc9d4'/>
<id>urn:sha1:307bececcd1205bcb67a3c0d53a69db237ccc9d4</id>
<content type='text'>
Currently, we maintain two linear arrays per node per memcg, which are
shrinker_info::map and shrinker_info::nr_deferred. And we need to resize
them when the shrinker_nr_max is exceeded, that is, allocate a new array,
and then copy the old array to the new array, and finally free the old
array by RCU.

For shrinker_info::map, we do set_bit() under the RCU lock, so we may set
the value into the old map which is about to be freed. This may cause the
value set to be lost. The current solution is not to copy the old map when
resizing, but to set all the corresponding bits in the new map to 1. This
solves the data loss problem, but bring the overhead of more pointless
loops while doing memcg slab shrink.

For shrinker_info::nr_deferred, we will only modify it under the read lock
of shrinker_rwsem, so it will not run concurrently with the resizing. But
after we make memcg slab shrink lockless, there will be the same data loss
problem as shrinker_info::map, and we can't work around it like the map.

For such resizable arrays, the most straightforward idea is to change it
to xarray, like we did for list_lru [1]. We need to do xa_store() in the
list_lru_add()--&gt;set_shrinker_bit(), but this will cause memory
allocation, and the list_lru_add() doesn't accept failure. A possible
solution is to pre-allocate, but the location of pre-allocation is not
well determined (such as deferred_split_shrinker case).

Therefore, this commit chooses to introduce the following secondary array
for shrinker_info::{map, nr_deferred}:

+---------------+--------+--------+-----+
| shrinker_info | unit 0 | unit 1 | ... | (secondary array)
+---------------+--------+--------+-----+
                     |
                     v
                +---------------+-----+
                | nr_deferred[] | map | (leaf array)
                +---------------+-----+
                (shrinker_info_unit)

The leaf array is never freed unless the memcg is destroyed. The secondary
array will be resized every time the shrinker id exceeds shrinker_nr_max.
So the shrinker_info_unit can be indexed from both the old and the new
shrinker_info-&gt;unit[x]. Then even if we get the old secondary array under
the RCU lock, the found map and nr_deferred are also true, so the updated
nr_deferred and map will not be lost.

[1]. https://lore.kernel.org/all/20220228122126.37293-13-songmuchun@bytedance.com/

[zhengqi.arch@bytedance.com: unlock the &amp;shrinker_rwsem before the call to free_shrinker_info()]
  Link: https://lkml.kernel.org/r/20230928141517.12164-1-zhengqi.arch@bytedance.com
Link: https://lkml.kernel.org/r/20230911094444.68966-41-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Reviewed-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Abhinav Kumar &lt;quic_abhinavk@quicinc.com&gt;
Cc: Alasdair Kergon &lt;agk@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alyssa Rosenzweig &lt;alyssa.rosenzweig@collabora.com&gt;
Cc: Andreas Dilger &lt;adilger.kernel@dilger.ca&gt;
Cc: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
Cc: Anna Schumaker &lt;anna@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Bob Peterson &lt;rpeterso@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Carlos Llamas &lt;cmllamas@google.com&gt;
Cc: Chandan Babu R &lt;chandan.babu@oracle.com&gt;
Cc: Chao Yu &lt;chao@kernel.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Christian Koenig &lt;christian.koenig@amd.com&gt;
Cc: Chuck Lever &lt;cel@kernel.org&gt;
Cc: Coly Li &lt;colyli@suse.de&gt;
Cc: Dai Ngo &lt;Dai.Ngo@oracle.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: "Darrick J. Wong" &lt;djwong@kernel.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Dmitry Baryshkov &lt;dmitry.baryshkov@linaro.org&gt;
Cc: Gao Xiang &lt;hsiangkao@linux.alibaba.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Huang Rui &lt;ray.huang@amd.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Cc: Jani Nikula &lt;jani.nikula@linux.intel.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Jeffle Xu &lt;jefflexu@linux.alibaba.com&gt;
Cc: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Cc: Joonas Lahtinen &lt;joonas.lahtinen@linux.intel.com&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Kirill Tkhai &lt;tkhai@ya.ru&gt;
Cc: Marijn Suijten &lt;marijn.suijten@somainline.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Nadav Amit &lt;namit@vmware.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Oleksandr Tyshchenko &lt;oleksandr_tyshchenko@epam.com&gt;
Cc: Olga Kornievskaia &lt;kolga@netapp.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rob Clark &lt;robdclark@gmail.com&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Sean Paul &lt;sean@poorly.run&gt;
Cc: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Steven Price &lt;steven.price@arm.com&gt;
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tomeu Vizoso &lt;tomeu.vizoso@collabora.com&gt;
Cc: Tom Talpey &lt;tom@talpey.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Cc: Tvrtko Ursulin &lt;tvrtko.ursulin@linux.intel.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Cc: Yue Hu &lt;huyue2@coolpad.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: shrinker: remove old APIs</title>
<updated>2023-10-04T17:32:26+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2023-09-11T09:44:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f2383e01507eeee8a1c1283d61a117a97d6c4ebe'/>
<id>urn:sha1:f2383e01507eeee8a1c1283d61a117a97d6c4ebe</id>
<content type='text'>
Now no users are using the old APIs, just remove them.

Link: https://lkml.kernel.org/r/20230911094444.68966-40-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Reviewed-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Abhinav Kumar &lt;quic_abhinavk@quicinc.com&gt;
Cc: Alasdair Kergon &lt;agk@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alyssa Rosenzweig &lt;alyssa.rosenzweig@collabora.com&gt;
Cc: Andreas Dilger &lt;adilger.kernel@dilger.ca&gt;
Cc: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
Cc: Anna Schumaker &lt;anna@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Bob Peterson &lt;rpeterso@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Carlos Llamas &lt;cmllamas@google.com&gt;
Cc: Chandan Babu R &lt;chandan.babu@oracle.com&gt;
Cc: Chao Yu &lt;chao@kernel.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Christian Koenig &lt;christian.koenig@amd.com&gt;
Cc: Chuck Lever &lt;cel@kernel.org&gt;
Cc: Coly Li &lt;colyli@suse.de&gt;
Cc: Dai Ngo &lt;Dai.Ngo@oracle.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: "Darrick J. Wong" &lt;djwong@kernel.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Dmitry Baryshkov &lt;dmitry.baryshkov@linaro.org&gt;
Cc: Gao Xiang &lt;hsiangkao@linux.alibaba.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Huang Rui &lt;ray.huang@amd.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Cc: Jani Nikula &lt;jani.nikula@linux.intel.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Jeffle Xu &lt;jefflexu@linux.alibaba.com&gt;
Cc: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Cc: Joonas Lahtinen &lt;joonas.lahtinen@linux.intel.com&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Kirill Tkhai &lt;tkhai@ya.ru&gt;
Cc: Marijn Suijten &lt;marijn.suijten@somainline.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Nadav Amit &lt;namit@vmware.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Oleksandr Tyshchenko &lt;oleksandr_tyshchenko@epam.com&gt;
Cc: Olga Kornievskaia &lt;kolga@netapp.com&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rob Clark &lt;robdclark@gmail.com&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Sean Paul &lt;sean@poorly.run&gt;
Cc: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Steven Price &lt;steven.price@arm.com&gt;
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tomeu Vizoso &lt;tomeu.vizoso@collabora.com&gt;
Cc: Tom Talpey &lt;tom@talpey.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Cc: Tvrtko Ursulin &lt;tvrtko.ursulin@linux.intel.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Cc: Yue Hu &lt;huyue2@coolpad.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: shrinker: add infrastructure for dynamically allocating shrinker</title>
<updated>2023-10-04T17:32:23+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2023-09-11T09:44:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c42d50aefd17a6bad3ed617769edbbb579137545'/>
<id>urn:sha1:c42d50aefd17a6bad3ed617769edbbb579137545</id>
<content type='text'>
Patch series "use refcount+RCU method to implement lockless slab shrink",
v6.

1. Background
=============

We used to implement the lockless slab shrink with SRCU [1], but then kernel
test robot reported -88.8% regression in stress-ng.ramfs.ops_per_sec test
case [2], so we reverted it [3].

This patch series aims to re-implement the lockless slab shrink using the
refcount+RCU method proposed by Dave Chinner [4].

[1]. https://lore.kernel.org/lkml/20230313112819.38938-1-zhengqi.arch@bytedance.com/
[2]. https://lore.kernel.org/lkml/202305230837.db2c233f-yujie.liu@intel.com/
[3]. https://lore.kernel.org/all/20230609081518.3039120-1-qi.zheng@linux.dev/
[4]. https://lore.kernel.org/lkml/ZIJhou1d55d4H1s0@dread.disaster.area/

2. Implementation
=================

Currently, the shrinker instances can be divided into the following three types:

a) global shrinker instance statically defined in the kernel, such as
   workingset_shadow_shrinker.

b) global shrinker instance statically defined in the kernel modules, such as
   mmu_shrinker in x86.

c) shrinker instance embedded in other structures.

For case a, the memory of shrinker instance is never freed. For case b, the
memory of shrinker instance will be freed after synchronize_rcu() when the
module is unloaded. For case c, the memory of shrinker instance will be freed
along with the structure it is embedded in.

In preparation for implementing lockless slab shrink, we need to dynamically
allocate those shrinker instances in case c, then the memory can be dynamically
freed alone by calling kfree_rcu().

This patchset adds the following new APIs for dynamically allocating shrinker,
and add a private_data field to struct shrinker to record and get the original
embedded structure.

1. shrinker_alloc()
2. shrinker_register()
3. shrinker_free()

In order to simplify shrinker-related APIs and make shrinker more independent of
other kernel mechanisms, this patchset uses the above APIs to convert all
shrinkers (including case a and b) to dynamically allocated, and then remove all
existing APIs. This will also have another advantage mentioned by Dave Chinner:

```
The other advantage of this is that it will break all the existing out of tree
code and third party modules using the old API and will no longer work with a
kernel using lockless slab shrinkers. They need to break (both at the source and
binary levels) to stop bad things from happening due to using uncoverted
shrinkers in the new setup.
```

Then we free the shrinker by calling call_rcu(), and use rcu_read_{lock,unlock}()
to ensure that the shrinker instance is valid. And the shrinker::refcount
mechanism ensures that the shrinker instance will not be run again after
unregistration. So the structure that records the pointer of shrinker instance
can be safely freed without waiting for the RCU read-side critical section.

In this way, while we implement the lockless slab shrink, we don't need to be
blocked in unregister_shrinker() to wait RCU read-side critical section.

PATCH 1: introduce new APIs
PATCH 2~38: convert all shrinnkers to use new APIs
PATCH 39: remove old APIs
PATCH 40~41: some cleanups and preparations
PATCH 42-43: implement the lockless slab shrink
PATCH 44~45: convert shrinker_rwsem to mutex

3. Testing
==========

3.1 slab shrink stress test
---------------------------

We can reproduce the down_read_trylock() hotspot through the following script:

```

DIR="/root/shrinker/memcg/mnt"

do_create()
{
    mkdir -p /sys/fs/cgroup/memory/test
    echo 4G &gt; /sys/fs/cgroup/memory/test/memory.limit_in_bytes
    for i in `seq 0 $1`;
    do
        mkdir -p /sys/fs/cgroup/memory/test/$i;
        echo $$ &gt; /sys/fs/cgroup/memory/test/$i/cgroup.procs;
        mkdir -p $DIR/$i;
    done
}

do_mount()
{
    for i in `seq $1 $2`;
    do
        mount -t tmpfs $i $DIR/$i;
    done
}

do_touch()
{
    for i in `seq $1 $2`;
    do
        echo $$ &gt; /sys/fs/cgroup/memory/test/$i/cgroup.procs;
        dd if=/dev/zero of=$DIR/$i/file$i bs=1M count=1 &amp;
    done
}

case "$1" in
  touch)
    do_touch $2 $3
    ;;
  test)
    do_create 4000
    do_mount 0 4000
    do_touch 0 3000
    ;;
  *)
    exit 1
    ;;
esac
```

Save the above script, then run test and touch commands. Then we can use the
following perf command to view hotspots:

perf top -U -F 999

1) Before applying this patchset:

  33.15%  [kernel]          [k] down_read_trylock
  25.38%  [kernel]          [k] shrink_slab
  21.75%  [kernel]          [k] up_read
   4.45%  [kernel]          [k] _find_next_bit
   2.27%  [kernel]          [k] do_shrink_slab
   1.80%  [kernel]          [k] intel_idle_irq
   1.79%  [kernel]          [k] shrink_lruvec
   0.67%  [kernel]          [k] xas_descend
   0.41%  [kernel]          [k] mem_cgroup_iter
   0.40%  [kernel]          [k] shrink_node
   0.38%  [kernel]          [k] list_lru_count_one

2) After applying this patchset:

  64.56%  [kernel]          [k] shrink_slab
  12.18%  [kernel]          [k] do_shrink_slab
   3.30%  [kernel]          [k] __rcu_read_unlock
   2.61%  [kernel]          [k] shrink_lruvec
   2.49%  [kernel]          [k] __rcu_read_lock
   1.93%  [kernel]          [k] intel_idle_irq
   0.89%  [kernel]          [k] shrink_node
   0.81%  [kernel]          [k] mem_cgroup_iter
   0.77%  [kernel]          [k] mem_cgroup_calculate_protection
   0.66%  [kernel]          [k] list_lru_count_one

We can see that the first perf hotspot becomes shrink_slab, which is what we
expect.

3.2 registration and unregistration stress test
-----------------------------------------------

Run the command below to test:

stress-ng --timeout 60 --times --verify --metrics-brief --ramfs 9 &amp;

1) Before applying this patchset:

setting to a 60 second run per stressor
dispatching hogs: 9 ramfs
stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
                          (secs)    (secs)    (secs)   (real time) (usr+sys time)
ramfs            473062     60.00      8.00    279.13      7884.12        1647.59
for a 60.01s run time:
   1440.34s available CPU time
      7.99s user time   (  0.55%)
    279.13s system time ( 19.38%)
    287.12s total time  ( 19.93%)
load average: 7.12 2.99 1.15
successful run completed in 60.01s (1 min, 0.01 secs)

2) After applying this patchset:

setting to a 60 second run per stressor
dispatching hogs: 9 ramfs
stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
                          (secs)    (secs)    (secs)   (real time) (usr+sys time)
ramfs            477165     60.00      8.13    281.34      7952.55        1648.40
for a 60.01s run time:
   1440.33s available CPU time
      8.12s user time   (  0.56%)
    281.34s system time ( 19.53%)
    289.46s total time  ( 20.10%)
load average: 6.98 3.03 1.19
successful run completed in 60.01s (1 min, 0.01 secs)

We can see that the ops/s has hardly changed.


This patch (of 45):

Currently, the shrinker instances can be divided into the following three
types:

a) global shrinker instance statically defined in the kernel, such as
   workingset_shadow_shrinker.

b) global shrinker instance statically defined in the kernel modules, such
   as mmu_shrinker in x86.

c) shrinker instance embedded in other structures.

For case a, the memory of shrinker instance is never freed. For case b,
the memory of shrinker instance will be freed after synchronize_rcu() when
the module is unloaded. For case c, the memory of shrinker instance will
be freed along with the structure it is embedded in.

In preparation for implementing lockless slab shrink, we need to
dynamically allocate those shrinker instances in case c, then the memory
can be dynamically freed alone by calling kfree_rcu().

So this commit adds the following new APIs for dynamically allocating
shrinker, and add a private_data field to struct shrinker to record and
get the original embedded structure.

1. shrinker_alloc()

Used to allocate shrinker instance itself and related memory, it will
return a pointer to the shrinker instance on success and NULL on failure.

2. shrinker_register()

Used to register the shrinker instance, which is same as the current
register_shrinker_prepared().

3. shrinker_free()

Used to unregister (if needed) and free the shrinker instance.

In order to simplify shrinker-related APIs and make shrinker more
independent of other kernel mechanisms, subsequent submissions will use
the above API to convert all shrinkers (including case a and b) to
dynamically allocated, and then remove all existing APIs.

This will also have another advantage mentioned by Dave Chinner:

```
The other advantage of this is that it will break all the existing
out of tree code and third party modules using the old API and will
no longer work with a kernel using lockless slab shrinkers. They
need to break (both at the source and binary levels) to stop bad
things from happening due to using unconverted shrinkers in the new
setup.
```

[zhengqi.arch@bytedance.com: mm: shrinker: some cleanup]
  Link: https://lkml.kernel.org/r/20230919024607.65463-1-zhengqi.arch@bytedance.com
Link: https://lkml.kernel.org/r/20230911094444.68966-1-zhengqi.arch@bytedance.com
Link: https://lkml.kernel.org/r/20230911094444.68966-2-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Reviewed-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Chuck Lever &lt;cel@kernel.org&gt;
Cc: Darrick J. Wong &lt;djwong@kernel.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Kirill Tkhai &lt;tkhai@ya.ru&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Cc: Steven Price &lt;steven.price@arm.com&gt;
Cc: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Abhinav Kumar &lt;quic_abhinavk@quicinc.com&gt;
Cc: Alasdair Kergon &lt;agk@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alyssa Rosenzweig &lt;alyssa.rosenzweig@collabora.com&gt;
Cc: Andreas Dilger &lt;adilger.kernel@dilger.ca&gt;
Cc: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
Cc: Anna Schumaker &lt;anna@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Bob Peterson &lt;rpeterso@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Carlos Llamas &lt;cmllamas@google.com&gt;
Cc: Chandan Babu R &lt;chandan.babu@oracle.com&gt;
Cc: Chao Yu &lt;chao@kernel.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Christian Koenig &lt;christian.koenig@amd.com&gt;
Cc: Coly Li &lt;colyli@suse.de&gt;
Cc: Dai Ngo &lt;Dai.Ngo@oracle.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Dmitry Baryshkov &lt;dmitry.baryshkov@linaro.org&gt;
Cc: Gao Xiang &lt;hsiangkao@linux.alibaba.com&gt;
Cc: Huang Rui &lt;ray.huang@amd.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Cc: Jani Nikula &lt;jani.nikula@linux.intel.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Jeffle Xu &lt;jefflexu@linux.alibaba.com&gt;
Cc: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Cc: Joonas Lahtinen &lt;joonas.lahtinen@linux.intel.com&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Marijn Suijten &lt;marijn.suijten@somainline.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Nadav Amit &lt;namit@vmware.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Oleksandr Tyshchenko &lt;oleksandr_tyshchenko@epam.com&gt;
Cc: Olga Kornievskaia &lt;kolga@netapp.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rob Clark &lt;robdclark@gmail.com&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Cc: Sean Paul &lt;sean@poorly.run&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tomeu Vizoso &lt;tomeu.vizoso@collabora.com&gt;
Cc: Tom Talpey &lt;tom@talpey.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Cc: Tvrtko Ursulin &lt;tvrtko.ursulin@linux.intel.com&gt;
Cc: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Cc: Yue Hu &lt;huyue2@coolpad.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>drm/ttm: introduce pool_shrink_rwsem</title>
<updated>2023-10-04T17:32:23+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2023-09-11T09:25:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0b2f5ea1aa39c0ed34bdadb53faf519e3d84ac4a'/>
<id>urn:sha1:0b2f5ea1aa39c0ed34bdadb53faf519e3d84ac4a</id>
<content type='text'>
Currently, synchronize_shrinkers() is only used by TTM pool.  It only
requires that no shrinkers run in parallel.

After we use RCU+refcount method to implement the lockless slab shrink, we
can not use shrinker_rwsem or synchronize_rcu() to guarantee that all
shrinker invocations have seen an update before freeing memory.

So we introduce a new pool_shrink_rwsem to implement a private
ttm_pool_synchronize_shrinkers(), so as to achieve the same purpose.

Link: https://lkml.kernel.org/r/20230911092517.64141-5-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Reviewed-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Acked-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Chuck Lever &lt;cel@kernel.org&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Darrick J. Wong &lt;djwong@kernel.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Cc: Kirill Tkhai &lt;tkhai@ya.ru&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Cc: Steven Price &lt;steven.price@arm.com&gt;
Cc: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Abhinav Kumar &lt;quic_abhinavk@quicinc.com&gt;
Cc: Alasdair Kergon &lt;agk@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alyssa Rosenzweig &lt;alyssa.rosenzweig@collabora.com&gt;
Cc: Andreas Dilger &lt;adilger.kernel@dilger.ca&gt;
Cc: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
Cc: Anna Schumaker &lt;anna@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Bob Peterson &lt;rpeterso@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Carlos Llamas &lt;cmllamas@google.com&gt;
Cc: Chandan Babu R &lt;chandan.babu@oracle.com&gt;
Cc: Chao Yu &lt;chao@kernel.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Coly Li &lt;colyli@suse.de&gt;
Cc: Dai Ngo &lt;Dai.Ngo@oracle.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Dmitry Baryshkov &lt;dmitry.baryshkov@linaro.org&gt;
Cc: Gao Xiang &lt;hsiangkao@linux.alibaba.com&gt;
Cc: Huang Rui &lt;ray.huang@amd.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Cc: Jani Nikula &lt;jani.nikula@linux.intel.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Jeffle Xu &lt;jefflexu@linux.alibaba.com&gt;
Cc: Joonas Lahtinen &lt;joonas.lahtinen@linux.intel.com&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Marijn Suijten &lt;marijn.suijten@somainline.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Nadav Amit &lt;namit@vmware.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Oleksandr Tyshchenko &lt;oleksandr_tyshchenko@epam.com&gt;
Cc: Olga Kornievskaia &lt;kolga@netapp.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rob Clark &lt;robdclark@gmail.com&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Cc: Sean Paul &lt;sean@poorly.run&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tomeu Vizoso &lt;tomeu.vizoso@collabora.com&gt;
Cc: Tom Talpey &lt;tom@talpey.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Cc: Tvrtko Ursulin &lt;tvrtko.ursulin@linux.intel.com&gt;
Cc: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Cc: Yue Hu &lt;huyue2@coolpad.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: move some shrinker-related function declarations to mm/internal.h</title>
<updated>2023-10-04T17:32:22+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2023-09-11T09:25:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3ee0aa9f06756e959633ffda37856c6741d948ed'/>
<id>urn:sha1:3ee0aa9f06756e959633ffda37856c6741d948ed</id>
<content type='text'>
Patch series "cleanups for lockless slab shrink", v4.

This series is some cleanups for lockless slab shrink.


This patch (of 4):

The following functions are only used inside the mm subsystem, so it's
better to move their declarations to the mm/internal.h file.

1. shrinker_debugfs_add()
2. shrinker_debugfs_detach()
3. shrinker_debugfs_remove()

Link: https://lkml.kernel.org/r/20230911092517.64141-1-zhengqi.arch@bytedance.com
Link: https://lkml.kernel.org/r/20230911092517.64141-2-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Reviewed-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Chuck Lever &lt;cel@kernel.org&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Darrick J. Wong &lt;djwong@kernel.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Cc: Kirill Tkhai &lt;tkhai@ya.ru&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Cc: Steven Price &lt;steven.price@arm.com&gt;
Cc: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Cc: Abhinav Kumar &lt;quic_abhinavk@quicinc.com&gt;
Cc: Alasdair Kergon &lt;agk@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alyssa Rosenzweig &lt;alyssa.rosenzweig@collabora.com&gt;
Cc: Andreas Dilger &lt;adilger.kernel@dilger.ca&gt;
Cc: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
Cc: Anna Schumaker &lt;anna@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Bob Peterson &lt;rpeterso@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Carlos Llamas &lt;cmllamas@google.com&gt;
Cc: Chandan Babu R &lt;chandan.babu@oracle.com&gt;
Cc: Chao Yu &lt;chao@kernel.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Coly Li &lt;colyli@suse.de&gt;
Cc: Dai Ngo &lt;Dai.Ngo@oracle.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Dmitry Baryshkov &lt;dmitry.baryshkov@linaro.org&gt;
Cc: Gao Xiang &lt;hsiangkao@linux.alibaba.com&gt;
Cc: Huang Rui &lt;ray.huang@amd.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Cc: Jani Nikula &lt;jani.nikula@linux.intel.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Jeffle Xu &lt;jefflexu@linux.alibaba.com&gt;
Cc: Joonas Lahtinen &lt;joonas.lahtinen@linux.intel.com&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Marijn Suijten &lt;marijn.suijten@somainline.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Nadav Amit &lt;namit@vmware.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Oleksandr Tyshchenko &lt;oleksandr_tyshchenko@epam.com&gt;
Cc: Olga Kornievskaia &lt;kolga@netapp.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rob Clark &lt;robdclark@gmail.com&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Cc: Sean Paul &lt;sean@poorly.run&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tomeu Vizoso &lt;tomeu.vizoso@collabora.com&gt;
Cc: Tom Talpey &lt;tom@talpey.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Cc: Tvrtko Ursulin &lt;tvrtko.ursulin@linux.intel.com&gt;
Cc: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Cc: Yue Hu &lt;huyue2@coolpad.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: shrinkers: fix race condition on debugfs cleanup</title>
<updated>2023-05-17T22:24:33+00:00</updated>
<author>
<name>Joan Bruguera Micó</name>
<email>joanbrugueram@gmail.com</email>
</author>
<published>2023-05-03T01:32:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=26e239b37ebdfd189a2e6f94d3407e70313348bc'/>
<id>urn:sha1:26e239b37ebdfd189a2e6f94d3407e70313348bc</id>
<content type='text'>
When something registers and unregisters many shrinkers, such as:
    for x in $(seq 10000); do unshare -Ui true; done

Sometimes the following error is printed to the kernel log:
    debugfs: Directory '...' with parent 'shrinker' already present!

This occurs since commit badc28d4924b ("mm: shrinkers: fix deadlock in
shrinker debugfs") / v6.2: Since the call to `debugfs_remove_recursive`
was moved outside the `shrinker_rwsem`/`shrinker_mutex` lock, but the call
to `ida_free` stayed inside, a newly registered shrinker can be
re-assigned that ID and attempt to create the debugfs directory before the
directory from the previous shrinker has been removed.

The locking changes in commit f95bdb700bc6 ("mm: vmscan: make global slab
shrink lockless") made the race condition more likely, though it existed
before then.

Commit badc28d4924b ("mm: shrinkers: fix deadlock in shrinker debugfs")
could be reverted since the issue is addressed should no longer occur
since the count and scan operations are lockless since commit 20cd1892fcc3
("mm: shrinkers: make count and scan in shrinker debugfs lockless"). 
However, since this is a contended lock, prefer instead moving `ida_free`
outside the lock to avoid the race.

Link: https://lkml.kernel.org/r/20230503013232.299211-1-joanbrugueram@gmail.com
Fixes: badc28d4924b ("mm: shrinkers: fix deadlock in shrinker debugfs")
Signed-off-by: Joan Bruguera Micó &lt;joanbrugueram@gmail.com&gt;
Cc: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: shrinkers: fix deadlock in shrinker debugfs</title>
<updated>2023-02-09T23:56:51+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2023-02-02T10:56:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=badc28d4924bfed73efc93f716a0c3aa3afbdf6f'/>
<id>urn:sha1:badc28d4924bfed73efc93f716a0c3aa3afbdf6f</id>
<content type='text'>
The debugfs_remove_recursive() is invoked by unregister_shrinker(), which
is holding the write lock of shrinker_rwsem.  It will waits for the
handler of debugfs file complete.  The handler also needs to hold the read
lock of shrinker_rwsem to do something.  So it may cause the following
deadlock:

 	CPU0				CPU1

debugfs_file_get()
shrinker_debugfs_count_show()/shrinker_debugfs_scan_write()

     				unregister_shrinker()
				--&gt; down_write(&amp;shrinker_rwsem);
				    debugfs_remove_recursive()
					// wait for (A)
				    --&gt; wait_for_completion();

    // wait for (B)
--&gt; down_read_killable(&amp;shrinker_rwsem)
debugfs_file_put() -- (A)

				    up_write() -- (B)

The down_read_killable() can be killed, so that the above deadlock can be
recovered.  But it still requires an extra kill action, otherwise it will
block all subsequent shrinker-related operations, so it's better to fix
it.

[akpm@linux-foundation.org: fix CONFIG_SHRINKER_DEBUG=n stub]
Link: https://lkml.kernel.org/r/20230202105612.64641-1-zhengqi.arch@bytedance.com
Fixes: 5035ebc644ae ("mm: shrinkers: introduce debugfs interface for memory shrinkers")
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Reviewed-by: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: shrinkers: add missing includes for undeclared types</title>
<updated>2022-11-30T23:58:55+00:00</updated>
<author>
<name>T.J. Mercier</name>
<email>tjmercier@google.com</email>
</author>
<published>2022-11-14T23:59:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b7217a0bbe00a98a8f4b15ebc2a8355a31a59e1e'/>
<id>urn:sha1:b7217a0bbe00a98a8f4b15ebc2a8355a31a59e1e</id>
<content type='text'>
The shrinker.h header depends on a user including other headers before it
for types used by shrinker.h.  Fix this by including the appropriate
headers in shrinker.h.

./include/linux/shrinker.h:13:9: error: unknown type name `gfp_t'
   13 |         gfp_t gfp_mask;
      |         ^~~~~
./include/linux/shrinker.h:71:26: error: field `list' has incomplete type
   71 |         struct list_head list;
      |                          ^~~~
./include/linux/shrinker.h:82:9: error: unknown type name `atomic_long_t'
   82 |         atomic_long_t *nr_deferred;
      |

Link: https://lkml.kernel.org/r/20221114235949.201749-1-tjmercier@google.com
Fixes: 83aeeada7c69 ("vmscan: use atomic-long for shrinker batching")
Fixes: b0d40c92adaf ("superblock: introduce per-sb cache shrinker infrastructure")
Signed-off-by: T.J. Mercier &lt;tjmercier@google.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Dave Chinner &lt;dchinner@redhat.com&gt;
Cc: Konstantin Khlebnikov &lt;khlebnikov@openvz.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
