| field | value | date |
|---|---|---|
| author | Kairui Song <kasong@tencent.com> | 2025-12-19 22:43:41 +0300 |
| committer | Andrew Morton <akpm@linux-foundation.org> | 2026-02-01 01:22:56 +0300 |
| commit | 2732acda82c93475c5986e1a5640004a5d4f9c3e | |
| tree | b27b7cc04a108a31d0b995b10243cddf668d8c77 /include/linux | |
| parent | 78d6a12dd91a84b0d7e31110b1ea7f4b6a0f8d65 | |
mm, swap: use swap cache as the swap in synchronize layer
Current swap-in synchronization mostly relies on the swap_map's SWAP_HAS_CACHE
bit: whoever sets the bit first does the actual work of swapping in a folio.
This has been causing many issues, as it is effectively a poor implementation
of a bit lock. Raced users have no idea what is pinning a slot, so they have
to loop with schedule_timeout_uninterruptible(1), which is ugly and causes
long-tail latency and other performance issues. Beyond that, the abuse of
SWAP_HAS_CACHE has been causing many other synchronization and maintenance
troubles.
This is the first step to remove this bit completely.
Now all swap-in paths use the swap cache, and both the swap cache and the
swap map are protected by the cluster lock. So we can resolve swap
synchronization directly at the swap cache layer, using the cluster lock and
the folio lock. Whoever inserts a folio into the swap cache first does the
swap-in work. And because folios are locked during swap operations, other
raced swap operations simply wait on the folio lock.
SWAP_HAS_CACHE will be removed in a later commit. For now, we still set it
for some remaining users, but the bit setting and the swap cache folio
insertion are done in the same critical section, after the swap cache is
ready. No one has to spin on the SWAP_HAS_CACHE bit anymore.
This both simplifies the logic and should improve performance, eliminating
issues like the one solved in commit 01626a1823024 ("mm: avoid unconditional
one-tick sleep when swapcache_prepare fails"), or the "skip_if_exists" logic
from commit a65b0e7607ccb ("zswap: make shrinking memcg-aware"), which will
be removed very soon.
[kasong@tencent.com: fix cgroup v1 accounting issue]
Link: https://lkml.kernel.org/r/CAMgjq7CGUnzOVG7uSaYjzw9wD7w2dSKOHprJfaEp4CcGLgE3iw@mail.gmail.com
Link: https://lkml.kernel.org/r/20251220-swap-table-p2-v5-12-8862a265a033@tencent.com
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Cc: Yosry Ahmed <yosry.ahmed@linux.dev>
Cc: Deepanshu Kartikey <kartikey406@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kairui Song <ryncsn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'include/linux')
| -rw-r--r-- | include/linux/swap.h | 6 |
1 file changed, 6 deletions(-)
```diff
diff --git a/include/linux/swap.h b/include/linux/swap.h
index bf72b548a96d..74df3004c850 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -458,7 +458,6 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry);
 extern swp_entry_t get_swap_page_of_type(int);
 extern int add_swap_count_continuation(swp_entry_t, gfp_t);
 extern int swap_duplicate_nr(swp_entry_t entry, int nr);
-extern int swapcache_prepare(swp_entry_t entry, int nr);
 extern void swap_free_nr(swp_entry_t entry, int nr_pages);
 extern void free_swap_and_cache_nr(swp_entry_t entry, int nr);
 int swap_type_of(dev_t device, sector_t offset);
@@ -517,11 +516,6 @@ static inline int swap_duplicate_nr(swp_entry_t swp, int nr_pages)
 	return 0;
 }
 
-static inline int swapcache_prepare(swp_entry_t swp, int nr)
-{
-	return 0;
-}
-
 static inline void swap_free_nr(swp_entry_t entry, int nr_pages)
 {
 }
```
