diff options
author | Yin Fengwei <fengwei.yin@intel.com> | 2023-08-08 05:09:17 +0300 |
---|---|---|
committer | Greg Kroah-Hartman <gregkh@linuxfoundation.org> | 2023-08-30 17:11:11 +0300 |
commit | bd20e20c4d64e131498ec91f1b066cd4708088b3 (patch) | |
tree | c3538edbe2446289426bce8a58722a8190a05e27 | |
parent | f67e3a725b4975e3be3fda61ea6e4978213dcc2f (diff) | |
download | linux-bd20e20c4d64e131498ec91f1b066cd4708088b3.tar.xz |
madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check
commit 0e0e9bd5f7b9d40fd03b70092367247d52da1db0 upstream.
Commit 98b211d6415f ("madvise: convert madvise_free_pte_range() to use a
folio") replaced the page_mapcount() with folio_mapcount() to check
whether the folio is shared by other mapping.
It's not correct for large folios. folio_mapcount() returns the total
mapcount of large folio which is not suitable to detect whether the folio
is shared.
Use folio_estimated_sharers() which returns a estimated number of shares.
That means it's not 100% correct. It should be OK for madvise case here.
User-visible effects is that the THP is skipped when user call madvise.
But the correct behavior is THP should be split and processed then.
NOTE: this change is a temporary fix to reduce the user-visible effects
before the long term fix from David is ready.
Link: https://lkml.kernel.org/r/20230808020917.2230692-4-fengwei.yin@intel.com
Fixes: 98b211d6415f ("madvise: convert madvise_free_pte_range() to use a folio")
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-rw-r--r-- | include/linux/mm.h | 19 | ||||
-rw-r--r-- | mm/madvise.c | 4 |
2 files changed, 21 insertions, 2 deletions
diff --git a/include/linux/mm.h b/include/linux/mm.h index f83a1c9ec8e4..104ec00823da 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1727,6 +1727,25 @@ static inline size_t folio_size(struct folio *folio) return PAGE_SIZE << folio_order(folio); } +/** + * folio_estimated_sharers - Estimate the number of sharers of a folio. + * @folio: The folio. + * + * folio_estimated_sharers() aims to serve as a function to efficiently + * estimate the number of processes sharing a folio. This is done by + * looking at the precise mapcount of the first subpage in the folio, and + * assuming the other subpages are the same. This may not be true for large + * folios. If you want exact mapcounts for exact calculations, look at + * page_mapcount() or folio_total_mapcount(). + * + * Return: The estimated number of processes sharing a folio. + */ +static inline int folio_estimated_sharers(struct folio *folio) +{ + return page_mapcount(folio_page(folio, 0)); +} + + #ifndef HAVE_ARCH_MAKE_PAGE_ACCESSIBLE static inline int arch_make_page_accessible(struct page *page) { diff --git a/mm/madvise.c b/mm/madvise.c index d03e149ffe6e..5973399b2f9b 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -654,8 +654,8 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr, * deactivate all pages. */ if (folio_test_large(folio)) { - if (folio_mapcount(folio) != 1) - goto out; + if (folio_estimated_sharers(folio) != 1) + break; folio_get(folio); if (!folio_trylock(folio)) { folio_put(folio); |