summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-06-10backing_dev: remove current->backing_dev_infoChristoph Hellwig12-38/+2
Patch series "cleanup the filemap / direct I/O interaction", v4. This series cleans up some of the generic write helper calling conventions and the page cache writeback / invalidation for direct I/O. This is a spinoff from the no-bufferhead kernel project, for which we'll want to an use iomap based buffered write path in the block layer. This patch (of 12): The last user of current->backing_dev_info disappeared in commit b9b1335e6403 ("remove bdi_congested() and wb_congested() and related functions"). Remove the field and all assignments to it. Link: https://lkml.kernel.org/r/20230601145904.1385409-1-hch@lst.de Link: https://lkml.kernel.org/r/20230601145904.1385409-2-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Acked-by: Theodore Ts'o <tytso@mit.edu> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Anna Schumaker <anna@kernel.org> Cc: Chao Yu <chao@kernel.org> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Xiubo Li <xiubli@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm: zswap: shrink until can acceptDomenico Cerasuolo1-3/+14
This update addresses an issue with the zswap reclaim mechanism, which hinders the efficient offloading of cold pages to disk, thereby compromising the preservation of the LRU order and consequently diminishing, if not inverting, its performance benefits. The functioning of the zswap shrink worker was found to be inadequate, as shown by basic benchmark test. For the test, a kernel build was utilized as a reference, with its memory confined to 1G via a cgroup and a 5G swap file provided. The results are presented below, these are averages of three runs without the use of zswap: real 46m26s user 35m4s sys 7m37s With zswap (zbud) enabled and max_pool_percent set to 1 (in a 32G system), the results changed to: real 56m4s user 35m13s sys 8m43s written_back_pages: 18 reject_reclaim_fail: 0 pool_limit_hit:1478 Besides the evident regression, one thing to notice from this data is the extremely low number of written_back_pages and pool_limit_hit. The pool_limit_hit counter, which is increased in zswap_frontswap_store when zswap is completely full, doesn't account for a particular scenario: once zswap hits his limit, zswap_pool_reached_full is set to true; with this flag on, zswap_frontswap_store rejects pages if zswap is still above the acceptance threshold. Once we include the rejections due to zswap_pool_reached_full && !zswap_can_accept(), the number goes from 1478 to a significant 21578266. Zswap is stuck in an undesirable state where it rejects pages because it's above the acceptance threshold, yet fails to attempt memory reclaimation. This happens because the shrink work is only queued when zswap_frontswap_store detects that it's full and the work itself only reclaims one page per run. This state results in hot pages getting written directly to disk, while cold ones remain memory, waiting only to be invalidated. The LRU order is completely broken and zswap ends up being just an overhead without providing any benefits. This commit applies 2 changes: a) the shrink worker is set to reclaim pages until the acceptance threshold is met and b) the task is also enqueued when zswap is not full but still above the threshold. Testing this suggested update showed much better numbers: real 36m37s user 35m8s sys 9m32s written_back_pages: 10459423 reject_reclaim_fail: 12896 pool_limit_hit: 75653 Link: https://lkml.kernel.org/r/20230526183227.793977-1-cerasuolodomenico@gmail.com Fixes: 45190f01dd40 ("mm/zswap.c: add allocation hysteresis if pool limit is hit") Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Yosry Ahmed <yosryahmed@google.com> Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Seth Jennings <sjenning@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm/mm_init.c: move set_pageblock_order() to free_area_init()Haifeng Xu1-1/+2
pageblock_order only needs to be set once, there is no need to initialize it in every zone/node. Link: https://lkml.kernel.org/r/20230601063536.26882-1-haifeng.xu@shopee.com Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm: khugepaged: avoid pointless allocation for "struct mm_slot"Xin Hao1-7/+5
In __khugepaged_enter(), if "mm->flags" with MMF_VM_HUGEPAGE bit is set, the "mm_slot" will be released and return, so we can call mm_slot_alloc() after test_and_set_bit(). Link: https://lkml.kernel.org/r/20230531095817.11012-1-xhao@linux.alibaba.com Signed-off-by: Xin Hao <xhao@linux.alibaba.com> Reviewed-by: Andrew Morton <akpm@linux-foudation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm/page_alloc: don't wake kswapd from rmqueue() unless __GFP_KSWAPD_RECLAIM ↵Tetsuo Handa1-1/+2
is specified Commit 73444bc4d8f9 ("mm, page_alloc: do not wake kswapd with zone lock held") moved wakeup_kswapd() from steal_suitable_fallback() to rmqueue() using ZONE_BOOSTED_WATERMARK flag. Only allocation contexts that include ALLOC_KSWAPD (which corresponds to __GFP_KSWAPD_RECLAIM) should wake kswapd, for callers are supposed to remove __GFP_KSWAPD_RECLAIM if trying to hold pgdat->kswapd_wait has a risk of deadlock. But since zone->flags is a shared variable, a thread doing !__GFP_KSWAPD_RECLAIM allocation request might observe this flag being set immediately after another thread doing __GFP_KSWAPD_RECLAIM allocation request set this flag, causing possibility of deadlock. Link: https://lkml.kernel.org/r/c3c3dacf-dd3b-77c9-f96a-d0982b4b2a4f@I-love.SAKURA.ne.jp Fixes: 73444bc4d8f9 ("mm, page_alloc: do not wake kswapd with zone lock held") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm/mm_init.c: remove free_area_init_memoryless_node()Haifeng Xu1-6/+1
free_area_init_memoryless_node() is just a wrapper of free_area_init_node(), remove it to clean up. Link: https://lkml.kernel.org/r/20230528045720.4835-1-haifeng.xu@shopee.com Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10THP: avoid lock when check whether THP is in deferred listYin Fengwei1-5/+12
free_transhuge_page() acquires split queue lock then check whether the THP was added to deferred list or not. It brings high deferred queue lock contention. It's safe to check whether the THP is in deferred list or not without holding the deferred queue lock in free_transhuge_page() because when code hit free_transhuge_page(), there is no one tries to add the folio to _deferred_list. Running page_fault1 of will-it-scale + order 2 folio for anonymous mapping with 96 processes on an Ice Lake 48C/96T test box, we could see the 61% split_queue_lock contention: - 63.02% 0.01% page_fault1_pro [kernel.kallsyms] [k] free_transhuge_page - 63.01% free_transhuge_page + 62.91% _raw_spin_lock_irqsave With this patch applied, the split_queue_lock contention is less than 1%. Link: https://lkml.kernel.org/r/20230429082759.1600796-2-fengwei.yin@intel.com Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: "Huang, Ying" <ying.huang@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10swap: comments get_swap_device() with usage ruleHuang Ying1-3/+9
The general rule to use a swap entry is as follows. When we get a swap entry, if there aren't some other ways to prevent swapoff, such as the folio in swap cache is locked, page table lock is held, etc., the swap entry may become invalid because of swapoff. Then, we need to enclose all swap related functions with get_swap_device() and put_swap_device(), unless the swap functions call get/put_swap_device() by themselves. Add the rule as comments of get_swap_device(). Link: https://lkml.kernel.org/r/20230529061355.125791-6-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Yosry Ahmed <yosryahmed@google.com> Reviewed-by: Chris Li (Google) <chrisl@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10swap: remove get/put_swap_device() in __swap_duplicate()Huang Ying1-4/+1
__swap_duplicate() is called by - swap_shmem_alloc(): the folio in swap cache is locked. - copy_nonpresent_pte() -> swap_duplicate() and try_to_unmap_one() -> swap_duplicate(): the page table lock is held. - __read_swap_cache_async() -> swapcache_prepare(): enclosed with get/put_swap_device() in __read_swap_cache_async() already. So, it's safe to remove get/put_swap_device() in __swap_duplicate(). Link: https://lkml.kernel.org/r/20230529061355.125791-5-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Yosry Ahmed <yosryahmed@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Chris Li (Google) <chrisl@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10swap: remove __swp_swapcount()Huang Ying3-22/+4
__swp_swapcount() just encloses the calling to swap_swapcount() with get/put_swap_device(). It is called in __read_swap_cache_async() only, which encloses the calling with get/put_swap_device() already. So, __read_swap_cache_async() can call swap_swapcount() directly. Link: https://lkml.kernel.org/r/20230529061355.125791-4-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Chris Li (Google) <chrisl@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yu Zhao <yuzhao@google.com> Cc: Yosry Ahmed <yosryahmed@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10swap, __read_swap_cache_async(): enlarge get/put_swap_device protection rangeHuang Ying1-10/+21
This makes the function a little easier to be understood because we don't need to consider swapoff. And this makes it possible to remove get/put_swap_device() calling in some functions called by __read_swap_cache_async(). Link: https://lkml.kernel.org/r/20230529061355.125791-3-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Chris Li (Google) <chrisl@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yu Zhao <yuzhao@google.com> Cc: Yosry Ahmed <yosryahmed@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10swap: remove get/put_swap_device() in __swap_count()Huang Ying1-8/+2
Patch series "swap: cleanup get/put_swap_device() usage", v3. The general rule to use a swap entry is as follows. When we get a swap entry, if there aren't some other ways to prevent swapoff, such as the folio in swap cache is locked, page table lock is held, etc., the swap entry may become invalid because of swapoff. Then, we need to enclose all swap related functions with get_swap_device() and put_swap_device(), unless the swap functions call get/put_swap_device() by themselves. Based on the above rule, all get/put_swap_device() usage are checked and cleaned up if necessary. This patch (of 5): get/put_swap_device() are added to __swap_count() in commit eb085574a752 ("mm, swap: fix race between swapoff and some swap operations"). Later, in commit 2799e77529c2 ("swap: fix do_swap_page() race with swapoff"), get/put_swap_device() are added to do_swap_page(). And they enclose the only call site of __swap_count(). So, it's safe to remove get/put_swap_device() in __swap_count() now. Link: https://lkml.kernel.org/r/20230529061355.125791-1-ying.huang@intel.com Link: https://lkml.kernel.org/r/20230529061355.125791-2-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Yosry Ahmed <yosryahmed@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Chris Li (Google) <chrisl@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm/mm_init.c: do not calculate zone_start_pfn/zone_end_pfn in ↵Haifeng Xu1-18/+12
zone_absent_pages_in_node() In calculate_node_totalpages(), zone_start_pfn/zone_end_pfn are already calculated in zone_spanned_pages_in_node(), so use them as parameters instead of node_start_pfn/node_end_pfn and the duplicated calculation process can de dropped. Link: https://lkml.kernel.org/r/20230526085251.1977-2-haifeng.xu@shopee.com Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com> Suggested-by: Mike Rapoport <rppt@kernel.org> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Haifeng Xu <haifeng.xu@shopee.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm/mm_init.c: introduce reset_memoryless_node_totalpages()Haifeng Xu1-9/+22
Currently, no matter whether a node actually has memory or not, calculate_node_totalpages() is used to account number of pages in zone/node. However, for node without memory, these unnecessary calculations can be skipped. All the zone/node page counts can be set to 0 directly. So introduce reset_memoryless_node_totalpages() to perform this action. Furthermore, calculate_node_totalpages() only gets called for the node with memory. Link: https://lkml.kernel.org/r/20230526085251.1977-1-haifeng.xu@shopee.com Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com> Suggested-by: Mike Rapoport <rppt@kernel.org> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10Docs/mm/damon/design: add a section for the modules layerSeongJae Park1-0/+61
Add a section for covering DAMON modules layer to the design document. Link: https://lkml.kernel.org/r/20230525214314.5204-11-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10Docs/mm/damon/design: add a section for DAMON core APISeongJae Park1-0/+12
Add a section covering the API of DAMON core layer on the design document. Link: https://lkml.kernel.org/r/20230525214314.5204-10-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10Docs/mm/damon/design: add sections for advanced features of DAMOSSeongJae Park1-0/+86
Add sections for advanced features of DAMOS including quotas, prioritization, watermarks, and filters of DAMOS on the design document. Link: https://lkml.kernel.org/r/20230525214314.5204-9-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10Docs/mm/damon/design: add sections for basic parts of DAMOSSeongJae Park1-0/+70
DAMOS is an important part of DAMON, but the design doc is not covering it. Add sections for covering the basic part of DAMOS. Link: https://lkml.kernel.org/r/20230525214314.5204-8-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10Docs/mm/damon/design: add a section for the relation between Core and ↵SeongJae Park1-0/+10
Modules layer Add overall desription of the interface and the relation between the Core and the Modules layer under 'Overall Architecture' section. Link: https://lkml.kernel.org/r/20230525214314.5204-7-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10Docs/mm/damon/design: rewrite configurable layersSeongJae Park1-15/+14
The 'Configurable Operations Set' section is a little bit outdated. Update the text. Link: https://lkml.kernel.org/r/20230525214314.5204-6-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10Docs/mm/damon/design: update the layout based on the layersSeongJae Park1-10/+14
DAMON design document is describing only the operations set layer and monitoring part of the core logic. Update the layout based on the DAMON's layers, so that more parts of DAMON including DAMOS core logic and DAMON modules can easily be added. Link: https://lkml.kernel.org/r/20230525214314.5204-5-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10Docs/mm/damon/design: add a section for overall architectureSeongJae Park1-0/+15
The design doc is missing overall picture of DAMON. Add a section for overall architeucture and layers. Link: https://lkml.kernel.org/r/20230525214314.5204-4-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10Docs/mm/damon/maintainer-profile: fix typos and grammar errorsSeongJae Park1-2/+2
Fix a few typos and grammar erros in DAMON Maintainer Profile document. Link: https://lkml.kernel.org/r/20230525214314.5204-3-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10Docs/mm/damon/faq: remove old questionsSeongJae Park1-23/+0
Patch series "Docs/mm/damon: Minor fixes and design doc update". Some of the DAMON documents are outdated, or having minor typos or grammar erros. Especially, the design doc has not updated for DAMOS, which is an important part of DAMON. Fix the minor issues and update documents. This patch (of 10): The first two questions of DAMON faqs have raised when DAMON patches were first submitted. More than one year has passed since DAMON patches get merged in the mainline, and that kind of questions are not asked nowadays. Remove the questions. Link: https://lkml.kernel.org/r/20230525214314.5204-1-sj@kernel.org Link: https://lkml.kernel.org/r/20230525214314.5204-2-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10Multi-gen LRU: fix workingset accountingKalesh Singh2-4/+7
On Android app cycle workloads, MGLRU showed a significant reduction in workingset refaults although pgpgin/pswpin remained relatively unchanged. This indicated MGLRU may be undercounting workingset refaults. This has impact on userspace programs, like Android's LMKD, that monitor workingset refault statistics to detect thrashing. It was found that refaults were only accounted if the MGLRU shadow entry was for a recently evicted folio. However, recently evicted folios should be accounted as workingset activation, and refaults should be accounted regardless of recency. Fix MGLRU's workingset refault and activation accounting to more closely match that of the conventional active/inactive LRU. Link: https://lkml.kernel.org/r/20230523205922.3852731-1-kaleshsingh@google.com Fixes: ac35a4902374 ("mm: multi-gen LRU: minimal implementation") Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Reported-by: Charan Teja Kalla <quic_charante@quicinc.com> Acked-by: Yu Zhao <yuzhao@google.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org> Cc: Oleksandr Natalenko <oleksandr@natalenko.name> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10maple_tree: relocate the declaration of mas_empty_area_rev().Peng Zhang1-6/+6
Relocate the declaration of mas_empty_area_rev() so that mas_empty_area() and mas_empty_area_rev() are together. Link: https://lkml.kernel.org/r/20230524031247.65949-11-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10maple_tree: simplify and clean up mas_wr_node_store()Peng Zhang1-61/+26
Simplify and clean up mas_wr_node_store(), remove unnecessary code. Link: https://lkml.kernel.org/r/20230524031247.65949-10-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10maple_tree: rework mas_wr_slot_store() to be cleaner and more efficient.Peng Zhang1-34/+19
Get whether the two gaps to be overwritten are empty to avoid calling mas_update_gap() all the time. Also clean up the code and add comments. Link: https://lkml.kernel.org/r/20230524031247.65949-9-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10maple_tree: add comments and some minor cleanups to mas_wr_append()Peng Zhang1-24/+23
Add comment for mas_wr_append(), move mas_update_gap() into mas_wr_append(), and other cleanups to make mas_wr_modify() cleaner. Link: https://lkml.kernel.org/r/20230524031247.65949-8-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10maple_tree: add mas_wr_new_end() to calculate new_end accuratelyPeng Zhang1-11/+23
The previous new_end calculation is inaccurate, because it assumes that two new pivots must be added (this is inaccurate), and sometimes it will miss the fast path and enter the slow path. Add mas_wr_new_end() to accurately calculate new_end to make the conditions for entering the fast path more accurate. Link: https://lkml.kernel.org/r/20230524031247.65949-7-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10maple_tree: make the code symmetrical in mas_wr_extend_null()Peng Zhang1-12/+14
Just make the code symmetrical to improve readability. Link: https://lkml.kernel.org/r/20230524031247.65949-6-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10maple_tree: simplify mas_is_span_wr()Peng Zhang1-23/+11
Make the code for detecting spanning writes more concise. Link: https://lkml.kernel.org/r/20230524031247.65949-5-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10maple_tree: fix the arguments to __must_hold()Peng Zhang1-3/+3
Fix the arguments to __must_hold() to make sparse work. Link: https://lkml.kernel.org/r/20230524031247.65949-4-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10maple_tree: drop mas_{rev_}alloc() and mas_fill_gap()Peng Zhang1-108/+0
mas_{rev_}alloc() and mas_fill_gap() are no longer used, delete them. Link: https://lkml.kernel.org/r/20230524031247.65949-3-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10maple_tree: rework mtree_alloc_{range,rrange}()Peng Zhang1-25/+32
Patch series "Clean ups for maple tree", v4. Some clean ups, mainly to make the code of maple tree more concise. This patchset has passed the self-test. This patch (of 10): Use mas_empty_area{_rev}() to refactor mtree_alloc_{range,rrange}() Link: https://lkml.kernel.org/r/20230524031247.65949-2-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm/memcontrol: export memcg.swap watermark via sysfs for v2 memcgLars R. Damerow2-0/+20
This patch is similar to commit 8e20d4b33266 ("mm/memcontrol: export memcg->watermark via sysfs for v2 memcg"), but exports the swap counter's watermark. We allocate jobs to our compute farm using heuristics determined by memory and swap usage from previous jobs. Tracking the peak swap usage for new jobs is important for determining when jobs are exceeding their expected bounds, or when our baseline metrics are getting outdated. Our toolset was written to use the "memory.memsw.max_usage_in_bytes" file in cgroups v1, and altering it to poll cgroups v2's "memory.swap.current" would give less accurate results as well as add complication to the code. Having this watermark exposed in sysfs is much preferred. Link: https://lkml.kernel.org/r/20230524181734.125696-1-lars@pixar.com Signed-off-by: Lars R. Damerow <lars@pixar.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Zefan Li <lizefan.x@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm: shmem: fix UAF bug in shmem_show_options()Tu Jinjiang1-1/+4
shmem_show_options() uses sbinfo->mpol without adding it's refcnt. This may lead to race with replacement of the mpol by remount. The execution sequence is as follows. CPU0 CPU1 shmem_show_options() shmem_reconfigure() shmem_show_mpol(seq, sbinfo->mpol) mpol = sbinfo->mpol mpol_put(mpol) mpol->mode The KASAN report is as follows. BUG: KASAN: slab-use-after-free in shmem_show_options+0x21b/0x340 Read of size 2 at addr ffff888124324004 by task mount/2388 CPU: 2 PID: 2388 Comm: mount Not tainted 6.4.0-rc3-00017-g9d646009f65d-dirty #8 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x37/0x50 print_report+0xd0/0x620 ? shmem_show_options+0x21b/0x340 ? __virt_addr_valid+0xf4/0x180 ? shmem_show_options+0x21b/0x340 kasan_report+0xb8/0xe0 ? shmem_show_options+0x21b/0x340 shmem_show_options+0x21b/0x340 ? __pfx_shmem_show_options+0x10/0x10 ? strchr+0x2c/0x50 ? strlen+0x23/0x40 ? seq_puts+0x7d/0x90 show_vfsmnt+0x1e6/0x260 ? __pfx_show_vfsmnt+0x10/0x10 ? __kasan_kmalloc+0x7f/0x90 seq_read_iter+0x57a/0x740 vfs_read+0x2e2/0x4a0 ? __pfx_vfs_read+0x10/0x10 ? down_write_killable+0xb8/0x140 ? __pfx_down_write_killable+0x10/0x10 ? __fget_light+0xa9/0x1e0 ? up_write+0x3f/0x80 ksys_read+0xb8/0x150 ? __pfx_ksys_read+0x10/0x10 ? fpregs_assert_state_consistent+0x55/0x60 ? exit_to_user_mode_prepare+0x2d/0x120 do_syscall_64+0x3c/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc </TASK> Allocated by task 2387: kasan_save_stack+0x22/0x50 kasan_set_track+0x25/0x30 __kasan_slab_alloc+0x59/0x70 kmem_cache_alloc+0xdd/0x220 mpol_new+0x83/0x150 mpol_parse_str+0x280/0x4a0 shmem_parse_one+0x364/0x520 vfs_parse_fs_param+0xf8/0x1a0 vfs_parse_fs_string+0xc9/0x130 shmem_parse_options+0xb2/0x110 path_mount+0x597/0xdf0 do_mount+0xcd/0xf0 __x64_sys_mount+0xbd/0x100 do_syscall_64+0x3c/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc Freed by task 2389: kasan_save_stack+0x22/0x50 kasan_set_track+0x25/0x30 kasan_save_free_info+0x2e/0x50 __kasan_slab_free+0x10e/0x1a0 kmem_cache_free+0x9c/0x350 shmem_reconfigure+0x278/0x370 reconfigure_super+0x383/0x450 path_mount+0xcc5/0xdf0 do_mount+0xcd/0xf0 __x64_sys_mount+0xbd/0x100 do_syscall_64+0x3c/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc The buggy address belongs to the object at ffff888124324000 which belongs to the cache numa_policy of size 32 The buggy address is located 4 bytes inside of freed 32-byte region [ffff888124324000, ffff888124324020) ================================================================== To fix the bug, shmem_get_sbmpol() / mpol_put() needs to be called before / after shmem_show_mpol() call. Link: https://lkml.kernel.org/r/20230525031640.593733-1-tujinjiang@huawei.com Signed-off-by: Tu Jinjiang <tujinjiang@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Nanyong Sun <sunnanyong@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm: compaction: skip fast freepages isolation if enough freepages are isolatedBaolin Wang1-0/+4
I've observed that fast isolation often isolates more pages than cc->migratepages, and the excess freepages will be released back to the buddy system. So skip fast freepages isolation if enough freepages are isolated to save some CPU cycles. Link: https://lkml.kernel.org/r/f39c2c07f2dba2732fd9c0843572e5bef96f7f67.1685018752.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm: compaction: add trace event for fast freepages isolationBaolin Wang2-1/+16
The fast_isolate_freepages() can also isolate freepages, but we can not know the fast isolation efficiency to understand the fast isolation pressure. So add a trace event to show some numbers to help to understand the efficiency for fast freepages isolation. Link: https://lkml.kernel.org/r/78d2932d0160d122c15372aceb3f2c45460a17fc.1685018752.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm: compaction: only set skip flag if cc->no_set_skip_hint is falseBaolin Wang1-1/+1
To keep the same logic as test_and_set_skip(), only set the skip flag if cc->no_set_skip_hint is false, which makes code more reasonable. Link: https://lkml.kernel.org/r/0eb2cd2407ffb259ae6e3071e10f70f2d41d0f3e.1685018752.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm: compaction: skip more fully scanned pageblockBaolin Wang1-1/+1
In fast_isolate_around(), it assumes the pageblock is fully scanned if cc->nr_freepages < cc->nr_migratepages after trying to isolate some free pages, and will set skip flag to avoid scanning in future. However this can miss setting the skip flag for a fully scanned pageblock (returned 'start_pfn' is equal to 'end_pfn') in the case where cc->nr_freepages is larger than cc->nr_migratepages. So using the returned 'start_pfn' from isolate_freepages_block() and 'end_pfn' to decide if a pageblock is fully scanned makes more sense. It can also cover the case where cc->nr_freepages < cc->nr_migratepages, which means the 'start_pfn' is usually equal to 'end_pfn' except some uncommon fatal error occurs after non-strict mode isolation. Link: https://lkml.kernel.org/r/f4efd2fa08735794a6d809da3249b6715ba6ad38.1685018752.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm: compaction: change fast_isolate_freepages() to void typeBaolin Wang1-5/+3
No caller cares about the return value of fast_isolate_freepages(), void it. Link: https://lkml.kernel.org/r/759fca20b22ebf4c81afa30496837b9e0fb2e53b.1685018752.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm: compaction: drop the redundant page validation in update_pageblock_skip()Baolin Wang1-3/+0
Patch series "Misc cleanups and improvements for compaction". This series cantains some cleanups and improvements for compaction. This patch (of 6): The caller has validated the page before calling update_pageblock_skip(), thus drop the redundant page validation in update_pageblock_skip(). Link: https://lkml.kernel.org/r/5142e15b9295fe8c447dbb39b7907a20177a1413.1685018752.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm/vmalloc: dont purge usable blocks unnecessarilyThomas Gleixner1-8/+20
Purging fragmented blocks is done unconditionally in several contexts: 1) From drain_vmap_area_work(), when the number of lazy to be freed vmap_areas reached the threshold 2) Reclaiming vmalloc address space from pcpu_get_vm_areas() 3) _vm_unmap_aliases() #1 There is no reason to zap fragmented vmap blocks unconditionally, simply because reclaiming all lazy areas drains at least 32MB * fls(num_online_cpus()) per invocation which is plenty. #2 Reclaiming when running out of space or due to memory pressure makes a lot of sense #3 _unmap_aliases() requires to touch everything because the caller has no clue which vmap_area used a particular page last and the vmap_area lost that information too. Except for the vfree + VM_FLUSH_RESET_PERMS case, which removes the vmap area first and then cares about the flush. That in turn requires a full walk of _all_ vmap areas including the one which was just added to the purge list. But as this has to be flushed anyway this is an opportunity to combine outstanding TLB flushes and do the housekeeping of purging freed areas, but like #1 there is no real good reason to zap usable vmap blocks unconditionally. Add a @force_purge argument to the newly split out block purge function and if not true only purge fragmented blocks which have less than 1/4 of their capacity left. Rename purge_vmap_area_lazy() to reclaim_and_purge_vmap_areas() to make it clear what the function does. [lstoakes@gmail.com: correct VMAP_PURGE_THRESHOLD check] Link: https://lkml.kernel.org/r/3e92ef61-b910-4576-88e7-cf43211fd4e7@lucifer.local Link: https://lkml.kernel.org/r/20230525124504.864005691@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Baoquan He <bhe@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm/vmalloc: add missing READ/WRITE_ONCE() annotationsThomas Gleixner1-5/+8
purge_fragmented_blocks() accesses vmap_block::free and vmap_block::dirty lockless for a quick check. Add the missing READ/WRITE_ONCE() annotations. Link: https://lkml.kernel.org/r/20230525124504.807356682@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Baoquan He <bhe@redhat.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm/vmalloc: check free space in vmap_block locklessThomas Gleixner1-1/+4
vb_alloc() unconditionally locks a vmap_block on the free list to check the free space. This can be done locklessly because vmap_block::free never increases, it's only decreased on allocations. Check the free space lockless and only if that succeeds, recheck under the lock. Link: https://lkml.kernel.org/r/20230525124504.750481992@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm/vmalloc: prevent flushing dirty space over and overThomas Gleixner1-2/+6
vmap blocks which have active mappings cannot be purged. Allocations which have been freed are accounted for in vmap_block::dirty_min/max, so that they can be detected in _vm_unmap_aliases() as potentially stale TLBs. If there are several invocations of _vm_unmap_aliases() then each of them will flush the dirty range. That's pointless and just increases the probability of full TLB flushes. Avoid that by resetting the flush range after accounting for it. That's safe versus other invocations of _vm_unmap_aliases() because this is all serialized with vmap_purge_lock. Link: https://lkml.kernel.org/r/20230525124504.692056496@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Baoquan He <bhe@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm/vmalloc: avoid iterating over per CPU vmap blocks twiceThomas Gleixner1-24/+46
_vunmap_aliases() walks the per CPU xarrays to find partially unmapped blocks and then walks the per cpu free lists to purge fragmented blocks. Arguably that's waste of CPU cycles and cache lines as the full xarray walk already touches every block. Avoid this double iteration: - Split out the code to purge one block and the code to free the local purge list into helper functions. - Try to purge the fragmented blocks in the xarray walk before looking at their dirty space. Link: https://lkml.kernel.org/r/20230525124504.633469722@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Baoquan He <bhe@redhat.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10mm/vmalloc: prevent stale TLBs in fully utilized blocksThomas Gleixner1-1/+2
Patch series "mm/vmalloc: Assorted fixes and improvements", v2. this series addresses the following issues: 1) Prevent the stale TLB problem related to fully utilized vmap blocks 2) Avoid the double per CPU list walk in _vm_unmap_aliases() 3) Avoid flushing dirty space over and over 4) Add a lockless quickcheck in vb_alloc() and add missing READ/WRITE_ONCE() annotations 5) Prevent overeager purging of usable vmap_blocks if not under memory/address space pressure. This patch (of 6): _vm_unmap_aliases() is used to ensure that no unflushed TLB entries for a page are left in the system. This is required due to the lazy TLB flush mechanism in vmalloc. This is tried to achieve by walking the per CPU free lists, but those do not contain fully utilized vmap blocks because they are removed from the free list once the blocks free space became zero. When the block is not fully unmapped then it is not on the purge list either. So neither the per CPU list iteration nor the purge list walk find the block and if the page was mapped via such a block and the TLB has not yet been flushed, the guarantee of _vm_unmap_aliases() that there are no stale TLBs after returning is broken: x = vb_alloc() // Removes vmap_block from free list because vb->free became 0 vb_free(x) // Unmaps page and marks in dirty_min/max range // Block has still mappings and is not put on purge list // Page is reused vm_unmap_aliases() // Can't find vmap block with the dirty space -> FAIL So instead of walking the per CPU free lists, walk the per CPU xarrays which hold pointers to _all_ active blocks in the system including those removed from the free lists. Link: https://lkml.kernel.org/r/20230525122342.109672430@linutronix.de Link: https://lkml.kernel.org/r/20230525124504.573987880@linutronix.de Fixes: db64fe02258f ("mm: rewrite vmap layer") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-10kmemleak-test: drop __init to get better backtraceJim Cromie1-1/+1
Drop the __init on kmemleak_test_init(). With it, the storage is reclaimed, but then the symbol isn't available for "%pS" rendering, and the backtrace gets a bare pointer where the actual leak happened. unreferenced object 0xffff88800a2b0800 (size 1024): comm "modprobe", pid 413, jiffies 4294953430 hex dump (first 32 bytes): 73 02 00 00 75 01 00 68 02 00 00 01 00 00 00 04 s...u..h........ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000fabad728>] kmalloc_trace+0x26/0x90 [<00000000ef738764>] 0xffffffffc02350a2 [<00000000004e5795>] do_one_initcall+0x43/0x210 [<00000000d768905e>] do_init_module+0x4a/0x210 [<0000000087135ab5>] __do_sys_finit_module+0x93/0xf0 [<000000004fcb1fa2>] do_syscall_64+0x34/0x80 [<00000000c73c8d9d>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 with __init gone, that trace entry renders like: [<00000000ef738764>] kmemleak_test_init+<offset>/<size> Link: https://lkml.kernel.org/r/20230525174356.69711-1-jim.cromie@gmail.com Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>