summaryrefslogtreecommitdiff
path: root/mm/mmap.c
AgeCommit message (Collapse)AuthorFilesLines
2023-04-06mm/mmap: free vm_area_struct without call_rcu in exit_mmapSuren Baghdasaryan1-4/+7
call_rcu() can take a long time when callback offloading is enabled. Its use in the vm_area_free can cause regressions in the exit path when multiple VMAs are being freed. Because exit_mmap() is called only after the last mm user drops its refcount, the page fault handlers can't be racing with it. Any other possible user like oom-reaper or process_mrelease are already synchronized using mmap_lock. Therefore exit_mmap() can free VMAs directly, without the use of call_rcu(). Expose __vm_area_free() and use it from exit_mmap() to avoid possible call_rcu() floods and performance regressions caused by it. Link: https://lkml.kernel.org/r/20230227173632.3292573-33-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm: introduce vma detached flagSuren Baghdasaryan1-0/+2
Per-vma locking mechanism will search for VMA under RCU protection and then after locking it, has to ensure it was not removed from the VMA tree after we found it. To make this check efficient, introduce a vma->detached flag to mark VMAs which were removed from the VMA tree. Link: https://lkml.kernel.org/r/20230227173632.3292573-23-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap: prevent pagefault handler from racing with mmu_notifier registrationSuren Baghdasaryan1-0/+9
Page fault handlers might need to fire MMU notifications while a new notifier is being registered. Modify mm_take_all_locks to write-lock all VMAs and prevent this race with page fault handlers that would hold VMA locks. VMAs are locked before i_mmap_rwsem and anon_vma to keep the same locking order as in page fault handlers. Link: https://lkml.kernel.org/r/20230227173632.3292573-22-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm: conditionally write-lock VMA in free_pgtablesSuren Baghdasaryan1-2/+3
Normally free_pgtables needs to lock affected VMAs except for the case when VMAs were isolated under VMA write-lock. munmap() does just that, isolating while holding appropriate locks and then downgrading mmap_lock and dropping per-VMA locks before freeing page tables. Add a parameter to free_pgtables for such scenario. Link: https://lkml.kernel.org/r/20230227173632.3292573-20-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm: write-lock VMAs before removing them from VMA treeSuren Baghdasaryan1-0/+1
Write-locking VMAs before isolating them ensures that page fault handlers don't operate on isolated VMAs. [surenb@google.com: mm/nommu: remove unnecessary VMA locking] Link: https://lkml.kernel.org/r/20230301190457.1498985-1-surenb@google.com Link: https://lore.kernel.org/all/Y%2F8CJQGNuMUTdLwP@localhost/ Link: https://lkml.kernel.org/r/20230227173632.3292573-19-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mremap: write-lock VMA while remapping it to a new address rangeSuren Baghdasaryan1-0/+1
Write-lock VMA as locked before copying it and when copy_vma produces a new VMA. Link: https://lkml.kernel.org/r/20230227173632.3292573-18-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Laurent Dufour <laurent.dufour@fr.ibm.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap: write-lock VMAs in vma_prepare before modifying themSuren Baghdasaryan1-0/+9
Write-lock all VMAs which might be affected by a merge, split, expand or shrink operations. All these operations use vma_prepare() before making the modifications, therefore it provides a centralized place to perform VMA locking. [surenb@google.com: remove unnecessary vp->vma check in vma_prepare] Link: https://lkml.kernel.org/r/20230301022720.1380780-1-surenb@google.com Link: https://lore.kernel.org/r/202302281802.J93Nma7q-lkp@intel.com/ Link: https://lkml.kernel.org/r/20230227173632.3292573-17-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Howells <dhowells@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Laurent Dufour <laurent.dufour@fr.ibm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap: move vma_prepare before vma_adjust_trans_hugeSuren Baghdasaryan1-5/+5
vma_prepare() acquires all locks required before VMA modifications. Move vma_prepare() before vma_adjust_trans_huge() so that VMA is locked before any modification. Link: https://lkml.kernel.org/r/20230227173632.3292573-15-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap/vma_merge: init cleanup, be explicit about the non-mergeable caseLorenzo Stoakes1-25/+22
Rather than setting err = -1 and only resetting if we hit merge cases, explicitly check the non-mergeable case to make it abundantly clear that we only proceed with the rest if something is mergeable, default err to 0 and only update if an error might occur. Move the merge_prev, merge_next cases closer to the logic determining curr, next and reorder initial variables so they are more logically grouped. This has no functional impact. Link: https://lkml.kernel.org/r/99259fbc6403e80e270e1cc4612abbc8620b121b.1679516210.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Vernon Yang <vernon2gm@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap/vma_merge: explicitly assign res, vma, extend invariantsLorenzo Stoakes1-5/+14
Previously, vma was an uninitialised variable which was only definitely assigned as a result of the logic covering all possible input cases - for it to have remained uninitialised, prev would have to be NULL, and next would _have_ to be mergeable. The value of res defaults to NULL, so we can neatly eliminate the assignment to res and vma in the if (prev) block and ensure that both res and vma are both explicitly assigned, by just setting both to prev. In addition we add an explanation as to under what circumstances both might change, and since we absolutely do rely on addr == curr->vm_start should curr exist, assert that this is the case. Link: https://lkml.kernel.org/r/83938bed24422cbe5954bbf491341674becfe567.1679516210.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Vernon Yang <vernon2gm@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap/vma_merge: fold curr, next assignment logicLorenzo Stoakes1-13/+11
Use find_vma_intersection() and vma_lookup() to both simplify the logic and to fold the end == next->vm_start condition into one block. This groups all of the simple range checks together and establishes the invariant that, if prev, curr or next are non-NULL then their positions are as expected. This has no functional impact. Link: https://lkml.kernel.org/r/c6d960641b4ba58fa6ad3d07bf68c27d847963c8.1679516210.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Vernon Yang <vernon2gm@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap/vma_merge: further improve prev/next VMA namingLorenzo Stoakes1-43/+43
Patch series "further cleanup of vma_merge()", v2. Following on from Vlastimil Babka's patch series "cleanup vma_merge() and improve mergeability tests" which was in turn based on Liam's prior cleanups, this patch series introduces changes discussed in review of Vlastimil's series and goes further in attempting to make the logic as clear as possible. Nearly all of this should have absolutely no functional impact, however it does add a singular VM_WARN_ON() case. With many thanks to Vernon for helping kick start the discussion around simplification - abstract use of vma did indeed turn out not to be necessary - and to Liam for his excellent suggestions which greatly simplified things. This patch (of 4): Previously the ASCII diagram above vma_merge() and the accompanying variable naming was rather confusing, however recent efforts by Liam Howlett and Vlastimil Babka have significantly improved matters. This patch goes a little further - replacing 'X' with 'N' which feels a lot more natural and replacing what was 'N' with 'C' which stands for 'concurrent' VMA. No word quite describes a VMA that has coincident start as the input span, concurrent, abbreviated to 'curr' (and which can be thought of also as 'current') however fits intuitions well alongside prev and next. This has no functional impact. Link: https://lkml.kernel.org/r/cover.1679431180.git.lstoakes@gmail.com Link: https://lkml.kernel.org/r/6001e08fa7e119470cbb1d2b6275ad8d742ff9a7.1679431180.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Vernon Yang <vernon2gm@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap: start distinguishing if vma can be removed in mergeability testVlastimil Babka1-5/+10
Since pre-git times, is_mergeable_vma() returns false for a vma with vm_ops->close, so that no owner assumptions are violated in case the vma is removed as part of the merge. This check is currently very conservative and can prevent merging even situations where vma can't be removed, such as simple expansion of previous vma, as evidenced by commit d014cd7c1c35 ("mm, mremap: fix mremap() expanding for vma's with vm_ops->close()") In order to allow more merging when appropriate and simplify the code that was made more complex by commit d014cd7c1c35, start distinguishing cases where the vma can be really removed, and allow merging with vm_ops->close otherwise. As a first step, add a may_remove_vma parameter to is_mergeable_vma(). can_vma_merge_before() sets it to true, because when called from vma_merge(), a removal of the vma is possible. In can_vma_merge_after(), pass the parameter as false, because no removal can occur in each of its callers: - vma_merge() calls it on the 'prev' vma, which is never removed - mmap_region() and do_brk_flags() call it to determine if it can expand a vma, which is not removed As a result, vma's with vm_ops->close may now merge with compatible ranges in more situations than previously. We can also revert commit d014cd7c1c35 as the next step to simplify mremap code again. [vbabka@suse.cz: adjust comment as suggested by Lorenzo] Link: https://lkml.kernel.org/r/74f2ea6c-f1a9-6dd7-260c-25e660f42379@suse.cz Link: https://lkml.kernel.org/r/20230309111258.24079-10-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap/vma_merge: convert mergeability checks to return boolVlastimil Babka1-28/+25
The comments already mention returning 'true' so make the code match them. Link: https://lkml.kernel.org/r/20230309111258.24079-9-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap/vma_merge: rename adj_next to adj_startVlastimil Babka1-8/+8
The variable 'adj_next' holds the value by which we adjust vm_start of a vma in variable 'adjust', that's either 'next' or 'mid', so the current name is inaccurate. Rename it to 'adj_start'. Link: https://lkml.kernel.org/r/20230309111258.24079-8-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap/vma_merge: set mid to NULL if not applicableVlastimil Babka1-8/+15
There are several places where we test if 'mid' is really the area NNNN in the diagram and the tests have two variants and are non-obvious to follow. Instead, set 'mid' to NULL up-front if it's not the NNNN area, and simplify the tests. Also update the description in comment accordingly. [vbabka@suse.cz: adjust/add comments as suggested by Lorenzo] Link: https://lkml.kernel.org/r/def43190-53f7-a607-d1b0-b657565f4288@suse.cz Link: https://lkml.kernel.org/r/20230309111258.24079-7-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap/vma_merge: initialize mid and next in natural orderVlastimil Babka1-4/+5
It is more intuitive to go from prev to mid and then next. No functional change. Link: https://lkml.kernel.org/r/20230309111258.24079-6-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap/vma_merge: use the proper vma pointer in case 4Vlastimil Babka1-4/+4
Almost all cases now use the 'next' pointer for the vma following the merged area, and the cases diagram shows it as XXXX. Case 4 is different as it uses 'mid' and NNNN, so change it for consistency. No functional change. Link: https://lkml.kernel.org/r/20230309111258.24079-5-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap/vma_merge: use the proper vma pointers in cases 1 and 6Vlastimil Babka1-5/+6
Case 1 is now shown in the comment as next vma being merged with prev, so use 'next' instead of 'mid'. In case 1 they both point to the same vma. As a consequence, in case 6, the dup_anon_vma() is now tried first on 'next' and then on 'mid', before it was the opposite order. This is not a functional change, as those two vma's cannnot have a different anon_vma, as that would have prevented the merging in the first place. Link: https://lkml.kernel.org/r/20230309111258.24079-4-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap/vma_merge: use the proper vma pointer in case 3Vlastimil Babka1-4/+5
In case 3 we we use 'next' for everything but vma_pgoff. So use 'next' for that as well, instead of 'mid', for consistency. Then in case 8 we have to use 'mid' explicitly, which should also make the intent more obvious. Adjust the diagram for cases 1-3 in the comment to match the code - we are using 'next' for case 3 so mark the range with XXXX instead of NNNN. For case 2 that's a no-op as the code doesn't touch 'next' or 'mid'. For case 1 it's now wrong but that will be fixed next. No functional change. Link: https://lkml.kernel.org/r/20230309111258.24079-3-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-06mm/mmap/vma_merge: use only primary pointers for preparing mergeVlastimil Babka1-7/+7
Patch series "cleanup vma_merge() and improve mergeability tests". My initial goal here was to try making the check for vm_ops->close in is_mergeable_vma() only be applied for vma's that would be truly removed as part of the merge (see Patch 9). This would then allow reverting the quick fix d014cd7c1c35 ("mm, mremap: fix mremap() expanding for vma's with vm_ops->close()"). This was successful enough to allow the revert (Patch 10). Checks using can_vma_merge_before() are still pessimistic about possible vma removal, and making them precise would probably complicate the vma_merge() code too much. Liam's 6.3-rc1 simplification of vma_merge() and removal of __vma_adjust() was very much helpful in understanding the vma_merge() implementation and especially when vma removals can happen, which is now very obvious. While studing the code, I've found ways to make it hopefully even more easy to follow, so that's the patches 1-8. That made me also notice a bug that's now already fixed in 6.3-rc1. This patch (of 10): In the merging preparation part of vma_merge(), some vma pointer variables are assigned for later execution of the merge, but also read from in the block itself. The code is easier follow and check against the cases diagram in the comment if the code reads only from the "primary" vma variables prev, mid, next instead. No functional change. Link: https://lkml.kernel.org/r/20230309111258.24079-1-vbabka@suse.cz Link: https://lkml.kernel.org/r/20230309111258.24079-2-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>] Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-03-24mm: deduplicate error handling for map_deny_write_execJoey Gouly1-6/+1
Patch series "Fixes for MDWE prctl" These are four small fixes for the recent memory-write-deny-execute prctl patches [1]. Two reported by Alexey about error handling and two tooling fixes by Peter. This patch (of 4): Commit cc8d1b097de7 ("mmap: clean up mmap_region() unrolling") deduplicated the error handling, do the same for the return value of `map_deny_write_exec`. Link: https://lkml.kernel.org/r/20230308190423.46491-1-joey.gouly@arm.com Link: https://lkml.kernel.org/r/20230308190423.46491-2-joey.gouly@arm.com Link: https://lore.kernel.org/linux-arm-kernel/20230119160344.54358-1-joey.gouly@arm.com/ [1] Fixes: b507808ebce2 ("mm: implement memory-deny-write-execute as a prctl") Signed-off-by: Joey Gouly <joey.gouly@arm.com> Reported-by: Alexey Izbyshev <izbyshev@ispras.ru> Link: https://lore.kernel.org/linux-arm-kernel/8408d8901e9d7ee6b78db4c6cba04b78@ispras.ru/ Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: nd <nd@arm.com> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-28mm/mremap: fix dup_anon_vma() in vma_merge() case 4Vlastimil Babka1-1/+1
In case 4, we are shrinking 'prev' (PPPP in the comment) and expanding 'mid' (NNNN). So we need to make sure 'mid' clones the anon_vma from 'prev', if it doesn't have any. After commit 0503ea8f5ba7 ("mm/mmap: remove __vma_adjust()") we can fail to do that due to wrong parameters for dup_anon_vma(). The call is a no-op because res == next, adjust == mid and mid == next. Fix it. Link: https://lkml.kernel.org/r/ad91d62b-37eb-4b73-707a-3c45c9e16256@suse.cz Fixes: 0503ea8f5ba7 ("mm/mmap: remove __vma_adjust()") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm: introduce __vm_flags_mod and use it in untrack_pfnSuren Baghdasaryan1-6/+10
There are scenarios when vm_flags can be modified without exclusive mmap_lock, such as: - after VMA was isolated and mmap_lock was downgraded or dropped - in exit_mmap when there are no other mm users and locking is unnecessary Introduce __vm_flags_mod to avoid assertions when the caller takes responsibility for the required locking. Pass a hint to untrack_pfn to conditionally use __vm_flags_mod for flags modification to avoid assertion. Link: https://lkml.kernel.org/r/20230126193752.297968-7-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjun Roy <arjunroy@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: David Rientjes <rientjes@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Thelen <gthelen@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@google.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Peter Oskolkov <posk@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Punit Agrawal <punit.agrawal@bytedance.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Sebastian Reichel <sebastian.reichel@collabora.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Song Liu <songliubraving@fb.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm: replace vma->vm_flags direct modifications with modifier callsSuren Baghdasaryan1-5/+5
Replace direct modifications to vma->vm_flags with calls to modifier functions to be able to track flag changes and to keep vma locking correctness. [akpm@linux-foundation.org: fix drivers/misc/open-dice.c, per Hyeonggon Yoo] Link: https://lkml.kernel.org/r/20230126193752.297968-5-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Sebastian Reichel <sebastian.reichel@collabora.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjun Roy <arjunroy@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: David Rientjes <rientjes@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Thelen <gthelen@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Minchan Kim <minchan@google.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Peter Oskolkov <posk@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Punit Agrawal <punit.agrawal@bytedance.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Shakeel Butt <shakeelb@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Song Liu <songliubraving@fb.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm: replace VM_LOCKED_CLEAR_MASK with VM_LOCKED_MASKSuren Baghdasaryan1-3/+3
To simplify the usage of VM_LOCKED_CLEAR_MASK in vm_flags_clear(), replace it with VM_LOCKED_MASK bitmask and convert all users. Link: https://lkml.kernel.org/r/20230126193752.297968-4-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjun Roy <arjunroy@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Thelen <gthelen@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Minchan Kim <minchan@google.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Peter Oskolkov <posk@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Punit Agrawal <punit.agrawal@bytedance.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Sebastian Reichel <sebastian.reichel@collabora.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Song Liu <songliubraving@fb.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10vma_merge: set vma iterator to correct position.Liam R. Howlett1-3/+1
When merging the previous value, set the vma iterator to the previous slot. Don't use the vma iterator to get the next/prev so that it is in the correct position for a write. Link: https://lkml.kernel.org/r/20230120162650.984577-50-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm/mmap: remove __vma_adjust()Liam R. Howlett1-153/+97
Inline the work of __vma_adjust() into vma_merge(). This reduces code size and has the added benefits of the comments for the cases being located with the code. Change the comments referencing vma_adjust() accordingly. [Liam.Howlett@oracle.com: fix vma_merge() offset when expanding the next vma] Link: https://lkml.kernel.org/r/20230130195713.2881766-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20230120162650.984577-49-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm/mmap: convert do_brk_flags() to use vma_prepare() and vma_complete()Liam R. Howlett1-8/+4
Use the abstracted vma locking for do_brk_flags() Link: https://lkml.kernel.org/r/20230120162650.984577-48-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm/mmap: introduce dup_vma_anon() helperLiam R. Howlett1-34/+40
Create a helper for duplicating the anon vma when adjusting the vma. This simplifies the logic of __vma_adjust(). Link: https://lkml.kernel.org/r/20230120162650.984577-47-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm/mmap: don't use __vma_adjust() in shift_arg_pages()Liam R. Howlett1-9/+43
Introduce shrink_vma() which uses the vma_prepare() and vma_complete() functions to reduce the vma coverage. Convert shift_arg_pages() to use expand_vma() and the new shrink_vma() function. Remove support from __vma_adjust() to reduce a vma size since shift_arg_pages() is the only user that shrinks a VMA in this way. Link: https://lkml.kernel.org/r/20230120162650.984577-46-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm/mremap: convert vma_adjust() to vma_expand()Liam R. Howlett1-3/+3
Stop using vma_adjust() in preparation for removing the function. Export vma_expand() to use instead. Link: https://lkml.kernel.org/r/20230120162650.984577-45-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm: don't use __vma_adjust() in __split_vma()Liam R. Howlett1-63/+55
Use the abstracted locking and maple tree operations. Since __split_vma() is the only user of the __vma_adjust() function to use the insert argument, drop that argument. Remove the NULL passed through from fs/exec's shift_arg_pages() and mremap() at the same time. Link: https://lkml.kernel.org/r/20230120162650.984577-44-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm/mmap: introduce init_vma_prep() and init_multi_vma_prep()Liam R. Howlett1-47/+61
Add init_vma_prep() and init_multi_vma_prep() to set up the struct vma_prepare. This is to abstract the locking when adjusting the VMAs. Also change __vma_adjust() variable remove_next int in favour of a pointer to the VMA to remove. Rename next_next to remove2 since this better reflects its use. Link: https://lkml.kernel.org/r/20230120162650.984577-43-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm/mmap: use vma_prepare() and vma_complete() in vma_expand()Liam R. Howlett1-116/+72
Use the new locking functions for vma_expand(). This reduces code duplication. At the same time change VM_BUG_ON() to VM_WARN_ON() Link: https://lkml.kernel.org/r/20230120162650.984577-42-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm/mmap: refactor locking out of __vma_adjust()Liam R. Howlett1-95/+136
Move the locking into vma_prepare() and vma_complete() for use elsewhere Link: https://lkml.kernel.org/r/20230120162650.984577-41-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm/mmap: move anon_vma setting in __vma_adjust()Liam R. Howlett1-5/+8
Move the anon_vma setting & warn_no up the function. This is done to clear up the locking later. Link: https://lkml.kernel.org/r/20230120162650.984577-40-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm: change munmap splitting order and move_vma()Liam R. Howlett1-16/+2
Splitting can be more efficient when the order is not of concern. Change do_vmi_align_munmap() to reduce walking of the tree during split operations. move_vma() must also be altered to remove the dependency of keeping the original VMA as the active part of the split. Transition to using vma iterator to look up the prev and/or next vma after munmap. [Liam.Howlett@oracle.com: fix vma iterator initialization] Link: https://lkml.kernel.org/r/20230126212011.980350-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20230120162650.984577-39-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mmap: clean up mmap_region() unrollingLiam R. Howlett1-27/+18
Move logic of unrolling to the error path as apposed to duplicating it within the function body. This reduces the potential of missing an update to one path when making changes. Link: https://lkml.kernel.org/r/20230120162650.984577-38-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Li Zetao <lizetao1@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm: add vma iterator to vma_adjust() argumentsLiam R. Howlett1-5/+5
Change the vma_adjust() function definition to accept the vma iterator and pass it through to __vma_adjust(). Update fs/exec to use the new vma_adjust() function parameters. Update mm/mremap to use the new vma_adjust() function parameters. Revert the __split_vma() calls back from __vma_adjust() to vma_adjust() and pass through the vma iterator. Link: https://lkml.kernel.org/r/20230120162650.984577-37-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm: pass vma iterator through to __vma_adjust()Liam R. Howlett1-8/+14
Pass the iterator through to be used in __vma_adjust(). The state of the iterator needs to be correct for the operation that will occur so make the adjustments. Link: https://lkml.kernel.org/r/20230120162650.984577-36-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm: remove unnecessary write to vma iterator in __vma_adjust()Liam R. Howlett1-4/+6
If the vma start address is going to change due to an insert, then it is safe to not write the vma to the tree. The write of the insert vma will alter the tree as necessary. Link: https://lkml.kernel.org/r/20230120162650.984577-35-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm: pass through vma iterator to __vma_adjust()Liam R. Howlett1-16/+15
Pass the vma iterator through to __vma_adjust() so the state can be updated. Link: https://lkml.kernel.org/r/20230120162650.984577-33-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mmap: convert __vma_adjust() to use vma iteratorLiam R. Howlett1-62/+13
Use the vma iterator internally for __vma_adjust(). Avoid using the maple tree interface directly for type safety. Link: https://lkml.kernel.org/r/20230120162650.984577-32-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm: switch vma_merge(), split_vma(), and __split_vma to vma iteratorLiam R. Howlett1-57/+22
Drop the vmi_* functions and transition all users to use the vma iterator directly. Link: https://lkml.kernel.org/r/20230120162650.984577-30-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mmap: use vmi version of vma_merge()Liam R. Howlett1-3/+5
Use the vma iterator so that the iterator can be invalidated or updated to avoid each caller doing so. Link: https://lkml.kernel.org/r/20230120162650.984577-26-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mmap: pass through vmi iterator to __split_vma()Liam R. Howlett1-2/+2
Use the vma iterator so that the iterator can be invalidated or updated to avoid each caller doing so. Link: https://lkml.kernel.org/r/20230120162650.984577-25-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10ipc/shm: introduce new do_vma_munmap() to munmapLiam R. Howlett1-20/+18
The shm already has the vma iterator in position for a write. do_vmi_munmap() searches for the correct position and aligns the write, so it is not the right function to use in this case. The shm VMA tree modification is similar to the brk munmap situation, the vma iterator is in position and the VMA is already known. This patch generalizes the brk munmap function do_brk_munmap() to be used for any other callers with the vma iterator already in position to munmap a VMA. Link: https://lkml.kernel.org/r/20230126212049.980501-1-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Sven Schnelle <svens@linux.ibm.com> Link: https://lore.kernel.org/linux-mm/yt9dh6wec21a.fsf@linux.ibm.com/ Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mm: add temporary vma iterator versions of vma_merge(), split_vma(), and ↵Liam R. Howlett1-0/+44
__split_vma() These wrappers are short-lived in this patch set so that each user can be converted on its own. In the end, these functions are renamed in one commit. Link: https://lkml.kernel.org/r/20230120162650.984577-15-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-10mmap: convert vma_expand() to use vma iteratorLiam R. Howlett1-5/+4
Use the vma iterator instead of the maple state for type safety and for consistency through the mm code. Link: https://lkml.kernel.org/r/20230120162650.984577-14-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>