<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/mm/mseal.c, branch linux-7.1.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-7.1.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-7.1.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-04-05T20:53:41+00:00</updated>
<entry>
<title>mm/vma: convert vma_modify_flags[_uffd]() to use vma_flags_t</title>
<updated>2026-04-05T20:53:41+00:00</updated>
<author>
<name>Lorenzo Stoakes (Oracle)</name>
<email>ljs@kernel.org</email>
</author>
<published>2026-03-20T19:38:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a06eb2f8279e0b2b42799d42041f144377f5a086'/>
<id>urn:sha1:a06eb2f8279e0b2b42799d42041f144377f5a086</id>
<content type='text'>
Update the vma_modify_flags() and vma_modify_flags_uffd() functions to
accept a vma_flags_t parameter rather than a vm_flags_t one, and propagate
the changes as needed to implement this change.

Also add vma_flags_reset_once() in replacement of vm_flags_reset_once(). We
still need to be careful here because we need to avoid tearing, so maintain
the assumption that the first system word set of flags are the only ones
that require protection from tearing, and retain this functionality.

We can copy the remainder of VMA flags above 64 bits normally. But
hopefully by the time that happens, we will have replaced the logic that
requires these WRITE_ONCE()'s with something else.

We also replace instances of vm_flags_reset() with a simple write of VMA
flags. We are no longer perform a number of checks, most notable of all the
VMA flags asserts becase:

1. We might be operating on a VMA that is not yet added to the tree.

2. We might be operating on a VMA that is now detached.

3. Really in all but core code, you should be using vma_desc_xxx().

4. Other VMA fields are manipulated with no such checks.

5. It'd be egregious to have to add variants of flag functions just to
   account for cases such as the above, especially when we don't do so for
   other VMA fields. Drivers are the problematic cases and why it was
   especially important (and also for debug as VMA locks were introduced),
   the mmap_prepare work is solving this generally.

Additionally, we can fairly safely assume by this point the soft dirty
flags are being set correctly, so it's reasonable to drop this also.

Finally, update the VMA tests to reflect this.

Link: https://lkml.kernel.org/r/51afbb2b8c3681003cc7926647e37335d793836e.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) &lt;ljs@kernel.org&gt;
Acked-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
Cc: Albert Ou &lt;aou@eecs.berkeley.edu&gt;
Cc: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Cc: Alexandre Ghiti &lt;alex@ghiti.fr&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Anton Ivanov &lt;anton.ivanov@cambridgegreys.com&gt;
Cc: "Borislav Petkov (AMD)" &lt;bp@alien8.de&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Chengming Zhou &lt;chengming.zhou@linux.dev&gt;
Cc: Christian Borntraeger &lt;borntraeger@linux.ibm.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: David Hildenbrand &lt;david@kernel.org&gt;
Cc: Dinh Nguyen &lt;dinguyen@kernel.org&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Huacai Chen &lt;chenhuacai@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Johannes Berg &lt;johannes@sipsolutions.net&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Cc: Liam Howlett &lt;liam.howlett@oracle.com&gt;
Cc: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Mike Rapoport &lt;rppt@kernel.org&gt;
Cc: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Cc: Ondrej Mosnacek &lt;omosnace@redhat.com&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: Paul Moore &lt;paul@paul-moore.com&gt;
Cc: Pedro Falcato &lt;pfalcato@suse.de&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Stephen Smalley &lt;stephen.smalley.work@gmail.com&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Cc: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Vineet Gupta &lt;vgupta@kernel.org&gt;
Cc: WANG Xuerui &lt;kernel@xen0n.name&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: xu xin &lt;xu.xin16@zte.com.cn&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/mseal: update VMA end correctly on merge</title>
<updated>2026-04-05T20:53:38+00:00</updated>
<author>
<name>Lorenzo Stoakes (Oracle)</name>
<email>ljs@kernel.org</email>
</author>
<published>2026-03-27T17:31:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5ac9c7c2efd0d3c0c2d3bc6e9cd900d3ab6af27a'/>
<id>urn:sha1:5ac9c7c2efd0d3c0c2d3bc6e9cd900d3ab6af27a</id>
<content type='text'>
Previously we stored the end of the current VMA in curr_end, and then upon
iterating to the next VMA updated curr_start to curr_end to advance to the
next VMA.

However, this doesn't take into account the fact that a VMA might be
updated due to a merge by vma_modify_flags(), which can result in curr_end
being stale and thus, upon setting curr_start to curr_end, ending up with
an incorrect curr_start on the next iteration.

Resolve the issue by setting curr_end to vma-&gt;vm_end unconditionally to
ensure this value remains updated should this occur.

While we're here, eliminate this entire class of bug by simply setting
const curr_[start/end] to be clamped to the input range and VMAs, which
also happens to simplify the logic.

Link: https://lkml.kernel.org/r/20260327173104.322405-1-ljs@kernel.org
Fixes: 6c2da14ae1e0 ("mm/mseal: rework mseal apply logic")
Signed-off-by: Lorenzo Stoakes (Oracle) &lt;ljs@kernel.org&gt;
Reported-by: Antonius &lt;antonius@bluedragonsec.com&gt;
Closes: https://lore.kernel.org/linux-mm/CAK8a0jwWGj9-SgFk0yKFh7i8jMkwKm5b0ao9=kmXWjO54veX2g@mail.gmail.com/
Suggested-by: David Hildenbrand (ARM) &lt;david@kernel.org&gt;
Acked-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
Reviewed-by: Pedro Falcato &lt;pfalcato@suse.de&gt;
Acked-by: David Hildenbrand (Arm) &lt;david@kernel.org&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Jeff Xu &lt;jeffxu@chromium.org&gt;
Cc: Liam Howlett &lt;liam.howlett@oracle.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix minor spelling mistakes in comments</title>
<updated>2026-01-21T03:24:48+00:00</updated>
<author>
<name>Kevin Lourenco</name>
<email>klourencodev@gmail.com</email>
</author>
<published>2025-12-18T15:09:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=62451ae347b0015bf3d644c97cbc14e75a8287e6'/>
<id>urn:sha1:62451ae347b0015bf3d644c97cbc14e75a8287e6</id>
<content type='text'>
Correct several typos in comments across files in mm/

[akpm@linux-foundation.org: also fix comment grammar, per SeongJae]
Link: https://lkml.kernel.org/r/20251218150906.25042-1-klourencodev@gmail.com
Signed-off-by: Kevin Lourenco &lt;klourencodev@gmail.com&gt;
Reviewed-by: SeongJae Park &lt;sj@kernel.org&gt;
Acked-by: David Hildenbrand (Red Hat) &lt;david@kernel.org&gt;
Reviewed-by: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: update vma_modify_flags() to handle residual flags, document</title>
<updated>2025-11-20T21:43:58+00:00</updated>
<author>
<name>Lorenzo Stoakes</name>
<email>lorenzo.stoakes@oracle.com</email>
</author>
<published>2025-11-18T10:17:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9119d6c2095bb20292cb9812dd70d37f17e3bd37'/>
<id>urn:sha1:9119d6c2095bb20292cb9812dd70d37f17e3bd37</id>
<content type='text'>
The vma_modify_*() family of functions each either perform splits, a merge
or no changes at all in preparation for the requested modification to
occur.

When doing so for a VMA flags change, we currently don't account for any
flags which may remain (for instance, VM_SOFTDIRTY) despite the requested
change in the case that a merge succeeded.

This is made more important by subsequent patches which will introduce the
concept of sticky VMA flags which rely on this behaviour.

This patch fixes this by passing the VMA flags parameter as a pointer and
updating it accordingly on merge and updating callers to accommodate for
this.

Additionally, while we are here, we add kdocs for each of the
vma_modify_*() functions, as the fact that the requested modification is
not performed is confusing so it is useful to make this abundantly clear.

We also update the VMA userland tests to account for this change.

Link: https://lkml.kernel.org/r/23b5b549b0eaefb2922625626e58c2a352f3e93c.1763460113.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Reviewed-by: Pedro Falcato &lt;pfalcato@suse.de&gt;
Reviewed-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Andrei Vagin &lt;avagin@gmail.com&gt;
Cc: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Cc: Barry Song &lt;baohua@kernel.org&gt;
Cc: David Hildenbrand (Red Hat) &lt;david@kernel.org&gt;
Cc: Dev Jain &lt;dev.jain@arm.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Lance Yang &lt;lance.yang@linux.dev&gt;
Cc: Liam Howlett &lt;liam.howlett@oracle.com&gt;
Cc: "Masami Hiramatsu (Google)" &lt;mhiramat@kernel.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Mike Rapoport &lt;rppt@kernel.org&gt;
Cc: Nico Pache &lt;npache@redhat.com&gt;
Cc: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Zi Yan &lt;ziy@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/mseal: rework mseal apply logic</title>
<updated>2025-08-02T19:06:09+00:00</updated>
<author>
<name>Lorenzo Stoakes</name>
<email>lorenzo.stoakes@oracle.com</email>
</author>
<published>2025-07-25T08:29:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6c2da14ae1e0a0146587381594559027bd46c059'/>
<id>urn:sha1:6c2da14ae1e0a0146587381594559027bd46c059</id>
<content type='text'>
The logic can be simplified - firstly by renaming the inconsistently named
apply_mm_seal() to mseal_apply().

We then wrap mseal_fixup() into the main loop as the logic is simple
enough to not require it, equally it isn't a hugely pleasant pattern in
mprotect() etc.  so it's not something we want to perpetuate.

We eliminate the need for invoking vma_iter_end() on each loop by directly
determining if the VMA was merged - the only thing we need concern
ourselves with is whether the start/end of the (gapless) range are offset
into VMAs.

This refactoring also avoids the rather horrid 'pass pointer to prev
around' pattern used in mprotect() et al.

No functional change intended.

Link: https://lkml.kernel.org/r/ddfa4376ce29f19a589d7dc8c92cb7d4f7605a4c.1753431105.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Reviewed-by: Pedro Falcato &lt;pfalcato@suse.de&gt;
Reviewed-by: Liam R. Howlett &lt;Liam.Howlett@oracle.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Jeff Xu &lt;jeffxu@chromium.org&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/mseal: simplify and rename VMA gap check</title>
<updated>2025-08-02T19:06:09+00:00</updated>
<author>
<name>Lorenzo Stoakes</name>
<email>lorenzo.stoakes@oracle.com</email>
</author>
<published>2025-07-25T08:29:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=530e090964130d538dfa74874012ca461ef692fa'/>
<id>urn:sha1:530e090964130d538dfa74874012ca461ef692fa</id>
<content type='text'>
The check_mm_seal() function is doing something general - checking whether
a range contains only VMAs (or rather that it does NOT contain any
unmapped regions).

So rename this function to range_contains_unmapped().

Additionally simplify the logic, we are simply checking whether the last
vma-&gt;vm_end has either a VMA starting after it or ends before the end
parameter.

This check is rather dubious, so it is sensible to keep it local to
mm/mseal.c as at a later stage it may be removed, and we don't want any
other mm code to perform such a check.

No functional change intended.

[lorenzo.stoakes@oracle.com: add comment explaining why we disallow gaps on mseal()]
  Link: https://lkml.kernel.org/r/d85b3d55-09dc-43ba-8204-b48267a96751@lucifer.local
Link: https://lkml.kernel.org/r/dd50984eff1e242b5f7f0f070a3360ef760e06b8.1753431105.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Reviewed-by: Liam R. Howlett &lt;Liam.Howlett@oracle.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Jeff Xu &lt;jeffxu@chromium.org&gt;
Reviewed-by: Pedro Falcato &lt;pfalcato@suse.de&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/mseal: small cleanups</title>
<updated>2025-08-02T19:06:09+00:00</updated>
<author>
<name>Lorenzo Stoakes</name>
<email>lorenzo.stoakes@oracle.com</email>
</author>
<published>2025-07-25T08:29:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8b2914162aa3a56062d4b7c716149946672d48a6'/>
<id>urn:sha1:8b2914162aa3a56062d4b7c716149946672d48a6</id>
<content type='text'>
Drop the wholly unnecessary set_vma_sealed() helper(), which is used only
once, and place VMA_ITERATOR() declarations in the correct place.

Retain vma_is_sealed(), and use it instead of the confusingly named
can_modify_vma(), so it's abundantly clear what's being tested, rather
then a nebulous sense of 'can the VMA be modified'.

No functional change intended.

Link: https://lkml.kernel.org/r/98cf28d04583d632a6eb698e9ad23733bb6af26b.1753431105.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Reviewed-by: Liam R. Howlett &lt;Liam.Howlett@oracle.com&gt;
Reviewed-by: Pedro Falcato &lt;pfalcato@suse.de&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Jeff Xu &lt;jeffxu@chromium.org&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/mseal: update madvise() logic</title>
<updated>2025-08-02T19:06:09+00:00</updated>
<author>
<name>Lorenzo Stoakes</name>
<email>lorenzo.stoakes@oracle.com</email>
</author>
<published>2025-07-25T08:29:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d0b47a6866f1047247061f3a38f12a981825b265'/>
<id>urn:sha1:d0b47a6866f1047247061f3a38f12a981825b265</id>
<content type='text'>
The madvise() logic is inexplicably performed in mm/mseal.c - this ought
to be located in mm/madvise.c.

Additionally can_modify_vma_madv() is inconsistently named and, in
combination with is_ro_anon(), is very confusing logic.

Put a static function in mm/madvise.c instead - can_madvise_modify() -
that spells out exactly what's happening.  Also explicitly check for an
anon VMA.

Also add commentary to explain what's going on.

Essentially - we disallow discarding of data in mseal()'d mappings in
instances where the user couldn't otherwise write to that data.

We retain the existing behaviour here regarding MAP_PRIVATE mappings of
file-backed mappings, which entails some complexity - while this, strictly
speaking - appears to violate mseal() semantics, it may interact badly
with users which expect to be able to madvise(MADV_DONTNEED) .text
mappings for instance.

We may revisit this at a later date.

No functional change intended.

Link: https://lkml.kernel.org/r/492a98d9189646e92c8f23f4cce41ed323fe01df.1753431105.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Reviewed-by: Liam R. Howlett &lt;Liam.Howlett@oracle.com&gt;
Reviewed-by: Pedro Falcato &lt;pfalcato@suse.de&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Jeff Xu &lt;jeffxu@chromium.org&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mseal: remove can_do_mseal()</title>
<updated>2025-01-14T06:40:51+00:00</updated>
<author>
<name>Jeff Xu</name>
<email>jeffxu@chromium.org</email>
</author>
<published>2024-12-06T19:48:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b487c02430104a2df7ecee8dabaf3b3b66e536bf'/>
<id>urn:sha1:b487c02430104a2df7ecee8dabaf3b3b66e536bf</id>
<content type='text'>
No code logic change.

can_do_mseal() is called exclusively by mseal.c, and mseal.c is compiled
only when CONFIG_64BIT flag is set in makefile.  Therefore, it is
unnecessary to have 32 bit stub function in the header file, remove this
function and merge the logic into do_mseal().

Link: https://lkml.kernel.org/r/20241206013934.2782793-1-jeffxu@google.com
Link: https://lkml.kernel.org/r/20241206194839.3030596-2-jeffxu@google.com
Signed-off-by: Jeff Xu &lt;jeffxu@chromium.org&gt;
Reviewed-by: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Cc: Jorge Lucangeli Obes &lt;jorgelo@chromium.org&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Liam R. Howlett &lt;Liam.Howlett@Oracle.com&gt;
Cc: Pedro Falcato &lt;pedro.falcato@gmail.com&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: madvise: implement lightweight guard page mechanism</title>
<updated>2024-11-11T08:26:45+00:00</updated>
<author>
<name>Lorenzo Stoakes</name>
<email>lorenzo.stoakes@oracle.com</email>
</author>
<published>2024-10-28T14:13:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=662df3e5c37666d6ed75c88098699e070a4b35b5'/>
<id>urn:sha1:662df3e5c37666d6ed75c88098699e070a4b35b5</id>
<content type='text'>
Implement a new lightweight guard page feature, that is regions of
userland virtual memory that, when accessed, cause a fatal signal to
arise.

Currently users must establish PROT_NONE ranges to achieve this.

However this is very costly memory-wise - we need a VMA for each and every
one of these regions AND they become unmergeable with surrounding VMAs.

In addition repeated mmap() calls require repeated kernel context switches
and contention of the mmap lock to install these ranges, potentially also
having to unmap memory if installed over existing ranges.

The lightweight guard approach eliminates the VMA cost altogether - rather
than establishing a PROT_NONE VMA, it operates at the level of page table
entries - establishing PTE markers such that accesses to them cause a
fault followed by a SIGSGEV signal being raised.

This is achieved through the PTE marker mechanism, which we have already
extended to provide PTE_MARKER_GUARD, which we installed via the generic
page walking logic which we have extended for this purpose.

These guard ranges are established with MADV_GUARD_INSTALL.  If the range
in which they are installed contain any existing mappings, they will be
zapped, i.e.  free the range and unmap memory (thus mimicking the
behaviour of MADV_DONTNEED in this respect).

Any existing guard entries will be left untouched.  There is therefore no
nesting of guarded pages.

Guarded ranges are NOT cleared by MADV_DONTNEED nor MADV_FREE (in both
instances the memory range may be reused at which point a user would
expect guards to still be in place), but they are cleared via
MADV_GUARD_REMOVE, process teardown or unmapping of memory ranges.

The guard property can be removed from ranges via MADV_GUARD_REMOVE.  The
ranges over which this is applied, should they contain non-guard entries,
will be untouched, with only guard entries being cleared.

We permit this operation on anonymous memory only, and only VMAs which are
non-special, non-huge and not mlock()'d (if we permitted this we'd have to
drop locked pages which would be rather counterintuitive).

Racing page faults can cause repeated attempts to install guard pages that
are interrupted, result in a zap, and this process can end up being
repeated.  If this happens more than would be expected in normal
operation, we rescind locks and retry the whole thing, which avoids lock
contention in this scenario.

Link: https://lkml.kernel.org/r/6aafb5821bf209f277dfae0787abb2ef87a37542.1730123433.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Suggested-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Suggested-by: Jann Horn &lt;jannh@google.com&gt;
Suggested-by: David Hildenbrand &lt;david@redhat.com&gt;
Suggested-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Suggested-by: Jann Horn &lt;jannh@google.com&gt;
Suggested-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Arnd Bergmann &lt;arnd@kernel.org&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: James E.J. Bottomley &lt;James.Bottomley@HansenPartnership.com&gt;
Cc: Jeff Xu &lt;jeffxu@chromium.org&gt;
Cc: John Hubbard &lt;jhubbard@nvidia.com&gt;
Cc: Liam R. Howlett &lt;Liam.Howlett@Oracle.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Max Filippov &lt;jcmvbkbc@gmail.com&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Richard Henderson &lt;richard.henderson@linaro.org&gt;
Cc: Shuah Khan &lt;shuah@kernel.org&gt;
Cc: Shuah Khan &lt;skhan@linuxfoundation.org&gt;
Cc: Sidhartha Kumar &lt;sidhartha.kumar@oracle.com&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
