<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/mm/ksm.c, branch v7.1</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.1</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.1'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-04-05T20:53:40+00:00</updated>
<entry>
<title>mm: convert do_brk_flags() to use vma_flags_t</title>
<updated>2026-04-05T20:53:40+00:00</updated>
<author>
<name>Lorenzo Stoakes (Oracle)</name>
<email>ljs@kernel.org</email>
</author>
<published>2026-03-20T19:38:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3a6455d56bd7c4cfb1ea35ddae052943065e338e'/>
<id>urn:sha1:3a6455d56bd7c4cfb1ea35ddae052943065e338e</id>
<content type='text'>
In order to be able to do this, we need to change VM_DATA_DEFAULT_FLAGS
and friends and update the architecture-specific definitions also.

We then have to update some KSM logic to handle VMA flags, and introduce
VMA_STACK_FLAGS to define the vma_flags_t equivalent of VM_STACK_FLAGS.

We also introduce two helper functions for use during the time we are
converting legacy flags to vma_flags_t values - vma_flags_to_legacy() and
legacy_to_vma_flags().

This enables us to iteratively make changes to break these changes up into
separate parts.

We use these explicitly here to keep VM_STACK_FLAGS around for certain
users which need to maintain the legacy vm_flags_t values for the time
being.

We are no longer able to rely on the simple VM_xxx being set to zero if
the feature is not enabled, so in the case of VM_DROPPABLE we introduce
VMA_DROPPABLE as the vma_flags_t equivalent, which is set to
EMPTY_VMA_FLAGS if the droppable flag is not available.

While we're here, we make the description of do_brk_flags() into a kdoc
comment, as it almost was already.

We use vma_flags_to_legacy() to not need to update the vm_get_page_prot()
logic as this time.

Note that in create_init_stack_vma() we have to replace the BUILD_BUG_ON()
with a VM_WARN_ON_ONCE() as the tested values are no longer build time
available.

We also update mprotect_fixup() to use VMA flags where possible, though we
have to live with a little duplication between vm_flags_t and vma_flags_t
values for the time being until further conversions are made.

While we're here, update VM_SPECIAL to be defined in terms of
VMA_SPECIAL_FLAGS now we have vma_flags_to_legacy().

Finally, we update the VMA tests to reflect these changes.

Link: https://lkml.kernel.org/r/d02e3e45d9a33d7904b149f5604904089fd640ae.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) &lt;ljs@kernel.org&gt;
Acked-by: Paul Moore &lt;paul@paul-moore.com&gt;	[SELinux]
Acked-by: Vlastimil Babka (SUSE) &lt;vbabka@kernel.org&gt;
Cc: Albert Ou &lt;aou@eecs.berkeley.edu&gt;
Cc: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Cc: Alexandre Ghiti &lt;alex@ghiti.fr&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Anton Ivanov &lt;anton.ivanov@cambridgegreys.com&gt;
Cc: "Borislav Petkov (AMD)" &lt;bp@alien8.de&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Chengming Zhou &lt;chengming.zhou@linux.dev&gt;
Cc: Christian Borntraeger &lt;borntraeger@linux.ibm.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: David Hildenbrand &lt;david@kernel.org&gt;
Cc: Dinh Nguyen &lt;dinguyen@kernel.org&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Huacai Chen &lt;chenhuacai@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Johannes Berg &lt;johannes@sipsolutions.net&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Cc: Liam Howlett &lt;liam.howlett@oracle.com&gt;
Cc: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Mike Rapoport &lt;rppt@kernel.org&gt;
Cc: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Cc: Ondrej Mosnacek &lt;omosnace@redhat.com&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: Pedro Falcato &lt;pfalcato@suse.de&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Stephen Smalley &lt;stephen.smalley.work@gmail.com&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Cc: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Vineet Gupta &lt;vgupta@kernel.org&gt;
Cc: WANG Xuerui &lt;kernel@xen0n.name&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: xu xin &lt;xu.xin16@zte.com.cn&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ksm: initialize the addr only once in rmap_walk_ksm</title>
<updated>2026-04-05T20:52:35+00:00</updated>
<author>
<name>xu xin</name>
<email>xu.xin16@zte.com.cn</email>
</author>
<published>2026-02-12T11:29:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=318d87b8fa733bebdc9c803657671df0b7b8b006'/>
<id>urn:sha1:318d87b8fa733bebdc9c803657671df0b7b8b006</id>
<content type='text'>
This is a minor performance optimization, especially when there are many
for-loop iterations, because the addr variable doesn't change across
iterations.

Therefore, it only needs to be initialized once before the loop.

Link: https://lkml.kernel.org/r/20260212192820223O_r2NQzSEPG_C56cs-z4l@zte.com.cn
Link: https://lkml.kernel.org/r/20260212192932941MSsJEAyoRW4YdLBN7_myn@zte.com.cn
Signed-off-by: xu xin &lt;xu.xin16@zte.com.cn&gt;
Acked-by: David Hildenbrand (Arm) &lt;david@kernel.org&gt;
Cc: Chengming Zhou &lt;chengming.zhou@linux.dev&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Wang Yaxin &lt;wang.yaxin@zte.com.cn&gt;
Cc: Yang Yang &lt;yang.yang29@zte.com.cn&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Convert more 'alloc_obj' cases to default GFP_KERNEL arguments</title>
<updated>2026-02-22T04:03:00+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T04:03:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=32a92f8c89326985e05dce8b22d3f0aa07a3e1bd'/>
<id>urn:sha1:32a92f8c89326985e05dce8b22d3f0aa07a3e1bd</id>
<content type='text'>
This converts some of the visually simpler cases that have been split
over multiple lines.  I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.

Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script.  I probably had made it a bit _too_ trivial.

So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.

The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>treewide: Replace kmalloc with kmalloc_obj for non-scalar types</title>
<updated>2026-02-21T09:02:28+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-21T07:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=69050f8d6d075dc01af7a5f2f550a8067510366f'/>
<id>urn:sha1:69050f8d6d075dc01af7a5f2f550a8067510366f</id>
<content type='text'>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm/ksm: fix pte_unmap_unlock of wrong address in break_ksm_pmd_entry</title>
<updated>2025-12-23T19:23:17+00:00</updated>
<author>
<name>Sasha Levin</name>
<email>sashal@kernel.org</email>
</author>
<published>2025-12-20T20:29:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d6b5a8d6f142ad0a8e45181f06e70b4746c4abc3'/>
<id>urn:sha1:d6b5a8d6f142ad0a8e45181f06e70b4746c4abc3</id>
<content type='text'>
On ARM32 with HIGHMEM/HIGHPTE, break_ksm_pmd_entry() triggers a BUG during
KSM unmerging because pte_unmap_unlock() is passed a pointer that may be
beyond the mapped PTE page.

The issue occurs when the PTE iteration loop completes without finding a
KSM page.  After the loop, 'ptep' has been incremented past the last PTE
entry.  On ARM32 LPAE with 512 PTEs per page (512 * 8 = 4096 bytes), this
means ptep points to the next page, outside the kmap'd region.

When pte_unmap_unlock(ptep, ptl) calls kunmap_local(ptep), it unmaps the
wrong page address, leaving the original kmap slot still mapped.  The next
kmap_local then finds this slot unexpectedly occupied:

  WARNING: mm/highmem.c:622 kunmap_local_indexed  (address mismatch)
  kernel BUG at mm/highmem.c:564  __kmap_local_pfn_prot  (slot not empty)

Fix this by passing start_ptep to pte_unmap_unlock(), which always points
within the originally mapped PTE page.

Reproducer: Run LTP ksm03 test on ARM32 with HIGHMEM enabled.  The test
triggers KSM merging followed by unmerging (writing 0 then 2 to
/sys/kernel/mm/ksm/run), which exercises break_ksm_pmd_entry().

Link: https://lkml.kernel.org/r/20251220202926.318366-1-sashal@kernel.org
Fixes: 5d4939fc2258 ("ksm: perform a range-walk in break_ksm")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Assisted-by: claude-opus-4-5-20251101
Acked-by: David Hildenbrand (Red Hat) &lt;david@kernel.org&gt;
Reviewed-by: Chengming Zhou &lt;chengming.zhou@linux.dev&gt;
Cc: Pedro Demarchi Gomes &lt;pedrodemargomes@gmail.com&gt;
Cc: xu xin &lt;xu.xin16@zte.com.cn&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: eliminate further swapops predicates</title>
<updated>2025-11-24T23:08:52+00:00</updated>
<author>
<name>Lorenzo Stoakes</name>
<email>lorenzo.stoakes@oracle.com</email>
</author>
<published>2025-11-10T22:21:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=93976a20345b4aff1ac7598ec1223d65ca33d49c'/>
<id>urn:sha1:93976a20345b4aff1ac7598ec1223d65ca33d49c</id>
<content type='text'>
Having converted so much of the code base to software leaf entries, we can
mop up some remaining cases.

We replace is_pfn_swap_entry(), pfn_swap_entry_to_page(),
is_writable_device_private_entry(), is_device_exclusive_entry(),
is_migration_entry(), is_writable_migration_entry(),
is_readable_migration_entry(), swp_offset_pfn() and pfn_swap_entry_folio()
with softleaf equivalents.

No functional change intended.

Link: https://lkml.kernel.org/r/956bc9c031604811c0070d2f4bf2f1373f230213.1762812360.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Cc: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Cc: Alistair Popple &lt;apopple@nvidia.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Axel Rasmussen &lt;axelrasmussen@google.com&gt;
Cc: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Barry Song &lt;baohua@kernel.org&gt;
Cc: Byungchul Park &lt;byungchul@sk.com&gt;
Cc: Chengming Zhou &lt;chengming.zhou@linux.dev&gt;
Cc: Chris Li &lt;chrisl@kernel.org&gt;
Cc: Christian Borntraeger &lt;borntraeger@linux.ibm.com&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Dev Jain &lt;dev.jain@arm.com&gt;
Cc: Gerald Schaefer &lt;gerald.schaefer@linux.ibm.com&gt;
Cc: Gregory Price &lt;gourry@gourry.net&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: "Huang, Ying" &lt;ying.huang@linux.alibaba.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Janosch Frank &lt;frankja@linux.ibm.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Joshua Hahn &lt;joshua.hahnjy@gmail.com&gt;
Cc: Kairui Song &lt;kasong@tencent.com&gt;
Cc: Kemeng Shi &lt;shikemeng@huaweicloud.com&gt;
Cc: Lance Yang &lt;lance.yang@linux.dev&gt;
Cc: Leon Romanovsky &lt;leon@kernel.org&gt;
Cc: Liam Howlett &lt;liam.howlett@oracle.com&gt;
Cc: Mathew Brost &lt;matthew.brost@intel.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Mike Rapoport &lt;rppt@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Naoya Horiguchi &lt;nao.horiguchi@gmail.com&gt;
Cc: Nhat Pham &lt;nphamcs@gmail.com&gt;
Cc: Nico Pache &lt;npache@redhat.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Rakie Kim &lt;rakie.kim@sk.com&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Cc: SeongJae Park &lt;sj@kernel.org&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Wei Xu &lt;weixugc@google.com&gt;
Cc: xu xin &lt;xu.xin16@zte.com.cn&gt;
Cc: Yuanchu Xie &lt;yuanchu@google.com&gt;
Cc: Zi Yan &lt;ziy@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ksm: replace function unmerge_ksm_pages with break_ksm</title>
<updated>2025-11-17T01:28:28+00:00</updated>
<author>
<name>Pedro Demarchi Gomes</name>
<email>pedrodemargomes@gmail.com</email>
</author>
<published>2025-11-05T18:49:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=05c3fa9c9fa636b4e5856b0d86c3f194bbc804e4'/>
<id>urn:sha1:05c3fa9c9fa636b4e5856b0d86c3f194bbc804e4</id>
<content type='text'>
Function unmerge_ksm_pages() is unnecessary since now break_ksm() walks an
address range.  So replace it with break_ksm().

Link: https://lkml.kernel.org/r/20251105184912.186329-4-pedrodemargomes@gmail.com
Signed-off-by: Pedro Demarchi Gomes &lt;pedrodemargomes@gmail.com&gt;
Suggested-by: David Hildenbrand (Red Hat) &lt;david@kernel.org&gt;
Acked-by: David Hildenbrand (Red Hat) &lt;david@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ksm: perform a range-walk in break_ksm</title>
<updated>2025-11-17T01:28:28+00:00</updated>
<author>
<name>Pedro Demarchi Gomes</name>
<email>pedrodemargomes@gmail.com</email>
</author>
<published>2025-11-05T18:49:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5d4939fc2258e80e0eda2a8a190c9e4f78f52456'/>
<id>urn:sha1:5d4939fc2258e80e0eda2a8a190c9e4f78f52456</id>
<content type='text'>
Make break_ksm() receive an address range and change break_ksm_pmd_entry()
to perform a range-walk and return the address of the first ksm page
found.

This change allows break_ksm() to skip unmapped regions instead of
iterating every page address.  When unmerging large sparse VMAs, this
significantly reduces runtime.

In a benchmark unmerging a 32 TiB sparse virtual address space where only
one page was populated, the runtime dropped from 9 minutes to less then 5
seconds.

Link: https://lkml.kernel.org/r/20251105184912.186329-3-pedrodemargomes@gmail.com
Signed-off-by: Pedro Demarchi Gomes &lt;pedrodemargomes@gmail.com&gt;
Suggested-by: David Hildenbrand (Red Hat) &lt;david@kernel.org&gt;
Acked-by: David Hildenbrand (Red Hat) &lt;david@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Revert "mm/ksm: convert break_ksm() from walk_page_range_vma() to folio_walk"</title>
<updated>2025-11-17T01:28:28+00:00</updated>
<author>
<name>Pedro Demarchi Gomes</name>
<email>pedrodemargomes@gmail.com</email>
</author>
<published>2025-11-05T18:49:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=912aa825957f556a29d781c8f4cb4f4dfd938a9d'/>
<id>urn:sha1:912aa825957f556a29d781c8f4cb4f4dfd938a9d</id>
<content type='text'>
Patch series "ksm: perform a range-walk to jump over holes in break_ksm",
v4.

When unmerging an address range, unmerge_ksm_pages function walks every
page address in the specified range to locate ksm pages.  This becomes
highly inefficient when scanning large virtual memory areas that contain
mostly unmapped regions, causing the process to get blocked for several
minutes.

This patch makes break_ksm, function called by unmerge_ksm_pages for every
page in an address range, perform a range walk, allowing it to skip over
entire unmapped holes in a VMA, avoiding unnecessary lookups.

As pointed out by David Hildenbrand in [1], unmerge_ksm_pages() is called
from:

* ksm_madvise() through madvise(MADV_UNMERGEABLE).  There are not a lot
  of users of that function.

* __ksm_del_vma() through ksm_del_vmas().  Effectively called when
  disabling KSM for a process either through the sysctl or from s390x gmap
  code when enabling storage keys for a VM.

Consider the following test program which creates a 32 TiB mapping in the
virtual address space but only populates a single page:

#include &lt;unistd.h&gt;
#include &lt;stdio.h&gt;
#include &lt;sys/mman.h&gt;

/* 32 TiB */
const size_t size = 32ul * 1024 * 1024 * 1024 * 1024;

int main() {
        char *area = mmap(NULL, size, PROT_READ | PROT_WRITE,
                          MAP_NORESERVE | MAP_PRIVATE | MAP_ANON, -1, 0);

        if (area == MAP_FAILED) {
                perror("mmap() failed\n");
                return -1;
        }

        /* Populate a single page such that we get an anon_vma. */
        *area = 0;

        /* Enable KSM. */
        madvise(area, size, MADV_MERGEABLE);
        madvise(area, size, MADV_UNMERGEABLE);
        return 0;
}


Without this patch, this program takes 9 minutes to finish, while with
this patch it finishes in less then 5 seconds.


This patch (of 3):

This reverts commit e317a8d8b4f600fc7ec9725e26417030ee594f52 and changes
function break_ksm_pmd_entry() to use folios.

This reverts break_ksm() to use walk_page_range_vma() instead of
folio_walk_start().

Change break_ksm_pmd_entry() to call is_ksm_zero_pte() only if we know the
folio is present, and also rename variable ret to found.  This will make
it easier to later modify break_ksm() to perform a proper range walk.

Link: https://lkml.kernel.org/r/20251105184912.186329-1-pedrodemargomes@gmail.com
Link: https://lkml.kernel.org/r/20251105184912.186329-2-pedrodemargomes@gmail.com
Link: https://lore.kernel.org/linux-mm/e0886fdf-d198-4130-bd9a-be276c59da37@redhat.com/ [1]
Signed-off-by: Pedro Demarchi Gomes &lt;pedrodemargomes@gmail.com&gt;
Suggested-by: David Hildenbrand (Red Hat) &lt;david@kernel.org&gt;
Acked-by: David Hildenbrand (Red Hat) &lt;david@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/ksm: fix exec/fork inheritance support for prctl</title>
<updated>2025-11-17T01:27:55+00:00</updated>
<author>
<name>xu xin</name>
<email>xu.xin16@zte.com.cn</email>
</author>
<published>2025-10-07T10:28:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=590c03ca6a3fbb114396673314e2aa483839608b'/>
<id>urn:sha1:590c03ca6a3fbb114396673314e2aa483839608b</id>
<content type='text'>
Patch series "ksm: fix exec/fork inheritance", v2.

This series fixes exec/fork inheritance.  See the detailed description of
the issue below.


This patch (of 2):

Background
==========

commit d7597f59d1d33 ("mm: add new api to enable ksm per process")
introduced MMF_VM_MERGE_ANY for mm-&gt;flags, and allowed user to set it by
prctl() so that the process's VMAs are forcibly scanned by ksmd. 

Subsequently, the 3c6f33b7273a ("mm/ksm: support fork/exec for prctl")
supported inheriting the MMF_VM_MERGE_ANY flag when a task calls execve().

Finally, commit 3a9e567ca45fb ("mm/ksm: fix ksm exec support for prctl")
fixed the issue that ksmd doesn't scan the mm_struct with MMF_VM_MERGE_ANY
by adding the mm_slot to ksm_mm_head in __bprm_mm_init().

Problem
=======

In some extreme scenarios, however, this inheritance of MMF_VM_MERGE_ANY
during exec/fork can fail.  For example, when the scanning frequency of
ksmd is tuned extremely high, a process carrying MMF_VM_MERGE_ANY may
still fail to pass it to the newly exec'd process.  This happens because
ksm_execve() is executed too early in the do_execve flow (prematurely
adding the new mm_struct to the ksm_mm_slot list).

As a result, before do_execve completes, ksmd may have already performed a
scan and found that this new mm_struct has no VM_MERGEABLE VMAs, thus
clearing its MMF_VM_MERGE_ANY flag.  Consequently, when the new program
executes, the flag MMF_VM_MERGE_ANY inheritance missed.

Root reason
===========

commit d7597f59d1d33 ("mm: add new api to enable ksm per process") clear
the flag MMF_VM_MERGE_ANY when ksmd found no VM_MERGEABLE VMAs.

Solution
========

Firstly, Don't clear MMF_VM_MERGE_ANY when ksmd found no VM_MERGEABLE
VMAs, because perhaps their mm_struct has just been added to ksm_mm_slot
list, and its process has not yet officially started running or has not
yet performed mmap/brk to allocate anonymous VMAS.

Secondly, recheck MMF_VM_MERGEABLE again if a process takes
MMF_VM_MERGE_ANY, and create a mm_slot and join it into ksm_scan_list
again.

Link: https://lkml.kernel.org/r/20251007182504440BJgK8VXRHh8TD7IGSUIY4@zte.com.cn
Link: https://lkml.kernel.org/r/20251007182821572h_SoFqYZXEP1mvWI4n9VL@zte.com.cn
Fixes: 3c6f33b7273a ("mm/ksm: support fork/exec for prctl")
Fixes: d7597f59d1d3 ("mm: add new api to enable ksm per process")
Signed-off-by: xu xin &lt;xu.xin16@zte.com.cn&gt;
Cc: Stefan Roesch &lt;shr@devkernel.io&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Jinjiang Tu &lt;tujinjiang@huawei.com&gt;
Cc: Wang Yaxin &lt;wang.yaxin@zte.com.cn&gt;
Cc: Yang Yang &lt;yang.yang29@zte.com.cn&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
