<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/arch/s390/include/asm/pgalloc.h, branch v6.19.11</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-09-25T12:28:58+00:00</updated>
<entry>
<title>s390/mm: Add memory allocation profiling hooks</title>
<updated>2025-09-25T12:28:58+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2025-09-23T15:34:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=088bb10e37252034ec58a6152f20bfdc8a837f54'/>
<id>urn:sha1:088bb10e37252034ec58a6152f20bfdc8a837f54</id>
<content type='text'>
Similar to common code changes [1] add alloc_hook() wrappers to page table
allocation functions to allow for memory allocation profiling.

If CONFIG_MEM_ALLOC_PROFILING is enabled call sites of page table
allocations are accounted, instead of e.g. only crst_table_alloc() and
page_table_alloc(). This allows for slightly better profiling data, and the
output of /proc/allocinfo is similar to other architectures.

Without alloc_hook() wrappers the output of /proc/allocinfo looks like
this:

    17096704     4174 mm/memory.c:1061 func:folio_prealloc
    17809408     4348 mm/memory.c:1063 func:folio_prealloc
           0        0 mm/memory.c:4422 func:alloc_swap_folio
           0        0 mm/memory.c:4286 func:__alloc_swap_folio
           0        0 mm/memory.c:4971 func:alloc_anon_folio
         ...
     1589248       97 arch/s390/mm/pgalloc.c:25 func:crst_table_alloc
           0        0 arch/s390/mm/pgalloc.c:124 func:page_table_alloc_pgste
     4280320     1045 arch/s390/mm/pgalloc.c:149 func:page_table_alloc

With alloc_hook() wrappers:

     1097728      268 mm/memory.c:5147 func:__do_fault
    20119552     4912 mm/memory.c:1061 func:folio_prealloc
    17534976     4281 mm/memory.c:1063 func:folio_prealloc
           0        0 mm/memory.c:4422 func:alloc_swap_folio
           0        0 mm/memory.c:4286 func:__alloc_swap_folio
      786432      192 mm/memory.c:452 func:__pte_alloc
      405504       99 mm/memory.c:464 func:__pte_alloc_kernel
     1880064      459 mm/memory.c:5525 func:do_fault_around
           0        0 mm/memory.c:6403 func:__p4d_alloc
           0        0 mm/memory.c:6426 func:__pud_alloc
     1064960       65 mm/memory.c:6450 func:__pmd_alloc
           0        0 mm/memory.c:4971 func:alloc_anon_folio
           0        0 mm/memory.c:5233 func:do_set_pmd

[1] commit 2c321f3f70bc ("mm: change inlined allocation helpers to account at the call site")

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Acked-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</content>
</entry>
<entry>
<title>mm: pass mm down to pagetable_{pte,pmd}_ctor</title>
<updated>2025-05-12T00:48:21+00:00</updated>
<author>
<name>Kevin Brodsky</name>
<email>kevin.brodsky@arm.com</email>
</author>
<published>2025-04-08T09:52:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d82d3bf4115217bb3f43f9320bad5d68a35c278f'/>
<id>urn:sha1:d82d3bf4115217bb3f43f9320bad5d68a35c278f</id>
<content type='text'>
Patch series "Always call constructor for kernel page tables", v2.

There has been much confusion around exactly when page table
constructors/destructors (pagetable_*_[cd]tor) are supposed to be called. 
They were initially introduced for user PTEs only (to support split page
table locks), then at the PMD level for the same purpose.  Accounting was
added later on, starting at the PTE level and then moving to higher levels
(PMD, PUD).  Finally, with my earlier series "Account page tables at all
levels" [1], the ctor/dtor is run for all levels, all the way to PGD.

I thought this was the end of the story, and it hopefully is for user
pgtables, but I was wrong for what concerns kernel pgtables.  The current
situation there makes very little sense:

* At the PTE level, the ctor/dtor is not called (at least in the generic
  implementation).  Specific helpers are used for kernel pgtables at this
  level (pte_{alloc,free}_kernel()) and those have never called the
  ctor/dtor, most likely because they were initially irrelevant in the
  kernel case.

* At all other levels, the ctor/dtor is normally called.  This is
  potentially wasteful at the PMD level (more on that later).

This series aims to ensure that the ctor/dtor is always called for kernel
pgtables, as it already is for user pgtables.  Besides consistency, the
main motivation is to guarantee that ctor/dtor hooks are systematically
called; this makes it possible to insert hooks to protect page tables [2],
for instance.  There is however an extra challenge: split locks are not
used for kernel pgtables, and it would therefore be wasteful to initialise
them (ptlock_init()).

It is worth clarifying exactly when split locks are used.  They clearly
are for user pgtables, but as illustrated in commit 61444cde9170 ("ARM:
8591/1: mm: use fully constructed struct pages for EFI pgd allocations"),
they also are for special page tables like efi_mm.  The one case where
split locks are definitely unused is pgtables owned by init_mm; this is
consistent with the behaviour of apply_to_pte_range().

The approach chosen in this series is therefore to pass the mm associated
to the pgtables being constructed to pagetable_{pte,pmd}_ctor() (patch 1),
and skip ptlock_init() if mm == &amp;init_mm (patch 3 and 7).  This makes it
possible to call the PTE ctor/dtor from pte_{alloc,free}_kernel() without
unintended consequences (patch 3).  As a result the accounting functions
are now called at all levels for kernel pgtables, and split locks are
never initialised.

In configurations where ptlocks are dynamically allocated (32-bit,
PREEMPT_RT, etc.) and ARCH_ENABLE_SPLIT_PMD_PTLOCK is selected, this
series results in the removal of a kmem_cache allocation for every kernel
PMD.  Additionally, for certain architectures that do not use
&lt;asm-generic/pgalloc.h&gt; such as s390, the same optimisation occurs at the
PTE level.

===

Things get more complicated when it comes to special pgtable allocators
(patch 8-12).  All architectures need such allocators to create initial
kernel pgtables; we are not concerned with those as the ctor cannot be
called so early in the boot sequence.  However, those allocators may also
be used later in the boot sequence or during normal operations.  There are
two main use-cases:

1. Mapping EFI memory: efi_mm (arm, arm64, riscv)
2. arch_add_memory(): init_mm

The ctor is already explicitly run (at the PTE/PMD level) in the first
case, as required for pgtables that are not associated with init_mm. 
However the same allocators may also be used for the second use-case (or
others), and this is where it gets messy.  Patch 1 calls the ctor with
NULL as mm in those situations, as the actual mm isn't available. 
Practically this means that ptlocks will be unconditionally initialised. 
This is fine on arm - create_mapping_late() is only used for the EFI
mapping.  On arm64, __create_pgd_mapping() is also used by
arch_add_memory(); patch 8/9/11 ensure that ctors are called at all levels
with the appropriate mm.  The situation is similar on riscv, but
propagating the mm down to the ctor would require significant refactoring.
Since they are already called unconditionally, this series leaves riscv
no worse off - patch 10 adds comments to clarify the situation.

From a cursory look at other architectures implementing arch_add_memory(),
s390 and x86 may also need a similar treatment to add constructor calls. 
This is to be taken care of in a future version or as a follow-up.

===

The complications in those special pgtable allocators beg the question:
does it really make sense to treat efi_mm and init_mm differently in e.g. 
apply_to_pte_range()?  Maybe what we really need is a way to tell if an mm
corresponds to user memory or not, and never use split locks for non-user
mm's.  Feedback and suggestions welcome!


This patch (of 12):

In preparation for calling constructors for all kernel page tables while
eliding unnecessary ptlock initialisation, let's pass down the associated
mm to the PTE/PMD level ctors.  (These are the two levels where ptlocks
are used.)

In most cases the mm is already around at the point of calling the ctor so
we simply pass it down.  This is however not the case for special page
table allocators:

* arch/arm/mm/mmu.c
* arch/arm64/mm/mmu.c
* arch/riscv/mm/init.c

In those cases, the page tables being allocated are either for standard
kernel memory (init_mm) or special page directories, which may not be
associated to any mm.  For now let's pass NULL as mm; this will be refined
where possible in future patches.

No functional change in this patch.

Link: https://lore.kernel.org/linux-mm/20250103184415.2744423-1-kevin.brodsky@arm.com/ [1]
Link: https://lore.kernel.org/linux-hardening/20250203101839.1223008-1-kevin.brodsky@arm.com/ [2]
Link: https://lkml.kernel.org/r/20250408095222.860601-1-kevin.brodsky@arm.com
Link: https://lkml.kernel.org/r/20250408095222.860601-2-kevin.brodsky@arm.com
Signed-off-by: Kevin Brodsky &lt;kevin.brodsky@arm.com&gt;
Reviewed-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;	[s390]
Cc: Albert Ou &lt;aou@eecs.berkeley.edu&gt;
Cc: Andreas Larsson &lt;andreas@gaisler.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Kevin Brodsky &lt;kevin.brodsky@arm.com&gt;
Cc: Linus Waleij &lt;linus.walleij@linaro.org&gt;
Cc: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Mike Rapoport &lt;rppt@kernel.org&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: Paul Walmsley &lt;paul.walmsley@sifive.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Cc: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: Yang Shi &lt;yang@os.amperecomputing.com&gt;
Cc: &lt;x86@kernel.org&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>s390/sysctl: Remove "vm/allocate_pgste" sysctl</title>
<updated>2025-03-18T16:13:05+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2025-03-10T11:43:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=46ba4d0bfcc1cafa2336cab03c02124ba0da0552'/>
<id>urn:sha1:46ba4d0bfcc1cafa2336cab03c02124ba0da0552</id>
<content type='text'>
Remove the not needed "vm/allocate_pgste" sysctl. It has no effect
anymore. However this is a user space visible change. It shouldn't cause
any problems, however if it does this needs to be partially reverted.

Note that some distributions set
vm/allocate_pgste=1

in one of the various sysctl configuration files. Besides a warning about
the (now) non-existent procfs file this doesn't cause any problems.

Reviewed-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
</content>
</entry>
<entry>
<title>mm: introduce ctor/dtor at PGD level</title>
<updated>2025-01-26T04:22:24+00:00</updated>
<author>
<name>Kevin Brodsky</name>
<email>kevin.brodsky@arm.com</email>
</author>
<published>2025-01-03T18:44:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d95936a2267c11a38917d5fc7bf3862a64fe13d8'/>
<id>urn:sha1:d95936a2267c11a38917d5fc7bf3862a64fe13d8</id>
<content type='text'>
Following on from the introduction of P4D-level ctor/dtor, let's finish
the job and introduce ctor/dtor at PGD level.  The incurred improvement in
page accounting is minimal - the main motivation is to create a single,
generic place where construction/destruction hooks can be added for all
page table pages.

This patch should cover all architectures and all configurations where
PGDs are one or more regular pages.  This excludes any configuration where
PGDs are allocated from a kmem_cache object.

Link: https://lkml.kernel.org/r/20250103184415.2744423-7-kevin.brodsky@arm.com
Signed-off-by: Kevin Brodsky &lt;kevin.brodsky@arm.com&gt;
Acked-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Acked-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Linus Walleij &lt;linus.walleij@linaro.org&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: pgtable: introduce pagetable_dtor()</title>
<updated>2025-01-26T04:22:22+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2025-01-08T06:57:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=db6b435d731a8d82c38e558175db55466cb5832a'/>
<id>urn:sha1:db6b435d731a8d82c38e558175db55466cb5832a</id>
<content type='text'>
The pagetable_p*_dtor() are exactly the same except for the handling of
ptlock.  If we make ptlock_free() handle the case where ptdesc-&gt;ptl is
NULL and remove VM_BUG_ON_PAGE() from pmd_ptlock_free(), we can unify
pagetable_p*_dtor() into one function.  Let's introduce pagetable_dtor()
to do this.

Later, pagetable_dtor() will be moved to tlb_remove_ptdesc(), so that
ptlock and page table pages can be freed together (regardless of whether
RCU is used).  This prevents the use-after-free problem where the ptlock
is freed immediately but the page table pages is freed later via RCU.

Link: https://lkml.kernel.org/r/47f44fff9dc68d9d9e9a0d6c036df275f820598a.1736317725.git.zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Originally-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Kevin Brodsky &lt;kevin.brodsky@arm.com&gt;
Acked-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;	[s390]
Cc: Alexandre Ghiti &lt;alex@ghiti.fr&gt;
Cc: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
Cc: Andreas Larsson &lt;andreas@gaisler.com&gt;
Cc: Aneesh Kumar K.V (Arm) &lt;aneesh.kumar@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vishal Moola (Oracle) &lt;vishal.moola@gmail.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: Yu Zhao &lt;yuzhao@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>s390: pgtable: add statistics for PUD and P4D level page table</title>
<updated>2025-01-26T04:22:22+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2025-01-08T06:57:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b7dcd539bfc7301b56fdaa6ebbb23752b670ee81'/>
<id>urn:sha1:b7dcd539bfc7301b56fdaa6ebbb23752b670ee81</id>
<content type='text'>
Like PMD and PTE level page table, also add statistics for PUD and P4D
page table.

Link: https://lkml.kernel.org/r/4707dffce228ccec5c6662810566dd12b5741c4b.1736317725.git.zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Suggested-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Kevin Brodsky &lt;kevin.brodsky@arm.com&gt;
Acked-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Cc: Alexandre Ghiti &lt;alex@ghiti.fr&gt;
Cc: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
Cc: Andreas Larsson &lt;andreas@gaisler.com&gt;
Cc: Aneesh Kumar K.V (Arm) &lt;aneesh.kumar@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vishal Moola (Oracle) &lt;vishal.moola@gmail.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: Yu Zhao &lt;yuzhao@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>s390: supplement for ptdesc conversion</title>
<updated>2024-03-06T21:04:18+00:00</updated>
<author>
<name>Qi Zheng</name>
<email>zhengqi.arch@bytedance.com</email>
</author>
<published>2024-03-04T11:07:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=57b77b75caf02c7f1784ba5f9ea55275bd9a27b4'/>
<id>urn:sha1:57b77b75caf02c7f1784ba5f9ea55275bd9a27b4</id>
<content type='text'>
After commit 6326c26c1514 ("s390: convert various pgalloc functions to use
ptdescs"), there are still some positions that use page-&gt;{lru, index}
instead of ptdesc-&gt;{pt_list, pt_index}.  In order to make the use of
ptdesc-&gt;{pt_list, pt_index} clearer, it would be better to convert them as
well.

[zhengqi.arch@bytedance.com: fix build failure]
  Link: https://lkml.kernel.org/r/20240305072154.26168-1-zhengqi.arch@bytedance.com
Link: https://lkml.kernel.org/r/04beaf3255056ffe131a5ea595736066c1e84756.1709541697.git.zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@linux.ibm.com&gt;
Cc: Janosch Frank &lt;frankja@linux.ibm.com&gt;
Cc: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Mike Rapoport (IBM) &lt;rppt@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Vishal Moola (Oracle) &lt;vishal.moola@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>s390/mm: make pte_free_tlb() similar to pXd_free_tlb()</title>
<updated>2023-11-05T21:34:58+00:00</updated>
<author>
<name>Alexander Gordeev</name>
<email>agordeev@linux.ibm.com</email>
</author>
<published>2023-11-03T15:43:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=02e790ee3077c0571794d0ab8f71413edbe129cc'/>
<id>urn:sha1:02e790ee3077c0571794d0ab8f71413edbe129cc</id>
<content type='text'>
Make pte_free_tlb() look similar to pXd_free_tlb() family
functions.

Reviewed-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Reviewed-by: Gerald Schaefer &lt;gerald.schaefer@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
</content>
</entry>
<entry>
<title>s390: convert various pgalloc functions to use ptdescs</title>
<updated>2023-08-21T20:37:54+00:00</updated>
<author>
<name>Vishal Moola (Oracle)</name>
<email>vishal.moola@gmail.com</email>
</author>
<published>2023-08-07T23:04:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6326c26c1514757242829b292b26eac589013200'/>
<id>urn:sha1:6326c26c1514757242829b292b26eac589013200</id>
<content type='text'>
As part of the conversions to replace pgtable constructor/destructors with
ptdesc equivalents, convert various page table functions to use ptdescs.

Some of the functions use the *get*page*() helper functions.  Convert
these to use pagetable_alloc() and ptdesc_address() instead to help
standardize page tables further.

Link: https://lkml.kernel.org/r/20230807230513.102486-15-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) &lt;vishal.moola@gmail.com&gt;
Acked-by: Mike Rapoport (IBM) &lt;rppt@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Cc: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Dinh Nguyen &lt;dinguyen@kernel.org&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Geert Uytterhoeven &lt;geert+renesas@glider.be&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Huacai Chen &lt;chenhuacai@kernel.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: John Paul Adrian Glaubitz &lt;glaubitz@physik.fu-berlin.de&gt;
Cc: Jonas Bonn &lt;jonas@southpole.se&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
Cc: Paul Walmsley &lt;paul.walmsley@sifive.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt;
Cc: Yoshinori Sato &lt;ysato@users.sourceforge.jp&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>s390: add pte_free_defer() for pgtables sharing page</title>
<updated>2023-08-18T17:12:24+00:00</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2023-07-12T04:38:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8211dad6279817a8966ff6b74c2c588dd4166f45'/>
<id>urn:sha1:8211dad6279817a8966ff6b74c2c588dd4166f45</id>
<content type='text'>
Add s390-specific pte_free_defer(), to free table page via call_rcu(). 
pte_free_defer() will be called inside khugepaged's retract_page_tables()
loop, where allocating extra memory cannot be relied upon.  This precedes
the generic version to avoid build breakage from incompatible pgtable_t.

This version is more complicated than others: because s390 fits two 2K
page tables into one 4K page (so page-&gt;rcu_head must be shared between
both halves), and already uses page-&gt;lru (which page-&gt;rcu_head overlays)
to list any free halves; with clever management by page-&gt;_refcount bits.

Build upon the existing management, adjusted to follow a new rule: that a
page is never on the free list if pte_free_defer() was used on either half
(marked by PageActive).  And for simplicity, delay calling RCU until both
halves are freed.

Not adding back unallocated fragments to the list in pte_free_defer() can
result in wasting some amount of memory for pagetables, depending on how
long the allocated fragment will stay in use.  In practice, this effect is
expected to be insignificant, and not justify a far more complex approach,
which might allow to add the fragments back later in __tlb_remove_table(),
where we might not have a stable mm any more.

[hughd@google.com: Claudio finds warning on mm_has_pgste() more useful than on mm_alloc_pgste()]
  Link: https://lkml.kernel.org/r/3bc095ba-a180-ce3b-82b1-2bfc64612f3@google.com
Link: https://lkml.kernel.org/r/94eccf5f-264c-8abe-4567-e77f4b4e14a@google.com
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Reviewed-by: Gerald Schaefer &lt;gerald.schaefer@linux.ibm.com&gt;
Tested-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Acked-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Cc: Alistair Popple &lt;apopple@nvidia.com&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Cc: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Cc: Axel Rasmussen &lt;axelrasmussen@google.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@linux.ibm.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: Huang, Ying &lt;ying.huang@intel.com&gt;
Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Lorenzo Stoakes &lt;lstoakes@gmail.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Mike Rapoport (IBM) &lt;rppt@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Cc: Pavel Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Cc: Ralph Campbell &lt;rcampbell@nvidia.com&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: SeongJae Park &lt;sj@kernel.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Steven Price &lt;steven.price@arm.com&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Vishal Moola (Oracle) &lt;vishal.moola@gmail.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: Yu Zhao &lt;yuzhao@google.com&gt;
Cc: Zack Rusin &lt;zackr@vmware.com&gt;
Cc: Zi Yan &lt;ziy@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
