<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/mmzone.h, branch linux-6.9.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-6.9.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-6.9.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-07-18T11:22:35+00:00</updated>
<entry>
<title>mm: prevent derefencing NULL ptr in pfn_section_valid()</title>
<updated>2024-07-18T11:22:35+00:00</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2024-06-26T00:16:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=adccdf702b4ea913ded5ff512239e382d7473b63'/>
<id>urn:sha1:adccdf702b4ea913ded5ff512239e382d7473b63</id>
<content type='text'>
[ Upstream commit 82f0b6f041fad768c28b4ad05a683065412c226e ]

Commit 5ec8e8ea8b77 ("mm/sparsemem: fix race in accessing
memory_section-&gt;usage") changed pfn_section_valid() to add a READ_ONCE()
call around "ms-&gt;usage" to fix a race with section_deactivate() where
ms-&gt;usage can be cleared.  The READ_ONCE() call, by itself, is not enough
to prevent NULL pointer dereference.  We need to check its value before
dereferencing it.

Link: https://lkml.kernel.org/r/20240626001639.1350646-1-longman@redhat.com
Fixes: 5ec8e8ea8b77 ("mm/sparsemem: fix race in accessing memory_section-&gt;usage")
Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Cc: Charan Teja Kalla &lt;quic_charante@quicinc.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm/page_alloc: Separate THP PCP into movable and non-movable categories</title>
<updated>2024-07-05T07:38:17+00:00</updated>
<author>
<name>yangge</name>
<email>yangge1116@126.com</email>
</author>
<published>2024-06-20T00:59:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=68b460e134eb1983826aeb5c889fe43a98203ee7'/>
<id>urn:sha1:68b460e134eb1983826aeb5c889fe43a98203ee7</id>
<content type='text'>
commit bf14ed81f571f8dba31cd72ab2e50fbcc877cc31 upstream.

Since commit 5d0a661d808f ("mm/page_alloc: use only one PCP list for
THP-sized allocations") no longer differentiates the migration type of
pages in THP-sized PCP list, it's possible that non-movable allocation
requests may get a CMA page from the list, in some cases, it's not
acceptable.

If a large number of CMA memory are configured in system (for example, the
CMA memory accounts for 50% of the system memory), starting a virtual
machine with device passthrough will get stuck.  During starting the
virtual machine, it will call pin_user_pages_remote(..., FOLL_LONGTERM,
...) to pin memory.  Normally if a page is present and in CMA area,
pin_user_pages_remote() will migrate the page from CMA area to non-CMA
area because of FOLL_LONGTERM flag.  But if non-movable allocation
requests return CMA memory, migrate_longterm_unpinnable_pages() will
migrate a CMA page to another CMA page, which will fail to pass the check
in check_and_migrate_movable_pages() and cause migration endless.

Call trace:
pin_user_pages_remote
--__gup_longterm_locked // endless loops in this function
----_get_user_pages_locked
----check_and_migrate_movable_pages
------migrate_longterm_unpinnable_pages
--------alloc_migration_target

This problem will also have a negative impact on CMA itself.  For example,
when CMA is borrowed by THP, and we need to reclaim it through cma_alloc()
or dma_alloc_coherent(), we must move those pages out to ensure CMA's
users can retrieve that contigous memory.  Currently, CMA's memory is
occupied by non-movable pages, meaning we can't relocate them.  As a
result, cma_alloc() is more likely to fail.

To fix the problem above, we add one PCP list for THP, which will not
introduce a new cacheline for struct per_cpu_pages.  THP will have 2 PCP
lists, one PCP list is used by MOVABLE allocation, and the other PCP list
is used by UNMOVABLE allocation.  MOVABLE allocation contains GPF_MOVABLE,
and UNMOVABLE allocation contains GFP_UNMOVABLE and GFP_RECLAIMABLE.

Link: https://lkml.kernel.org/r/1718845190-4456-1-git-send-email-yangge1116@126.com
Fixes: 5d0a661d808f ("mm/page_alloc: use only one PCP list for THP-sized allocations")
Signed-off-by: yangge &lt;yangge1116@126.com&gt;
Cc: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Cc: Barry Song &lt;21cnbao@gmail.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm: add __dump_folio()</title>
<updated>2024-03-06T21:04:18+00:00</updated>
<author>
<name>Matthew Wilcox (Oracle)</name>
<email>willy@infradead.org</email>
</author>
<published>2024-02-27T19:23:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fae7d834c43ccdb9fcecaf4d0f33145d884b3e5c'/>
<id>urn:sha1:fae7d834c43ccdb9fcecaf4d0f33145d884b3e5c</id>
<content type='text'>
Turn __dump_page() into a wrapper around __dump_folio().  Snapshot the
page &amp; folio into a stack variable so we don't hit BUG_ON() if an
allocation is freed under us and what was a folio pointer becomes a
pointer to a tail page.

[willy@infradead.org: fix build issue]
  Link: https://lkml.kernel.org/r/ZeAKCyTn_xS3O9cE@casper.infradead.org
[willy@infradead.org: fix __dump_folio]
  Link: https://lkml.kernel.org/r/ZeJJegP8zM7S9GTy@casper.infradead.org
[willy@infradead.org: fix pointer confusion]
  Link: https://lkml.kernel.org/r/ZeYa00ixxC4k1ot-@casper.infradead.org
[akpm@linux-foundation.org: s/printk/pr_warn/]
Link: https://lkml.kernel.org/r/20240227192337.757313-5-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/mglru: improve struct lru_gen_mm_walk</title>
<updated>2024-02-22T18:24:58+00:00</updated>
<author>
<name>Kinsey Ho</name>
<email>kinseyho@google.com</email>
</author>
<published>2024-02-14T06:05:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cc25bbe10a86a7fe3ec8468fb01e579e6216d7e1'/>
<id>urn:sha1:cc25bbe10a86a7fe3ec8468fb01e579e6216d7e1</id>
<content type='text'>
Rename max_seq to seq in struct lru_gen_mm_walk to keep consistent with
struct lru_gen_mm_state.  Note that seq is not always up to date with
max_seq from lru_gen_folio.

No functional changes.

Link: https://lkml.kernel.org/r/20240214060538.3524462-5-kinseyho@google.com
Signed-off-by: Kinsey Ho &lt;kinseyho@google.com&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Cc: Donet Tom &lt;donettom@linux.vnet.ibm.com&gt;
Cc: Yu Zhao &lt;yuzhao@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, kmsan: fix infinite recursion due to RCU critical section</title>
<updated>2024-01-26T07:52:21+00:00</updated>
<author>
<name>Marco Elver</name>
<email>elver@google.com</email>
</author>
<published>2024-01-18T10:59:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f6564fce256a3944aa1bc76cb3c40e792d97c1eb'/>
<id>urn:sha1:f6564fce256a3944aa1bc76cb3c40e792d97c1eb</id>
<content type='text'>
Alexander Potapenko writes in [1]: "For every memory access in the code
instrumented by KMSAN we call kmsan_get_metadata() to obtain the metadata
for the memory being accessed.  For virtual memory the metadata pointers
are stored in the corresponding `struct page`, therefore we need to call
virt_to_page() to get them.

According to the comment in arch/x86/include/asm/page.h,
virt_to_page(kaddr) returns a valid pointer iff virt_addr_valid(kaddr) is
true, so KMSAN needs to call virt_addr_valid() as well.

To avoid recursion, kmsan_get_metadata() must not call instrumented code,
therefore ./arch/x86/include/asm/kmsan.h forks parts of
arch/x86/mm/physaddr.c to check whether a virtual address is valid or not.

But the introduction of rcu_read_lock() to pfn_valid() added instrumented
RCU API calls to virt_to_page_or_null(), which is called by
kmsan_get_metadata(), so there is an infinite recursion now.  I do not
think it is correct to stop that recursion by doing
kmsan_enter_runtime()/kmsan_exit_runtime() in kmsan_get_metadata(): that
would prevent instrumented functions called from within the runtime from
tracking the shadow values, which might introduce false positives."

Fix the issue by switching pfn_valid() to the _sched() variant of
rcu_read_lock/unlock(), which does not require calling into RCU.  Given
the critical section in pfn_valid() is very small, this is a reasonable
trade-off (with preemptible RCU).

KMSAN further needs to be careful to suppress calls into the scheduler,
which would be another source of recursion.  This can be done by wrapping
the call to pfn_valid() into preempt_disable/enable_no_resched().  The
downside is that this sacrifices breaking scheduling guarantees; however,
a kernel compiled with KMSAN has already given up any performance
guarantees due to being heavily instrumented.

Note, KMSAN code already disables tracing via Makefile, and since mmzone.h
is included, it is not necessary to use the notrace variant, which is
generally preferred in all other cases.

Link: https://lkml.kernel.org/r/20240115184430.2710652-1-glider@google.com [1]
Link: https://lkml.kernel.org/r/20240118110022.2538350-1-elver@google.com
Fixes: 5ec8e8ea8b77 ("mm/sparsemem: fix race in accessing memory_section-&gt;usage")
Signed-off-by: Marco Elver &lt;elver@google.com&gt;
Reported-by: Alexander Potapenko &lt;glider@google.com&gt;
Reported-by: syzbot+93a9e8a3dea8d6085e12@syzkaller.appspotmail.com
Reviewed-by: Alexander Potapenko &lt;glider@google.com&gt;
Tested-by: Alexander Potapenko &lt;glider@google.com&gt;
Cc: Charan Teja Kalla &lt;quic_charante@quicinc.com&gt;
Cc: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER</title>
<updated>2024-01-08T23:27:15+00:00</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2023-12-28T14:47:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5e0a760b44417f7cadd79de2204d6247109558a0'/>
<id>urn:sha1:5e0a760b44417f7cadd79de2204d6247109558a0</id>
<content type='text'>
commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has
changed the definition of MAX_ORDER to be inclusive.  This has caused
issues with code that was not yet upstream and depended on the previous
definition.

To draw attention to the altered meaning of the define, rename MAX_ORDER
to MAX_PAGE_ORDER.

Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, treewide: introduce NR_PAGE_ORDERS</title>
<updated>2024-01-08T23:27:15+00:00</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2023-12-28T14:47:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fd37721803c6e73619108f76ad2e12a9aa5fafaf'/>
<id>urn:sha1:fd37721803c6e73619108f76ad2e12a9aa5fafaf</id>
<content type='text'>
NR_PAGE_ORDERS defines the number of page orders supported by the page
allocator, ranging from 0 to MAX_ORDER, MAX_ORDER + 1 in total.

NR_PAGE_ORDERS assists in defining arrays of page orders and allows for
more natural iteration over them.

[kirill.shutemov@linux.intel.com: fixup for kerneldoc warning]
  Link: https://lkml.kernel.org/r/20240101111512.7empzyifq7kxtzk3@box
Link: https://lkml.kernel.org/r/20231228144704.14033-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Reviewed-by: Zi Yan &lt;ziy@nvidia.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING</title>
<updated>2024-01-05T18:17:47+00:00</updated>
<author>
<name>Li Zhijian</name>
<email>lizhijian@fujitsu.com</email>
</author>
<published>2023-12-29T02:26:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b805ab3c6935d14654ccc28f16ffce7a13c2c528'/>
<id>urn:sha1:b805ab3c6935d14654ccc28f16ffce7a13c2c528</id>
<content type='text'>
Demotion can work well without CONFIG_NUMA_BALANCING.  But the commit
23e9f0138963 ("mm/vmstat: move pgdemote_* to per-node stats") wrongly hid
it behind CONFIG_NUMA_BALANCING.

Fix it by moving them out of CONFIG_NUMA_BALANCING.

Link: https://lkml.kernel.org/r/20231229022651.3229174-1-lizhijian@fujitsu.com
Fixes: 23e9f0138963 ("mm/vmstat: move pgdemote_* to per-node stats")
Signed-off-by: Li Zhijian &lt;lizhijian@fujitsu.com&gt;
Cc: "Huang, Ying" &lt;ying.huang@intel.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: "Rafael J. Wysocki" &lt;rafael@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/mglru: remove CONFIG_MEMCG</title>
<updated>2024-01-05T18:17:44+00:00</updated>
<author>
<name>Kinsey Ho</name>
<email>kinseyho@google.com</email>
</author>
<published>2023-12-27T14:12:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=745b13e647cd119e70d16b57698e12b7c86ca264'/>
<id>urn:sha1:745b13e647cd119e70d16b57698e12b7c86ca264</id>
<content type='text'>
Remove CONFIG_MEMCG in a refactoring to improve code readability at
the cost of a few bytes in struct lru_gen_folio per node when
CONFIG_MEMCG=n.

Link: https://lkml.kernel.org/r/20231227141205.2200125-4-kinseyho@google.com
Signed-off-by: Kinsey Ho &lt;kinseyho@google.com&gt;
Co-developed-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Tested-by: Donet Tom &lt;donettom@linux.vnet.ibm.com&gt;
Acked-by: Yu Zhao &lt;yuzhao@google.com&gt;
Cc: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/mglru: add CONFIG_LRU_GEN_WALKS_MMU</title>
<updated>2024-01-05T18:17:44+00:00</updated>
<author>
<name>Kinsey Ho</name>
<email>kinseyho@google.com</email>
</author>
<published>2023-12-27T14:12:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=61dd3f246b3adaabff3241c586f2210ac91b05a4'/>
<id>urn:sha1:61dd3f246b3adaabff3241c586f2210ac91b05a4</id>
<content type='text'>
Add CONFIG_LRU_GEN_WALKS_MMU such that if disabled, the code that
walks page tables to promote pages into the youngest generation will
not be built.

Also improves code readability by adding two helper functions
get_mm_state() and get_next_mm().

Link: https://lkml.kernel.org/r/20231227141205.2200125-3-kinseyho@google.com
Signed-off-by: Kinsey Ho &lt;kinseyho@google.com&gt;
Co-developed-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Tested-by: Donet Tom &lt;donettom@linux.vnet.ibm.com&gt;
Acked-by: Yu Zhao &lt;yuzhao@google.com&gt;
Cc: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
