<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/mm/page_ext.c, branch v6.1.110</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.110</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.110'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2022-11-23T02:50:41+00:00</updated>
<entry>
<title>mm/page_exit: fix kernel doc warning in page_ext_put()</title>
<updated>2022-11-23T02:50:41+00:00</updated>
<author>
<name>Charan Teja Kalla</name>
<email>quic_charante@quicinc.com</email>
</author>
<published>2022-11-08T05:16:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ed86b74874f839f0e579bdf92ea0a5aabdfabebb'/>
<id>urn:sha1:ed86b74874f839f0e579bdf92ea0a5aabdfabebb</id>
<content type='text'>
Fix the below compiler warnings reported with 'make W=1 mm/'. 
mm/page_ext.c:178: warning: Function parameter or member 'page_ext' not
described in 'page_ext_put'.

[quic_pkondeti@quicinc.com: better patch title]
Link: https://lkml.kernel.org/r/1667884582-2465-1-git-send-email-quic_charante@quicinc.com
Fixes: b1d5488a252dc9 ("mm: fix use-after free of page_ext after race with memory-offline")
Signed-off-by: Charan Teja Kalla &lt;quic_charante@quicinc.com&gt;
Reported-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Tested-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Pavan Kondeti &lt;quic_pkondeti@quicinc.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>page_ext: introduce boot parameter 'early_page_ext'</title>
<updated>2022-09-12T03:26:02+00:00</updated>
<author>
<name>Li Zhe</name>
<email>lizhe.67@bytedance.com</email>
</author>
<published>2022-08-25T10:27:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c4f20f1479c456d9dd1c1e6d8bf956a25de742dc'/>
<id>urn:sha1:c4f20f1479c456d9dd1c1e6d8bf956a25de742dc</id>
<content type='text'>
In commit 2f1ee0913ce5 ("Revert "mm: use early_pfn_to_nid in
page_ext_init""), we call page_ext_init() after page_alloc_init_late() to
avoid some panic problem.  It seems that we cannot track early page
allocations in current kernel even if page structure has been initialized
early.

This patch introduces a new boot parameter 'early_page_ext' to resolve
this problem.  If we pass it to the kernel, page_ext_init() will be moved
up and the feature 'deferred initialization of struct pages' will be
disabled to initialize the page allocator early and prevent the panic
problem above.  It can help us to catch early page allocations.  This is
useful especially when we find that the free memory value is not the same
right after different kernel booting.

[akpm@linux-foundation.org: fix section issue by removing __meminitdata]
Link: https://lkml.kernel.org/r/20220825102714.669-1-lizhe.67@bytedance.com
Signed-off-by: Li Zhe &lt;lizhe.67@bytedance.com&gt;
Suggested-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Mark-PK Tsai &lt;mark-pk.tsai@mediatek.com&gt;
Cc: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix use-after free of page_ext after race with memory-offline</title>
<updated>2022-09-12T03:25:57+00:00</updated>
<author>
<name>Charan Teja Kalla</name>
<email>quic_charante@quicinc.com</email>
</author>
<published>2022-08-18T13:50:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b1d5488a252dc9c0d9574100d0b8d807bf154603'/>
<id>urn:sha1:b1d5488a252dc9c0d9574100d0b8d807bf154603</id>
<content type='text'>
The below is one path where race between page_ext and offline of the
respective memory blocks will cause use-after-free on the access of
page_ext structure.

process1		              process2
---------                             ---------
a)doing /proc/page_owner           doing memory offline
			           through offline_pages.

b) PageBuddy check is failed
   thus proceed to get the
   page_owner information
   through page_ext access.
page_ext = lookup_page_ext(page);

				    migrate_pages();
				    .................
				Since all pages are successfully
				migrated as part of the offline
				operation,send MEM_OFFLINE notification
				where for page_ext it calls:
				offline_page_ext()--&gt;
				__free_page_ext()--&gt;
				   free_page_ext()--&gt;
				     vfree(ms-&gt;page_ext)
			           mem_section-&gt;page_ext = NULL

c) Check for the PAGE_EXT
   flags in the page_ext-&gt;flags
   access results into the
   use-after-free (leading to
   the translation faults).

As mentioned above, there is really no synchronization between page_ext
access and its freeing in the memory_offline.

The memory offline steps(roughly) on a memory block is as below:

1) Isolate all the pages

2) while(1)
  try free the pages to buddy.(-&gt;free_list[MIGRATE_ISOLATE])

3) delete the pages from this buddy list.

4) Then free page_ext.(Note: The struct page is still alive as it is
   freed only during hot remove of the memory which frees the memmap,
   which steps the user might not perform).

This design leads to the state where struct page is alive but the struct
page_ext is freed, where the later is ideally part of the former which
just representing the page_flags (check [3] for why this design is
chosen).

The abovementioned race is just one example __but the problem persists in
the other paths too involving page_ext-&gt;flags access(eg:
page_is_idle())__.

Fix all the paths where offline races with page_ext access by maintaining
synchronization with rcu lock and is achieved in 3 steps:

1) Invalidate all the page_ext's of the sections of a memory block by
   storing a flag in the LSB of mem_section-&gt;page_ext.

2) Wait until all the existing readers to finish working with the
   -&gt;page_ext's with synchronize_rcu().  Any parallel process that starts
   after this call will not get page_ext, through lookup_page_ext(), for
   the block parallel offline operation is being performed.

3) Now safely free all sections -&gt;page_ext's of the block on which
   offline operation is being performed.

Note: If synchronize_rcu() takes time then optimizations can be done in
this path through call_rcu()[2].

Thanks to David Hildenbrand for his views/suggestions on the initial
discussion[1] and Pavan kondeti for various inputs on this patch.

[1] https://lore.kernel.org/linux-mm/59edde13-4167-8550-86f0-11fc67882107@quicinc.com/
[2] https://lore.kernel.org/all/a26ce299-aed1-b8ad-711e-a49e82bdd180@quicinc.com/T/#u
[3] https://lore.kernel.org/all/6fa6b7aa-731e-891c-3efb-a03d6a700efa@redhat.com/

[quic_charante@quicinc.com: rename label `loop' to `ext_put_continue' per David]
  Link: https://lkml.kernel.org/r/1661496993-11473-1-git-send-email-quic_charante@quicinc.com
Link: https://lkml.kernel.org/r/1660830600-9068-1-git-send-email-quic_charante@quicinc.com
Signed-off-by: Charan Teja Kalla &lt;quic_charante@quicinc.com&gt;
Suggested-by: David Hildenbrand &lt;david@redhat.com&gt;
Suggested-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Fernand Sieber &lt;sieberf@amazon.com&gt;
Cc: Minchan Kim &lt;minchan@google.com&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Pavan Kondeti &lt;quic_pkondeti@quicinc.com&gt;
Cc: SeongJae Park &lt;sjpark@amazon.de&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: William Kucharski &lt;william.kucharski@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/page_ext: remove unused variable in offline_page_ext</title>
<updated>2022-09-12T03:25:47+00:00</updated>
<author>
<name>Charan Teja Kalla</name>
<email>quic_charante@quicinc.com</email>
</author>
<published>2022-08-01T05:06:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7b5a0b664ebe2625965a0fdba2614c88c4b9bbc6'/>
<id>urn:sha1:7b5a0b664ebe2625965a0fdba2614c88c4b9bbc6</id>
<content type='text'>
Remove unused variable 'nid' in offline_page_ext().  This is not used
since the page_ext code inception.

Link: https://lkml.kernel.org/r/1659330397-11817-1-git-send-email-quic_charante@quicinc.com
Signed-off-by: Charan Teja Kalla &lt;quic_charante@quicinc.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Reviewed-by: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Pavan Kondeti &lt;quic_pkondeti@quicinc.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: use for_each_online_node and node_online instead of open coding</title>
<updated>2022-04-29T21:36:58+00:00</updated>
<author>
<name>Peng Liu</name>
<email>liupeng256@huawei.com</email>
</author>
<published>2022-04-29T21:36:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=30a514002db23fd630e3e52a2bdfb05c0de03378'/>
<id>urn:sha1:30a514002db23fd630e3e52a2bdfb05c0de03378</id>
<content type='text'>
Use more generic functions to deal with issues related to online nodes. 
The changes will make the code simplified.

Link: https://lkml.kernel.org/r/20220429030218.644635-1-liupeng256@huawei.com
Signed-off-by: Peng Liu &lt;liupeng256@huawei.com&gt;
Suggested-by: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Suggested-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: make some vars and functions static or __init</title>
<updated>2022-01-15T14:30:31+00:00</updated>
<author>
<name>Ting Liu</name>
<email>liuting.0x7c00@bytedance.com</email>
</author>
<published>2022-01-14T22:09:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cab0a7c115546a4865fb7439558af9077a569574'/>
<id>urn:sha1:cab0a7c115546a4865fb7439558af9077a569574</id>
<content type='text'>
"page_idle_ops" as a global var, but its scope of use within this
document.  So it should be static.

"page_ext_ops" is a var used in the kernel initial phase.  And other
functions are aslo used in the kernel initial phase.  So they should be
__init or __initdata to reclaim memory.

Link: https://lkml.kernel.org/r/20211217095023.67293-1-liuting.0x7c00@bytedance.com
Signed-off-by: Ting Liu &lt;liuting.0x7c00@bytedance.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: page table check</title>
<updated>2022-01-15T14:30:28+00:00</updated>
<author>
<name>Pasha Tatashin</name>
<email>pasha.tatashin@soleen.com</email>
</author>
<published>2022-01-14T22:06:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=df4e817b710809425d899340dbfa8504a3ca4ba5'/>
<id>urn:sha1:df4e817b710809425d899340dbfa8504a3ca4ba5</id>
<content type='text'>
Check user page table entries at the time they are added and removed.

Allows to synchronously catch memory corruption issues related to double
mapping.

When a pte for an anonymous page is added into page table, we verify
that this pte does not already point to a file backed page, and vice
versa if this is a file backed page that is being added we verify that
this page does not have an anonymous mapping

We also enforce that read-only sharing for anonymous pages is allowed
(i.e.  cow after fork).  All other sharing must be for file pages.

Page table check allows to protect and debug cases where "struct page"
metadata became corrupted for some reason.  For example, when refcnt or
mapcount become invalid.

Link: https://lkml.kernel.org/r/20211221154650.1047963-4-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Cc: Greg Thelen &lt;gthelen@google.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jiri Slaby &lt;jirislaby@kernel.org&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Cc: Mike Rapoport &lt;rppt@kernel.org&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Wei Xu &lt;weixugc@google.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/page_ext.c: fix a comment</title>
<updated>2021-11-06T20:30:34+00:00</updated>
<author>
<name>Yinan Zhang</name>
<email>zhangyinan2019@email.szu.edu.cn</email>
</author>
<published>2021-11-05T20:36:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d1fea155ee3d90fb3adcb3d6c42751860be3b158'/>
<id>urn:sha1:d1fea155ee3d90fb3adcb3d6c42751860be3b158</id>
<content type='text'>
I have noticed that the previous macro is #ifndef CONFIG_SPARSEMEM.  I
think the comment of #else should be CONFIG_SPARSEMEM.

Link: https://lkml.kernel.org/r/20211008140312.6492-1-zhangyinan2019@email.szu.edu.cn
Signed-off-by: Yinan Zhang &lt;zhangyinan2019@email.szu.edu.cn&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/migrate: add CPU hotplug to demotion #ifdef</title>
<updated>2021-10-19T06:22:02+00:00</updated>
<author>
<name>Dave Hansen</name>
<email>dave.hansen@linux.intel.com</email>
</author>
<published>2021-10-18T22:15:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=76af6a054da4055305ddb28c5eb151b9ee4f74f9'/>
<id>urn:sha1:76af6a054da4055305ddb28c5eb151b9ee4f74f9</id>
<content type='text'>
Once upon a time, the node demotion updates were driven solely by memory
hotplug events.  But now, there are handlers for both CPU and memory
hotplug.

However, the #ifdef around the code checks only memory hotplug.  A
system that has HOTPLUG_CPU=y but MEMORY_HOTPLUG=n would miss CPU
hotplug events.

Update the #ifdef around the common code.  Add memory and CPU-specific
#ifdefs for their handlers.  These memory/CPU #ifdefs avoid unused
function warnings when their Kconfig option is off.

[arnd@arndb.de: rework hotplug_memory_notifier() stub]
  Link: https://lkml.kernel.org/r/20211013144029.2154629-1-arnd@kernel.org

Link: https://lkml.kernel.org/r/20210924161255.E5FE8F7E@davehans-spike.ostc.intel.com
Fixes: 884a6e5d1f93 ("mm/migrate: update node demotion order on hotplug events")
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: "Huang, Ying" &lt;ying.huang@intel.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Wei Xu &lt;weixugc@google.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Greg Thelen &lt;gthelen@google.com&gt;
Cc: Yang Shi &lt;yang.shi@linux.alibaba.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/idle_page_tracking: make PG_idle reusable</title>
<updated>2021-09-08T18:50:24+00:00</updated>
<author>
<name>SeongJae Park</name>
<email>sjpark@amazon.de</email>
</author>
<published>2021-09-08T02:56:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1c676e0d9b1a59b98885b24a0e16a81fe4cc8301'/>
<id>urn:sha1:1c676e0d9b1a59b98885b24a0e16a81fe4cc8301</id>
<content type='text'>
PG_idle and PG_young allow the two PTE Accessed bit users, Idle Page
Tracking and the reclaim logic concurrently work while not interfering
with each other.  That is, when they need to clear the Accessed bit, they
set PG_young to represent the previous state of the bit, respectively.
And when they need to read the bit, if the bit is cleared, they further
read the PG_young to know whether the other has cleared the bit meanwhile
or not.

For yet another user of the PTE Accessed bit, we could add another page
flag, or extend the mechanism to use the flags.  For the DAMON usecase,
however, we don't need to do that just yet.  IDLE_PAGE_TRACKING and DAMON
are mutually exclusive, so there's only ever going to be one user of the
current set of flags.

In this commit, we split out the CONFIG options to allow for the use of
PG_young and PG_idle outside of idle page tracking.

In the next commit, DAMON's reference implementation of the virtual memory
address space monitoring primitives will use it.

[sjpark@amazon.de: set PAGE_EXTENSION for non-64BIT]
  Link: https://lkml.kernel.org/r/20210806095153.6444-1-sj38.park@gmail.com
[akpm@linux-foundation.org: tweak Kconfig text]
[sjpark@amazon.de: hide PAGE_IDLE_FLAG from users]
  Link: https://lkml.kernel.org/r/20210813081238.34705-1-sj38.park@gmail.com

Link: https://lkml.kernel.org/r/20210716081449.22187-5-sj38.park@gmail.com
Signed-off-by: SeongJae Park &lt;sjpark@amazon.de&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Reviewed-by: Fernand Sieber &lt;sieberf@amazon.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Amit Shah &lt;amit@kernel.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Brendan Higgins &lt;brendanhiggins@google.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: David Woodhouse &lt;dwmw@amazon.com&gt;
Cc: Fan Du &lt;fan.du@intel.com&gt;
Cc: Greg Kroah-Hartman &lt;greg@kroah.com&gt;
Cc: Greg Thelen &lt;gthelen@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Joe Perches &lt;joe@perches.com&gt;
Cc: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Leonard Foerster &lt;foersleo@amazon.de&gt;
Cc: Marco Elver &lt;elver@google.com&gt;
Cc: Markus Boehme &lt;markubo@amazon.de&gt;
Cc: Maximilian Heyne &lt;mheyne@amazon.de&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Shuah Khan &lt;shuah@kernel.org&gt;
Cc: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Cc: Vladimir Davydov &lt;vdavydov.dev@gmail.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
