<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/mm/page_ext.c, branch v6.6.141</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.141</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.141'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2023-08-18T17:12:31+00:00</updated>
<entry>
<title>mm/page_ext: move functions around for minor cleanups to page_ext</title>
<updated>2023-08-18T17:12:31+00:00</updated>
<author>
<name>Kemeng Shi</name>
<email>shikemeng@huaweicloud.com</email>
</author>
<published>2023-07-14T11:47:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=eb0da7f6e0832a689d845ca2d62152dc6b43e780'/>
<id>urn:sha1:eb0da7f6e0832a689d845ca2d62152dc6b43e780</id>
<content type='text'>
1. move page_ext_get and page_ext_put down to remove forward
   declaration of lookup_page_ext.

2. move page_ext_init_flatmem_late down to existing non SPARS block to
   remove a new non SPARS block and to keep code for non SPARS tight.

Link: https://lkml.kernel.org/r/20230714114749.1743032-4-shikemeng@huaweicloud.com
Signed-off-by: Kemeng Shi &lt;shikemeng@huaweicloud.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/page_ext: remove rollback for untouched mem_section in online_page_ext</title>
<updated>2023-08-18T17:12:31+00:00</updated>
<author>
<name>Kemeng Shi</name>
<email>shikemeng@huaweicloud.com</email>
</author>
<published>2023-07-14T11:47:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3c09be5a2be861d7f74b0251a8e77859b4c654cc'/>
<id>urn:sha1:3c09be5a2be861d7f74b0251a8e77859b4c654cc</id>
<content type='text'>
If init_section_page_ext failed, we only need rollback for mem_section
before failed mem_section.  Make rollback end point to failed mem_section
to remove unnecessary rollback.

As pfn += PAGES_PER_SECTION will be executed even if init_section_page_ext
failed.  So pfn points to mem_section after failed mem_section.  Subtract
one mem_section from pfn to get failed mem_section.

Link: https://lkml.kernel.org/r/20230714114749.1743032-3-shikemeng@huaweicloud.com
Signed-off-by: Kemeng Shi &lt;shikemeng@huaweicloud.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/page_ext: remove unused return value of offline_page_ext</title>
<updated>2023-08-18T17:12:31+00:00</updated>
<author>
<name>Kemeng Shi</name>
<email>shikemeng@huaweicloud.com</email>
</author>
<published>2023-07-14T11:47:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=063ff7cd8bf24aa14c897b6168591d3d0dae2a5e'/>
<id>urn:sha1:063ff7cd8bf24aa14c897b6168591d3d0dae2a5e</id>
<content type='text'>
Patch series "minor cleanups for page_ext".

This series contains some random minor cleanups for page_ext.  More
details can be found in respective patches.  


This patch (of 3):

offline_page_ext always returns 0 and no caller checks the return value. 
Just remove unused return value of offline_page_ext.

Link: https://lkml.kernel.org/r/20230714114749.1743032-1-shikemeng@huaweicloud.com
Link: https://lkml.kernel.org/r/20230714114749.1743032-2-shikemeng@huaweicloud.com
Signed-off-by: Kemeng Shi &lt;shikemeng@huaweicloud.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/page_ext: init page_ext early if there are no deferred struct pages</title>
<updated>2023-02-03T06:33:22+00:00</updated>
<author>
<name>Pasha Tatashin</name>
<email>pasha.tatashin@soleen.com</email>
</author>
<published>2023-01-17T20:46:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7ec7096b8577d3b899c1dae456a414f2d08c7ddb'/>
<id>urn:sha1:7ec7096b8577d3b899c1dae456a414f2d08c7ddb</id>
<content type='text'>
page_ext must be initialized after all struct pages are initialized. 
Therefore, page_ext is initialized after page_alloc_init_late(), and can
optionally be initialized earlier via early_page_ext kernel parameter
which as a side effect also disables deferred struct pages.

Allow to automatically init page_ext early when there are no deferred
struct pages in order to be able to use page_ext during kernel boot and
track for example page allocations early.

[pasha.tatashin@soleen.com: fix build with CONFIG_PAGE_EXTENSION=n]
  Link: https://lkml.kernel.org/r/20230118155251.2522985-1-pasha.tatashin@soleen.com
Link: https://lkml.kernel.org/r/20230117204617.1553748-1-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Acked-by: Mike Rapoport (IBM) &lt;rppt@kernel.org&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Charan Teja Kalla &lt;quic_charante@quicinc.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Li Zhe &lt;lizhe.67@bytedance.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/page_ext: do not allocate space for page_ext-&gt;flags if not needed</title>
<updated>2023-02-03T06:33:11+00:00</updated>
<author>
<name>Pasha Tatashin</name>
<email>pasha.tatashin@soleen.com</email>
</author>
<published>2023-01-13T15:42:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6189eb82f0aec8a877190bf52e629c687ed02773'/>
<id>urn:sha1:6189eb82f0aec8a877190bf52e629c687ed02773</id>
<content type='text'>
There is 8 byte page_ext-&gt;flags field allocated per page whenever
CONFIG_PAGE_EXTENSION is enabled.  However, not every user of page_ext
uses flags.  Therefore, check whether flags is needed at least by one user
and if so allocate space for it.

For example when page_table_check is enabled, on a machine with 128G
of memory before the fix:

[    2.244288] allocated 536870912 bytes of page_ext
after the fix:
[    2.160154] allocated 268435456 bytes of page_ext

Also, add a kernel-doc comment before page_ext_operations that describes
the fields, and remove check if need() is set, as that is now a required
field.

[pasha.tatashin@soleen.com: address comments from Mike Rapoport]
  Link: https://lkml.kernel.org/r/20230117202103.1412449-1-pasha.tatashin@soleen.com
Link: https://lkml.kernel.org/r/20230113154253.92480-1-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Reviewed-by: Mike Rapoport (IBM) &lt;rppt@kernel.org&gt;
Cc: Charan Teja Kalla &lt;quic_charante@quicinc.com&gt;
Cc: Li Zhe &lt;lizhe.67@bytedance.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'mm-hotfixes-stable' into mm-stable</title>
<updated>2022-11-30T22:58:42+00:00</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@linux-foundation.org</email>
</author>
<published>2022-11-30T22:58:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a38358c934f66bdff12db762998b88038d7bc44b'/>
<id>urn:sha1:a38358c934f66bdff12db762998b88038d7bc44b</id>
<content type='text'>
</content>
</entry>
<entry>
<title>mm/page_exit: fix kernel doc warning in page_ext_put()</title>
<updated>2022-11-23T02:50:41+00:00</updated>
<author>
<name>Charan Teja Kalla</name>
<email>quic_charante@quicinc.com</email>
</author>
<published>2022-11-08T05:16:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ed86b74874f839f0e579bdf92ea0a5aabdfabebb'/>
<id>urn:sha1:ed86b74874f839f0e579bdf92ea0a5aabdfabebb</id>
<content type='text'>
Fix the below compiler warnings reported with 'make W=1 mm/'. 
mm/page_ext.c:178: warning: Function parameter or member 'page_ext' not
described in 'page_ext_put'.

[quic_pkondeti@quicinc.com: better patch title]
Link: https://lkml.kernel.org/r/1667884582-2465-1-git-send-email-quic_charante@quicinc.com
Fixes: b1d5488a252dc9 ("mm: fix use-after free of page_ext after race with memory-offline")
Signed-off-by: Charan Teja Kalla &lt;quic_charante@quicinc.com&gt;
Reported-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Tested-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Pavan Kondeti &lt;quic_pkondeti@quicinc.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memory: move hotplug memory notifier priority to same file for easy sorting</title>
<updated>2022-11-09T01:37:17+00:00</updated>
<author>
<name>Liu Shixin</name>
<email>liushixin2@huawei.com</email>
</author>
<published>2022-09-23T03:33:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1eeaa4fd39b0b1b3e986f8eab6978e69b01e3c5e'/>
<id>urn:sha1:1eeaa4fd39b0b1b3e986f8eab6978e69b01e3c5e</id>
<content type='text'>
The priority of hotplug memory callback is defined in a different file. 
And there are some callers using numbers directly.  Collect them together
into include/linux/memory.h for easy reading.  This allows us to sort
their priorities more intuitively without additional comments.

Link: https://lkml.kernel.org/r/20220923033347.3935160-9-liushixin2@huawei.com
Signed-off-by: Liu Shixin &lt;liushixin2@huawei.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Kefeng Wang &lt;wangkefeng.wang@huawei.com&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
Cc: zefan li &lt;lizefan.x@bytedance.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>page_ext: introduce boot parameter 'early_page_ext'</title>
<updated>2022-09-12T03:26:02+00:00</updated>
<author>
<name>Li Zhe</name>
<email>lizhe.67@bytedance.com</email>
</author>
<published>2022-08-25T10:27:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c4f20f1479c456d9dd1c1e6d8bf956a25de742dc'/>
<id>urn:sha1:c4f20f1479c456d9dd1c1e6d8bf956a25de742dc</id>
<content type='text'>
In commit 2f1ee0913ce5 ("Revert "mm: use early_pfn_to_nid in
page_ext_init""), we call page_ext_init() after page_alloc_init_late() to
avoid some panic problem.  It seems that we cannot track early page
allocations in current kernel even if page structure has been initialized
early.

This patch introduces a new boot parameter 'early_page_ext' to resolve
this problem.  If we pass it to the kernel, page_ext_init() will be moved
up and the feature 'deferred initialization of struct pages' will be
disabled to initialize the page allocator early and prevent the panic
problem above.  It can help us to catch early page allocations.  This is
useful especially when we find that the free memory value is not the same
right after different kernel booting.

[akpm@linux-foundation.org: fix section issue by removing __meminitdata]
Link: https://lkml.kernel.org/r/20220825102714.669-1-lizhe.67@bytedance.com
Signed-off-by: Li Zhe &lt;lizhe.67@bytedance.com&gt;
Suggested-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Mark-PK Tsai &lt;mark-pk.tsai@mediatek.com&gt;
Cc: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix use-after free of page_ext after race with memory-offline</title>
<updated>2022-09-12T03:25:57+00:00</updated>
<author>
<name>Charan Teja Kalla</name>
<email>quic_charante@quicinc.com</email>
</author>
<published>2022-08-18T13:50:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b1d5488a252dc9c0d9574100d0b8d807bf154603'/>
<id>urn:sha1:b1d5488a252dc9c0d9574100d0b8d807bf154603</id>
<content type='text'>
The below is one path where race between page_ext and offline of the
respective memory blocks will cause use-after-free on the access of
page_ext structure.

process1		              process2
---------                             ---------
a)doing /proc/page_owner           doing memory offline
			           through offline_pages.

b) PageBuddy check is failed
   thus proceed to get the
   page_owner information
   through page_ext access.
page_ext = lookup_page_ext(page);

				    migrate_pages();
				    .................
				Since all pages are successfully
				migrated as part of the offline
				operation,send MEM_OFFLINE notification
				where for page_ext it calls:
				offline_page_ext()--&gt;
				__free_page_ext()--&gt;
				   free_page_ext()--&gt;
				     vfree(ms-&gt;page_ext)
			           mem_section-&gt;page_ext = NULL

c) Check for the PAGE_EXT
   flags in the page_ext-&gt;flags
   access results into the
   use-after-free (leading to
   the translation faults).

As mentioned above, there is really no synchronization between page_ext
access and its freeing in the memory_offline.

The memory offline steps(roughly) on a memory block is as below:

1) Isolate all the pages

2) while(1)
  try free the pages to buddy.(-&gt;free_list[MIGRATE_ISOLATE])

3) delete the pages from this buddy list.

4) Then free page_ext.(Note: The struct page is still alive as it is
   freed only during hot remove of the memory which frees the memmap,
   which steps the user might not perform).

This design leads to the state where struct page is alive but the struct
page_ext is freed, where the later is ideally part of the former which
just representing the page_flags (check [3] for why this design is
chosen).

The abovementioned race is just one example __but the problem persists in
the other paths too involving page_ext-&gt;flags access(eg:
page_is_idle())__.

Fix all the paths where offline races with page_ext access by maintaining
synchronization with rcu lock and is achieved in 3 steps:

1) Invalidate all the page_ext's of the sections of a memory block by
   storing a flag in the LSB of mem_section-&gt;page_ext.

2) Wait until all the existing readers to finish working with the
   -&gt;page_ext's with synchronize_rcu().  Any parallel process that starts
   after this call will not get page_ext, through lookup_page_ext(), for
   the block parallel offline operation is being performed.

3) Now safely free all sections -&gt;page_ext's of the block on which
   offline operation is being performed.

Note: If synchronize_rcu() takes time then optimizations can be done in
this path through call_rcu()[2].

Thanks to David Hildenbrand for his views/suggestions on the initial
discussion[1] and Pavan kondeti for various inputs on this patch.

[1] https://lore.kernel.org/linux-mm/59edde13-4167-8550-86f0-11fc67882107@quicinc.com/
[2] https://lore.kernel.org/all/a26ce299-aed1-b8ad-711e-a49e82bdd180@quicinc.com/T/#u
[3] https://lore.kernel.org/all/6fa6b7aa-731e-891c-3efb-a03d6a700efa@redhat.com/

[quic_charante@quicinc.com: rename label `loop' to `ext_put_continue' per David]
  Link: https://lkml.kernel.org/r/1661496993-11473-1-git-send-email-quic_charante@quicinc.com
Link: https://lkml.kernel.org/r/1660830600-9068-1-git-send-email-quic_charante@quicinc.com
Signed-off-by: Charan Teja Kalla &lt;quic_charante@quicinc.com&gt;
Suggested-by: David Hildenbrand &lt;david@redhat.com&gt;
Suggested-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Fernand Sieber &lt;sieberf@amazon.com&gt;
Cc: Minchan Kim &lt;minchan@google.com&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Pavan Kondeti &lt;quic_pkondeti@quicinc.com&gt;
Cc: SeongJae Park &lt;sjpark@amazon.de&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: William Kucharski &lt;william.kucharski@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
