<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/base/memory.c, branch v4.14.263</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.14.263</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.14.263'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2019-12-01T08:14:14+00:00</updated>
<entry>
<title>mm/memory_hotplug: Do not unlock when fails to take the device_hotplug_lock</title>
<updated>2019-12-01T08:14:14+00:00</updated>
<author>
<name>zhong jiang</name>
<email>zhongjiang@huawei.com</email>
</author>
<published>2019-04-08T04:07:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=058aa7307046ee8afd4386bb9710df5518d94569'/>
<id>urn:sha1:058aa7307046ee8afd4386bb9710df5518d94569</id>
<content type='text'>
[ Upstream commit d2ab99403ee00d8014e651728a4702ea1ae5e52c ]

When adding the memory by probing memory block in sysfs interface, there is an
obvious issue that we will unlock the device_hotplug_lock when fails to takes it.

That issue was introduced in Commit 8df1d0e4a265
("mm/memory_hotplug: make add_memory() take the device_hotplug_lock")

We should drop out in time when fails to take the device_hotplug_lock.

Fixes: 8df1d0e4a265 ("mm/memory_hotplug: make add_memory() take the device_hotplug_lock")
Reported-by: Yang yingliang &lt;yangyingliang@huawei.com&gt;
Signed-off-by: zhong jiang &lt;zhongjiang@huawei.com&gt;
Reviewed-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: stable &lt;stable@vger.kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm/memory_hotplug: make add_memory() take the device_hotplug_lock</title>
<updated>2019-12-01T08:13:57+00:00</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2018-10-30T22:10:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5cb8388a680a363ba9a8cca8f81687f9b0d238bb'/>
<id>urn:sha1:5cb8388a680a363ba9a8cca8f81687f9b0d238bb</id>
<content type='text'>
[ Upstream commit 8df1d0e4a265f25dc1e7e7624ccdbcb4a6630c89 ]

add_memory() currently does not take the device_hotplug_lock, however
is aleady called under the lock from
	arch/powerpc/platforms/pseries/hotplug-memory.c
	drivers/acpi/acpi_memhotplug.c
to synchronize against CPU hot-remove and similar.

In general, we should hold the device_hotplug_lock when adding memory to
synchronize against online/offline request (e.g.  from user space) - which
already resulted in lock inversions due to device_lock() and
mem_hotplug_lock - see 30467e0b3be ("mm, hotplug: fix concurrent memory
hot-add deadlock").  add_memory()/add_memory_resource() will create memory
block devices, so this really feels like the right thing to do.

Holding the device_hotplug_lock makes sure that a memory block device
can really only be accessed (e.g. via .online/.state) from user space,
once the memory has been fully added to the system.

The lock is not held yet in
	drivers/xen/balloon.c
	arch/powerpc/platforms/powernv/memtrace.c
	drivers/s390/char/sclp_cmd.c
	drivers/hv/hv_balloon.c
So, let's either use the locked variants or take the lock.

Don't export add_memory_resource(), as it once was exported to be used by
XEN, which is never built as a module.  If somebody requires it, we also
have to export a locked variant (as device_hotplug_lock is never
exported).

Link: http://lkml.kernel.org/r/20180925091457.28651-3-david@redhat.com
Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Pavel Tatashin &lt;pavel.tatashin@microsoft.com&gt;
Reviewed-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reviewed-by: Rashmica Gupta &lt;rashmica.g@gmail.com&gt;
Reviewed-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@rjwysocki.net&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Nathan Fontenot &lt;nfont@linux.vnet.ibm.com&gt;
Cc: John Allen &lt;jallen@linux.vnet.ibm.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Mathieu Malaterre &lt;malat@debian.org&gt;
Cc: Pavel Tatashin &lt;pavel.tatashin@microsoft.com&gt;
Cc: YASUAKI ISHIMATSU &lt;yasu.isimatu@gmail.com&gt;
Cc: Balbir Singh &lt;bsingharora@gmail.com&gt;
Cc: Haiyang Zhang &lt;haiyangz@microsoft.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Kate Stewart &lt;kstewart@linuxfoundation.org&gt;
Cc: "K. Y. Srinivasan" &lt;kys@microsoft.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Michael Neuling &lt;mikey@neuling.org&gt;
Cc: Philippe Ombredanne &lt;pombredanne@nexb.com&gt;
Cc: Stephen Hemminger &lt;sthemmin@microsoft.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>drivers/base/memory.c: don't access uninitialized memmaps in soft_offline_page_store()</title>
<updated>2019-10-29T08:17:36+00:00</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2019-10-19T03:19:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=872f16e35f53c8d3fb296051016cf761f718a31a'/>
<id>urn:sha1:872f16e35f53c8d3fb296051016cf761f718a31a</id>
<content type='text'>
commit 641fe2e9387a36f9ee01d7c69382d1fe147a5e98 upstream.

Uninitialized memmaps contain garbage and in the worst case trigger kernel
BUGs, especially with CONFIG_PAGE_POISONING.  They should not get touched.

Right now, when trying to soft-offline a PFN that resides on a memory
block that was never onlined, one gets a misleading error with
CONFIG_PAGE_POISONING:

  :/# echo 5637144576 &gt; /sys/devices/system/memory/soft_offline_page
  [   23.097167] soft offline: 0x150000 page already poisoned

But the actual result depends on the garbage in the memmap.

soft_offline_page() can only work with online pages, it returns -EIO in
case of ZONE_DEVICE.  Make sure to only forward pages that are online
(iow, managed by the buddy) and, therefore, have an initialized memmap.

Add a check against pfn_to_online_page() and similarly return -EIO.

Link: http://lkml.kernel.org/r/20191010141200.8985-1-david@redhat.com
Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e86b319]
Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: "Rafael J. Wysocki" &lt;rafael@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[4.13+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>License cleanup: add SPDX GPL-2.0 license identifier to files with no license</title>
<updated>2017-11-02T10:10:55+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2017-11-01T14:07:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b24413180f5600bcb3bb70fbed5cf186b60864bd'/>
<id>urn:sha1:b24413180f5600bcb3bb70fbed5cf186b60864bd</id>
<content type='text'>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode &amp; Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained &gt;5
   lines of source
 - File already had some variant of a license header in it (even if &lt;5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart &lt;kstewart@linuxfoundation.org&gt;
Reviewed-by: Philippe Ombredanne &lt;pombredanne@nexb.com&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm, memory_hotplug: remove zone restrictions</title>
<updated>2017-09-07T00:27:25+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2017-09-06T23:19:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c6f03e2903c9ecd8fd709a5b3fa8cf0a8ae0b3da'/>
<id>urn:sha1:c6f03e2903c9ecd8fd709a5b3fa8cf0a8ae0b3da</id>
<content type='text'>
Historically we have enforced that any kernel zone (e.g ZONE_NORMAL) has
to precede the Movable zone in the physical memory range.  The purpose
of the movable zone is, however, not bound to any physical memory
restriction.  It merely defines a class of migrateable and reclaimable
memory.

There are users (e.g.  CMA) who might want to reserve specific physical
memory ranges for their own purpose.  Moreover our pfn walkers have to
be prepared for zones overlapping in the physical range already because
we do support interleaving NUMA nodes and therefore zones can interleave
as well.  This means we can allow each memory block to be associated
with a different zone.

Loosen the current onlining semantic and allow explicit onlining type on
any memblock.  That means that online_{kernel,movable} will be allowed
regardless of the physical address of the memblock as long as it is
offline of course.  This might result in moveble zone overlapping with
other kernel zones.  Default onlining then becomes a bit tricky but
still sensible.  echo online &gt; memoryXY/state will online the given
block to

	1) the default zone if the given range is outside of any zone
	2) the enclosing zone if such a zone doesn't interleave with
	   any other zone
        3) the default zone if more zones interleave for this range

where default zone is movable zone only if movable_node is enabled
otherwise it is a kernel zone.

Here is an example of the semantic with (movable_node is not present but
it work in an analogous way). We start with following memblocks, all of
them offline:

  memory34/valid_zones:Normal Movable
  memory35/valid_zones:Normal Movable
  memory36/valid_zones:Normal Movable
  memory37/valid_zones:Normal Movable
  memory38/valid_zones:Normal Movable
  memory39/valid_zones:Normal Movable
  memory40/valid_zones:Normal Movable
  memory41/valid_zones:Normal Movable

Now, we online block 34 in default mode and block 37 as movable

  root@test1:/sys/devices/system/node/node1# echo online &gt; memory34/state
  root@test1:/sys/devices/system/node/node1# echo online_movable &gt; memory37/state
  memory34/valid_zones:Normal
  memory35/valid_zones:Normal Movable
  memory36/valid_zones:Normal Movable
  memory37/valid_zones:Movable
  memory38/valid_zones:Normal Movable
  memory39/valid_zones:Normal Movable
  memory40/valid_zones:Normal Movable
  memory41/valid_zones:Normal Movable

As we can see all other blocks can still be onlined both into Normal and
Movable zones and the Normal is default because the Movable zone spans
only block37 now.

  root@test1:/sys/devices/system/node/node1# echo online_movable &gt; memory41/state
  memory34/valid_zones:Normal
  memory35/valid_zones:Normal Movable
  memory36/valid_zones:Normal Movable
  memory37/valid_zones:Movable
  memory38/valid_zones:Movable Normal
  memory39/valid_zones:Movable Normal
  memory40/valid_zones:Movable Normal
  memory41/valid_zones:Movable

Now the default zone for blocks 37-41 has changed because movable zone
spans that range.

  root@test1:/sys/devices/system/node/node1# echo online_kernel &gt; memory39/state
  memory34/valid_zones:Normal
  memory35/valid_zones:Normal Movable
  memory36/valid_zones:Normal Movable
  memory37/valid_zones:Movable
  memory38/valid_zones:Normal Movable
  memory39/valid_zones:Normal
  memory40/valid_zones:Movable Normal
  memory41/valid_zones:Movable

Note that the block 39 now belongs to the zone Normal and so block38
falls into Normal by default as well.

For completness

  root@test1:/sys/devices/system/node/node1# for i in memory[34]?
  do
	echo online &gt; $i/state 2&gt;/dev/null
  done

  memory34/valid_zones:Normal
  memory35/valid_zones:Normal
  memory36/valid_zones:Normal
  memory37/valid_zones:Movable
  memory38/valid_zones:Normal
  memory39/valid_zones:Normal
  memory40/valid_zones:Movable
  memory41/valid_zones:Movable

Implementation wise the change is quite straightforward.  We can get rid
of allow_online_pfn_range altogether.  online_pages allows only offline
nodes already.  The original default_zone_for_pfn will become
default_kernel_zone_for_pfn.  New default_zone_for_pfn implements the
above semantic.  zone_for_pfn_range is slightly reorganized to implement
kernel and movable online type explicitly and MMOP_ONLINE_KEEP becomes a
catch all default behavior.

Link: http://lkml.kernel.org/r/20170714121233.16861-3-mhocko@kernel.org
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Reza Arbab &lt;arbab@linux.vnet.ibm.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Yasuaki Ishimatsu &lt;yasu.isimatu@gmail.com&gt;
Cc: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Cc: Kani Toshimitsu &lt;toshi.kani@hpe.com&gt;
Cc: &lt;slaoub@gmail.com&gt;
Cc: Daniel Kiper &lt;daniel.kiper@oracle.com&gt;
Cc: Igor Mammedov &lt;imammedo@redhat.com&gt;
Cc: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Cc: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Cc: &lt;linux-api@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, memory_hotplug: display allowed zones in the preferred ordering</title>
<updated>2017-09-07T00:27:25+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2017-09-06T23:19:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e5e68930263377c6d4f6da0ff06f36b55d83a83f'/>
<id>urn:sha1:e5e68930263377c6d4f6da0ff06f36b55d83a83f</id>
<content type='text'>
Prior to commit f1dd2cd13c4b ("mm, memory_hotplug: do not associate
hotadded memory to zones until online") we used to allow to change the
valid zone types of a memory block if it is adjacent to a different zone
type.

This fact was reflected in memoryNN/valid_zones by the ordering of
printed zones.  The first one was default (echo online &gt; memoryNN/state)
and the other one could be onlined explicitly by online_{movable,kernel}.

This behavior was removed by the said patch and as such the ordering was
not all that important.  In most cases a kernel zone would be default
anyway.  The only exception is movable_node handled by "mm,
memory_hotplug: support movable_node for hotpluggable nodes".

Let's reintroduce this behavior again because later patch will remove
the zone overlap restriction and so user will be allowed to online
kernel resp.  movable block regardless of its placement.  Original
behavior will then become significant again because it would be
non-trivial for users to see what is the default zone to online into.

Implementation is really simple.  Pull out zone selection out of
move_pfn_range into zone_for_pfn_range helper and use it in
show_valid_zones to display the zone for default onlining and then both
kernel and movable if they are allowed.  Default online zone is not
duplicated.

Link: http://lkml.kernel.org/r/20170714121233.16861-2-mhocko@kernel.org
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Reza Arbab &lt;arbab@linux.vnet.ibm.com&gt;
Cc: Yasuaki Ishimatsu &lt;yasu.isimatu@gmail.com&gt;
Cc: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Cc: Kani Toshimitsu &lt;toshi.kani@hpe.com&gt;
Cc: &lt;slaoub@gmail.com&gt;
Cc: Daniel Kiper &lt;daniel.kiper@oracle.com&gt;
Cc: Igor Mammedov &lt;imammedo@redhat.com&gt;
Cc: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Cc: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, memory_hotplug: do not assume ZONE_NORMAL is default kernel zone</title>
<updated>2017-07-06T23:24:32+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2017-07-06T22:38:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c246a213f5bad687c6c2cea27d7265eaf8f6f5d7'/>
<id>urn:sha1:c246a213f5bad687c6c2cea27d7265eaf8f6f5d7</id>
<content type='text'>
Heiko Carstens has noticed that he can generate overlapping zones for
ZONE_DMA and ZONE_NORMAL:

  DMA      [mem 0x0000000000000000-0x000000007fffffff]
  Normal   [mem 0x0000000080000000-0x000000017fffffff]

  $ cat /sys/devices/system/memory/block_size_bytes
  10000000
  $ cat /sys/devices/system/memory/memory5/valid_zones
  DMA
  $ echo 0 &gt; /sys/devices/system/memory/memory5/online
  $ cat /sys/devices/system/memory/memory5/valid_zones
  Normal
  $ echo 1 &gt; /sys/devices/system/memory/memory5/online
  Normal

  $ cat /proc/zoneinfo
  Node 0, zone      DMA
  spanned  524288        &lt;-----
  present  458752
  managed  455078
  start_pfn:           0 &lt;-----

  Node 0, zone   Normal
  spanned  720896
  present  589824
  managed  571648
  start_pfn:           327680 &lt;-----

The reason is that we assume that the default zone for kernel onlining
is ZONE_NORMAL.  This was a simplification introduced by the memory
hotplug rework and it is easily fixable by checking the range overlap in
the zone order and considering the first matching zone as the default
one.  If there is no such zone then assume ZONE_NORMAL as we have been
doing so far.

Fixes: "mm, memory_hotplug: do not associate hotadded memory to zones until online"
Link: http://lkml.kernel.org/r/20170601083746.4924-3-mhocko@kernel.org
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Reported-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Tested-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Reza Arbab &lt;arbab@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, memory_hotplug: do not associate hotadded memory to zones until online</title>
<updated>2017-07-06T23:24:32+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2017-07-06T22:38:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f1dd2cd13c4bbbc9a7c4617b3b034fa643de98fe'/>
<id>urn:sha1:f1dd2cd13c4bbbc9a7c4617b3b034fa643de98fe</id>
<content type='text'>
The current memory hotplug implementation relies on having all the
struct pages associate with a zone/node during the physical hotplug
phase (arch_add_memory-&gt;__add_pages-&gt;__add_section-&gt;__add_zone).  In the
vast majority of cases this means that they are added to ZONE_NORMAL.
This has been so since 9d99aaa31f59 ("[PATCH] x86_64: Support memory
hotadd without sparsemem") and it wasn't a big deal back then because
movable onlining didn't exist yet.

Much later memory hotplug wanted to (ab)use ZONE_MOVABLE for movable
onlining 511c2aba8f07 ("mm, memory-hotplug: dynamic configure movable
memory and portion memory") and then things got more complicated.
Rather than reconsidering the zone association which was no longer
needed (because the memory hotplug already depended on SPARSEMEM) a
convoluted semantic of zone shifting has been developed.  Only the
currently last memblock or the one adjacent to the zone_movable can be
onlined movable.  This essentially means that the online type changes as
the new memblocks are added.

Let's simulate memory hot online manually
  $ echo 0x100000000 &gt; /sys/devices/system/memory/probe
  $ grep . /sys/devices/system/memory/memory32/valid_zones
  Normal Movable

  $ echo $((0x100000000+(128&lt;&lt;20))) &gt; /sys/devices/system/memory/probe
  $ grep . /sys/devices/system/memory/memory3?/valid_zones
  /sys/devices/system/memory/memory32/valid_zones:Normal
  /sys/devices/system/memory/memory33/valid_zones:Normal Movable

  $ echo $((0x100000000+2*(128&lt;&lt;20))) &gt; /sys/devices/system/memory/probe
  $ grep . /sys/devices/system/memory/memory3?/valid_zones
  /sys/devices/system/memory/memory32/valid_zones:Normal
  /sys/devices/system/memory/memory33/valid_zones:Normal
  /sys/devices/system/memory/memory34/valid_zones:Normal Movable

  $ echo online_movable &gt; /sys/devices/system/memory/memory34/state
  $ grep . /sys/devices/system/memory/memory3?/valid_zones
  /sys/devices/system/memory/memory32/valid_zones:Normal
  /sys/devices/system/memory/memory33/valid_zones:Normal Movable
  /sys/devices/system/memory/memory34/valid_zones:Movable Normal

This is an awkward semantic because an udev event is sent as soon as the
block is onlined and an udev handler might want to online it based on
some policy (e.g.  association with a node) but it will inherently race
with new blocks showing up.

This patch changes the physical online phase to not associate pages with
any zone at all.  All the pages are just marked reserved and wait for
the onlining phase to be associated with the zone as per the online
request.  There are only two requirements

	- existing ZONE_NORMAL and ZONE_MOVABLE cannot overlap

	- ZONE_NORMAL precedes ZONE_MOVABLE in physical addresses

the latter one is not an inherent requirement and can be changed in the
future.  It preserves the current behavior and made the code slightly
simpler.  This is subject to change in future.

This means that the same physical online steps as above will lead to the
following state: Normal Movable

  /sys/devices/system/memory/memory32/valid_zones:Normal Movable
  /sys/devices/system/memory/memory33/valid_zones:Normal Movable

  /sys/devices/system/memory/memory32/valid_zones:Normal Movable
  /sys/devices/system/memory/memory33/valid_zones:Normal Movable
  /sys/devices/system/memory/memory34/valid_zones:Normal Movable

  /sys/devices/system/memory/memory32/valid_zones:Normal Movable
  /sys/devices/system/memory/memory33/valid_zones:Normal Movable
  /sys/devices/system/memory/memory34/valid_zones:Movable

Implementation:
The current move_pfn_range is reimplemented to check the above
requirements (allow_online_pfn_range) and then updates the respective
zone (move_pfn_range_to_zone), the pgdat and links all the pages in the
pfn range with the zone/node.  __add_pages is updated to not require the
zone and only initializes sections in the range.  This allowed to
simplify the arch_add_memory code (s390 could get rid of quite some of
code).

devm_memremap_pages is the only user of arch_add_memory which relies on
the zone association because it only hooks into the memory hotplug only
half way.  It uses it to associate the new memory with ZONE_DEVICE but
doesn't allow it to be {on,off}lined via sysfs.  This means that this
particular code path has to call move_pfn_range_to_zone explicitly.

The original zone shifting code is kept in place and will be removed in
the follow up patch for an easier review.

Please note that this patch also changes the original behavior when
offlining a memory block adjacent to another zone (Normal vs.  Movable)
used to allow to change its movable type.  This will be handled later.

[richard.weiyang@gmail.com: simplify zone_intersects()]
  Link: http://lkml.kernel.org/r/20170616092335.5177-1-richard.weiyang@gmail.com
[richard.weiyang@gmail.com: remove duplicate call for set_page_links]
  Link: http://lkml.kernel.org/r/20170616092335.5177-2-richard.weiyang@gmail.com
[akpm@linux-foundation.org: remove unused local `i']
Link: http://lkml.kernel.org/r/20170515085827.16474-12-mhocko@kernel.org
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Signed-off-by: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Tested-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Tested-by: Reza Arbab &lt;arbab@linux.vnet.ibm.com&gt;
Acked-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt; # For s390 bits
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Balbir Singh &lt;bsingharora@gmail.com&gt;
Cc: Daniel Kiper &lt;daniel.kiper@oracle.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Igor Mammedov &lt;imammedo@redhat.com&gt;
Cc: Jerome Glisse &lt;jglisse@redhat.com&gt;
Cc: Joonsoo Kim &lt;js1304@gmail.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Tobias Regnery &lt;tobias.regnery@gmail.com&gt;
Cc: Toshi Kani &lt;toshi.kani@hpe.com&gt;
Cc: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Cc: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Cc: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, memory_hotplug: consider offline memblocks removable</title>
<updated>2017-07-06T23:24:32+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2017-07-06T22:37:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8b0662f245a328df8873be949b0087760420f40c'/>
<id>urn:sha1:8b0662f245a328df8873be949b0087760420f40c</id>
<content type='text'>
is_pageblock_removable_nolock() relies on having zone association to
examine all the page blocks to check whether they are movable or free.
This is just wasting of cycles when the memblock is offline.  Later
patch in the series will also change the time when the page is
associated with a zone so we let's bail out early if the memblock is
offline.

Link: http://lkml.kernel.org/r/20170515085827.16474-7-mhocko@kernel.org
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Reported-by: Igor Mammedov &lt;imammedo@redhat.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Balbir Singh &lt;bsingharora@gmail.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Daniel Kiper &lt;daniel.kiper@oracle.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Jerome Glisse &lt;jglisse@redhat.com&gt;
Cc: Joonsoo Kim &lt;js1304@gmail.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Reza Arbab &lt;arbab@linux.vnet.ibm.com&gt;
Cc: Tobias Regnery &lt;tobias.regnery@gmail.com&gt;
Cc: Toshi Kani &lt;toshi.kani@hpe.com&gt;
Cc: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Cc: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Cc: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, memory_hotplug: get rid of is_zone_device_section</title>
<updated>2017-07-06T23:24:32+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2017-07-06T22:37:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1b862aecfbd419cdc4553645bf86d07554279bed'/>
<id>urn:sha1:1b862aecfbd419cdc4553645bf86d07554279bed</id>
<content type='text'>
Device memory hotplug hooks into regular memory hotplug only half way.
It needs memory sections to track struct pages but there is no
need/desire to associate those sections with memory blocks and export
them to the userspace via sysfs because they cannot be onlined anyway.

This is currently expressed by for_device argument to arch_add_memory
which then makes sure to associate the given memory range with
ZONE_DEVICE.  register_new_memory then relies on is_zone_device_section
to distinguish special memory hotplug from the regular one.  While this
works now, later patches in this series want to move __add_zone outside
of arch_add_memory path so we have to come up with something else.

Add want_memblock down the __add_pages path and use it to control
whether the section-&gt;memblock association should be done.
arch_add_memory then just trivially want memblock for everything but
for_device hotplug.

remove_memory_section doesn't need is_zone_device_section either.  We
can simply skip all the memblock specific cleanup if there is no
memblock for the given section.

This shouldn't introduce any functional change.

Link: http://lkml.kernel.org/r/20170515085827.16474-5-mhocko@kernel.org
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Tested-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Balbir Singh &lt;bsingharora@gmail.com&gt;
Cc: Daniel Kiper &lt;daniel.kiper@oracle.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Igor Mammedov &lt;imammedo@redhat.com&gt;
Cc: Jerome Glisse &lt;jglisse@redhat.com&gt;
Cc: Joonsoo Kim &lt;js1304@gmail.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Reza Arbab &lt;arbab@linux.vnet.ibm.com&gt;
Cc: Tobias Regnery &lt;tobias.regnery@gmail.com&gt;
Cc: Toshi Kani &lt;toshi.kani@hpe.com&gt;
Cc: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Cc: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Cc: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
