diff options
author | Huang Ying <ying.huang@intel.com> | 2021-09-03 00:59:33 +0300 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2021-09-03 19:58:16 +0300 |
commit | 20b51af15e014cac63b58a4f8b8b323ac35bccce (patch) | |
tree | 1c718dff92e64dfef017c46ba2662c844353bce8 /Documentation/ABI | |
parent | 3a235693d3930e1276c8d9cc0ca5807ef292cf0a (diff) | |
download | linux-20b51af15e014cac63b58a4f8b8b323ac35bccce.tar.xz |
mm/migrate: add sysfs interface to enable reclaim migration
Some method is obviously needed to enable reclaim-based migration.
Just like traditional autonuma, there will be some workloads that will
benefit like workloads with more "static" configurations where hot pages
stay hot and cold pages stay cold. If pages come and go from the hot and
cold sets, the benefits of this approach will be more limited.
The benefits are truly workload-based and *not* hardware-based. We do not
believe that there is a viable threshold where certain hardware
configurations should have this mechanism enabled while others do not.
To be conservative, earlier work defaulted to disable reclaim- based
migration and did not include a mechanism to enable it. This proposes add
a new sysfs file
/sys/kernel/mm/numa/demotion_enabled
as a method to enable it.
We are open to any alternative that allows end users to enable this
mechanism or disable it if workload harm is detected (just like
traditional autonuma).
Once this is enabled page demotion may move data to a NUMA node that does
not fall into the cpuset of the allocating process. This could be
construed to violate the guarantees of cpusets. However, since this is an
opt-in mechanism, the assumption is that anyone enabling it is content to
relax the guarantees.
Link: https://lkml.kernel.org/r/20210721063926.3024591-9-ying.huang@intel.com
Link: https://lkml.kernel.org/r/20210715055145.195411-10-ying.huang@intel.com
Signed-off-by: Huang Ying <ying.huang@intel.com>
Originally-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'Documentation/ABI')
-rw-r--r-- | Documentation/ABI/testing/sysfs-kernel-mm-numa | 24 |
1 files changed, 24 insertions, 0 deletions
diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-numa b/Documentation/ABI/testing/sysfs-kernel-mm-numa new file mode 100644 index 000000000000..77e559d4ed80 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-kernel-mm-numa @@ -0,0 +1,24 @@ +What: /sys/kernel/mm/numa/ +Date: June 2021 +Contact: Linux memory management mailing list <linux-mm@kvack.org> +Description: Interface for NUMA + +What: /sys/kernel/mm/numa/demotion_enabled +Date: June 2021 +Contact: Linux memory management mailing list <linux-mm@kvack.org> +Description: Enable/disable demoting pages during reclaim + + Page migration during reclaim is intended for systems + with tiered memory configurations. These systems have + multiple types of memory with varied performance + characteristics instead of plain NUMA systems where + the same kind of memory is found at varied distances. + Allowing page migration during reclaim enables these + systems to migrate pages from fast tiers to slow tiers + when the fast tier is under pressure. This migration + is performed before swap. It may move data to a NUMA + node that does not fall into the cpuset of the + allocating process which might be construed to violate + the guarantees of cpusets. This should not be enabled + on systems which need strict cpuset location + guarantees. |