author | Linus Torvalds <torvalds@linux-foundation.org> | 2020-12-15 23:53:37 +0300 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2020-12-15 23:53:37 +0300 |
commit | ac73e3dc8acd0a3be292755db30388c3580f5674 (patch) | |
tree | 5abef6cb82b205b5dbbb69dca950b8a5aae716de /Documentation/admin-guide | |
parent | 148842c98a24e508aecb929718818fbf4c2a6ff3 (diff) | |
parent | dfefd226b0bf7c435a58d75a0ce2f9273b9825f6 (diff) | |
Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
- a few random little subsystems
- almost all of the MM patches which are staged ahead of linux-next
material. I'll trickle to post-linux-next work in as the dependents
get merged up.
Subsystems affected by this patch series: kthread, kbuild, ide, ntfs,
ocfs2, arch, and mm (slab-generic, slab, slub, dax, debug, pagecache,
gup, swap, shmem, memcg, pagemap, mremap, hmm, vmalloc, documentation,
kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction,
oom-kill, migration, cma, page-poison, userfaultfd, zswap, zsmalloc,
uaccess, zram, and cleanups).
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (200 commits)
mm: cleanup kstrto*() usage
mm: fix fall-through warnings for Clang
mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at
mm: shmem: convert shmem_enabled_show to use sysfs_emit_at
mm:backing-dev: use sysfs_emit in macro defining functions
mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening
mm: use sysfs_emit for struct kobject * uses
mm: fix kernel-doc markups
zram: break the strict dependency from lzo
zram: add stat to gather incompressible pages since zram set up
zram: support page writeback
mm/process_vm_access: remove redundant initialization of iov_r
mm/zsmalloc.c: rework the list_add code in insert_zspage()
mm/zswap: move to use crypto_acomp API for hardware acceleration
mm/zswap: fix passing zero to 'PTR_ERR' warning
mm/zswap: make struct kernel_param_ops definitions const
userfaultfd/selftests: hint the test runner on required privilege
userfaultfd/selftests: fix retval check for userfaultfd_open()
userfaultfd/selftests: always dump something in modes
userfaultfd: selftests: make __{s,u}64 format specifiers portable
...
Diffstat (limited to 'Documentation/admin-guide')
-rw-r--r-- | Documentation/admin-guide/blockdev/zram.rst | 6 |
-rw-r--r-- | Documentation/admin-guide/cgroup-v1/memcg_test.rst | 8 |
-rw-r--r-- | Documentation/admin-guide/cgroup-v1/memory.rst | 40 |
-rw-r--r-- | Documentation/admin-guide/cgroup-v2.rst | 11 |
-rw-r--r-- | Documentation/admin-guide/mm/transhuge.rst | 15 |
-rw-r--r-- | Documentation/admin-guide/sysctl/vm.rst | 15 |
6 files changed, 43 insertions, 52 deletions
diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst
index 9093228cf18b..700329d25f57 100644
--- a/Documentation/admin-guide/blockdev/zram.rst
+++ b/Documentation/admin-guide/blockdev/zram.rst
@@ -266,6 +266,7 @@ line of text and contains the following stats separated by whitespace:
                   No memory is allocated for such pages.
  pages_compacted  the number of pages freed during compaction
  huge_pages       the number of incompressible pages
+ huge_pages_since the number of incompressible pages since zram set up
  ================ =============================================================
 
 File /sys/block/zram<id>/bd_stat
@@ -334,6 +335,11 @@ Admin can request writeback of those idle pages at right timing via::
 
 With the command, zram writeback idle pages from memory to the storage.
 
+If admin want to write a specific page in zram device to backing device,
+they could write a page index into the interface.
+
+	echo "page_index=1251" > /sys/block/zramX/writeback
+
 If there are lots of write IO with flash device, potentially, it has
 flash wearout problem so that admin needs to design write limitation
 to guarantee storage health for entire product life.
diff --git a/Documentation/admin-guide/cgroup-v1/memcg_test.rst b/Documentation/admin-guide/cgroup-v1/memcg_test.rst
index 3f7115e07b5d..4f83de2dab6e 100644
--- a/Documentation/admin-guide/cgroup-v1/memcg_test.rst
+++ b/Documentation/admin-guide/cgroup-v1/memcg_test.rst
@@ -219,13 +219,11 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.
 
 	This is an easy way to test page migration, too.
 
-9.5 mkdir/rmdir
----------------
+9.5 nested cgroups
+------------------
 
-	When using hierarchy, mkdir/rmdir test should be done.
-	Use tests like the following::
+	Use tests like the following for testing nested cgroups::
 
-		echo 1 >/opt/cgroup/01/memory/use_hierarchy
 		mkdir /opt/cgroup/01/child_a
 		mkdir /opt/cgroup/01/child_b
 
diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 12757e63b26c..a44cd467d218 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -77,6 +77,8 @@ Brief summary of control files.
  memory.soft_limit_in_bytes          set/show soft limit of memory usage
  memory.stat                         show various statistics
  memory.use_hierarchy                set/show hierarchical account enabled
+                                     This knob is deprecated and shouldn't be
+                                     used.
  memory.force_empty                  trigger forced page reclaim
  memory.pressure_level               set memory pressure notifications
  memory.swappiness                   set/show swappiness parameter of vmscan
@@ -495,16 +497,13 @@ cgroup might have some charge associated with it, even though all
 tasks have migrated away from it. (because we charge against pages, not
 against tasks.)
 
-We move the stats to root (if use_hierarchy==0) or parent (if
-use_hierarchy==1), and no change on the charge except uncharging
+We move the stats to parent, and no change on the charge except uncharging
 from the child.
 
 Charges recorded in swap information is not updated at removal of cgroup.
 Recorded information is discarded and a cgroup which uses swap (swapcache)
 will be charged as a new owner of it.
 
-About use_hierarchy, see Section 6.
-
 5. Misc. interfaces
 ===================
 
@@ -527,8 +526,6 @@ About use_hierarchy, see Section 6.
   write will still return success. In this case, it is expected that
   memory.kmem.usage_in_bytes == memory.usage_in_bytes.
 
-  About use_hierarchy, see Section 6.
-
 5.2 stat file
 -------------
 
@@ -675,31 +672,20 @@ hierarchy::
 		      d   e
 
 In the diagram above, with hierarchical accounting enabled, all memory
-usage of e, is accounted to its ancestors up until the root (i.e, c and root),
-that has memory.use_hierarchy enabled. If one of the ancestors goes over its
-limit, the reclaim algorithm reclaims from the tasks in the ancestor and the
-children of the ancestor.
-
-6.1 Enabling hierarchical accounting and reclaim
-------------------------------------------------
-
-A memory cgroup by default disables the hierarchy feature. Support
-can be enabled by writing 1 to memory.use_hierarchy file of the root cgroup::
+usage of e, is accounted to its ancestors up until the root (i.e, c and root).
+If one of the ancestors goes over its limit, the reclaim algorithm reclaims
+from the tasks in the ancestor and the children of the ancestor.
 
-	# echo 1 > memory.use_hierarchy
-
-The feature can be disabled by::
+6.1 Hierarchical accounting and reclaim
+---------------------------------------
 
-	# echo 0 > memory.use_hierarchy
+Hierarchical accounting is enabled by default. Disabling the hierarchical
+accounting is deprecated. An attempt to do it will result in a failure
+and a warning printed to dmesg.
 
-NOTE1:
-       Enabling/disabling will fail if either the cgroup already has other
-       cgroups created below it, or if the parent cgroup has use_hierarchy
-       enabled.
+For compatibility reasons writing 1 to memory.use_hierarchy will always pass::
 
-NOTE2:
-       When panic_on_oom is set to "2", the whole system will panic in
-       case of an OOM event in any cgroup.
+	# echo 1 > memory.use_hierarchy
 
 7. Soft limits
 ==============
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 608d7c279396..63521cd36ce5 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1274,6 +1274,9 @@ PAGE_SIZE multiple when read back.
 	  kernel_stack
 		Amount of memory allocated to kernel stacks.
 
+	  pagetables
+		Amount of memory allocated for page tables.
+
 	  percpu(npn)
 		Amount of memory used for storing per-cpu kernel
 		data structures.
@@ -1300,6 +1303,14 @@ PAGE_SIZE multiple when read back.
 		Amount of memory used in anonymous mappings backed by
 		transparent hugepages
 
+	  file_thp
+		Amount of cached filesystem data backed by transparent
+		hugepages
+
+	  shmem_thp
+		Amount of shm, tmpfs, shared anonymous mmap()s backed by
+		transparent hugepages
+
 	  inactive_anon, active_anon, inactive_file, active_file, unevictable
 		Amount of memory, swap-backed and filesystem-backed,
 		on the internal memory management lists used by the
diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index b2acd0d395ca..3b8a336511a4 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -401,21 +401,6 @@ compact_fail
 	is incremented if the system tries to compact memory
 	but failed.
 
-compact_pages_moved
-	is incremented each time a page is moved. If
-	this value is increasing rapidly, it implies that the system
-	is copying a lot of data to satisfy the huge page allocation.
-	It is possible that the cost of copying exceeds any savings
-	from reduced TLB misses.
-
-compact_pagemigrate_failed
-	is incremented when the underlying mechanism
-	for moving a page failed.
-
-compact_blocks_moved
-	is incremented each time memory compaction examines
-	a huge page aligned range of pages.
-
 It is possible to establish how long the stalls were using the function
 tracer to record how long was spent in __alloc_pages_nodemask and
 using the mm_page_alloc tracepoint to identify which allocations were
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index e0cf17ad2ae5..e972caa43bd4 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -873,12 +873,17 @@ file-backed pages is less than the high watermark in a zone.
 unprivileged_userfaultfd
 ========================
 
-This flag controls whether unprivileged users can use the userfaultfd
-system calls. Set this to 1 to allow unprivileged users to use the
-userfaultfd system calls, or set this to 0 to restrict userfaultfd to only
-privileged users (with SYS_CAP_PTRACE capability).
+This flag controls the mode in which unprivileged users can use the
+userfaultfd system calls. Set this to 0 to restrict unprivileged users
+to handle page faults in user mode only. In this case, users without
+SYS_CAP_PTRACE must pass UFFD_USER_MODE_ONLY in order for userfaultfd to
+succeed. Prohibiting use of userfaultfd for handling faults from kernel
+mode may make certain vulnerabilities more difficult to exploit.
 
-The default value is 1.
+Set this to 1 to allow unprivileged users to use the userfaultfd system
+calls without any restrictions.
+
+The default value is 0.
 
 
 user_reserve_kbytes
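
The new unprivileged_userfaultfd semantics in the vm.rst hunk can be read as a small decision rule. The following is a minimal Python sketch of that rule, not code from the patch series: it assumes the sysctl is exposed at /proc/sys/vm/unprivileged_userfaultfd (the usual procfs location for vm sysctls), and the helper names parse_sysctl, read_sysctl and must_pass_user_mode_only are hypothetical. "privileged" stands for holding the capability the documentation text calls SYS_CAP_PTRACE.

```python
from pathlib import Path
from typing import Optional

# Assumed procfs location of the sysctl described in sysctl/vm.rst.
SYSCTL_PATH = Path("/proc/sys/vm/unprivileged_userfaultfd")


def parse_sysctl(text: str) -> int:
    """Parse the sysctl contents; the documentation only describes 0 and 1."""
    value = int(text.strip())
    if value not in (0, 1):
        raise ValueError(f"unexpected unprivileged_userfaultfd value: {value}")
    return value


def read_sysctl(path: Path = SYSCTL_PATH) -> Optional[int]:
    """Return the current setting, or None if the file cannot be read
    (for example on a kernel without userfaultfd support)."""
    try:
        return parse_sysctl(path.read_text())
    except OSError:
        return None


def must_pass_user_mode_only(sysctl_value: int, privileged: bool) -> bool:
    """Hypothetical helper mirroring the documented rule: with the sysctl
    at 0, a caller without the ptrace capability has to pass
    UFFD_USER_MODE_ONLY for userfaultfd to succeed; with the sysctl at 1,
    or for a privileged caller, no restriction applies."""
    return sysctl_value == 0 and not privileged


if __name__ == "__main__":
    current = read_sysctl()
    if current is None:
        print("unprivileged_userfaultfd is not readable on this system")
    else:
        needed = must_pass_user_mode_only(current, privileged=False)
        print(f"sysctl={current}, unprivileged caller needs UFFD_USER_MODE_ONLY: {needed}")
```

Keeping the rule in a pure function separate from the file read makes it checkable without touching procfs; the new default of 0 corresponds to the restricted case for unprivileged callers.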