diff options
Diffstat (limited to 'Documentation/admin-guide/mm/transhuge.rst')
-rw-r--r-- | Documentation/admin-guide/mm/transhuge.rst | 119 |
1 files changed, 95 insertions, 24 deletions
diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst index a1bb495eab59..dff8d5985f0f 100644 --- a/Documentation/admin-guide/mm/transhuge.rst +++ b/Documentation/admin-guide/mm/transhuge.rst @@ -326,11 +326,53 @@ PMD_ORDER THP policy will be overridden. If the policy for PMD_ORDER is not defined within a valid ``thp_anon``, its policy will default to ``never``. +Similarly to ``transparent_hugepage``, you can control the hugepage +allocation policy for the internal shmem mount by using the kernel parameter +``transparent_hugepage_shmem=<policy>``, where ``<policy>`` is one of the +seven valid policies for shmem (``always``, ``within_size``, ``advise``, +``never``, ``deny``, and ``force``). + +Similarly to ``transparent_hugepage_shmem``, you can control the default +hugepage allocation policy for the tmpfs mount by using the kernel parameter +``transparent_hugepage_tmpfs=<policy>``, where ``<policy>`` is one of the +four valid policies for tmpfs (``always``, ``within_size``, ``advise``, +``never``). The tmpfs mount default policy is ``never``. + +In the same manner as ``thp_anon`` controls each supported anonymous THP +size, ``thp_shmem`` controls each supported shmem THP size. ``thp_shmem`` +has the same format as ``thp_anon``, but also supports the policy +``within_size``. + +``thp_shmem=`` may be specified multiple times to configure all THP sizes +as required. If ``thp_shmem=`` is specified at least once, any shmem THP +sizes not explicitly configured on the command line are implicitly set to +``never``. + +``transparent_hugepage_shmem`` setting only affects the global toggle. If +``thp_shmem`` is not specified, PMD_ORDER hugepage will default to +``inherit``. However, if a valid ``thp_shmem`` setting is provided by the +user, the PMD_ORDER hugepage policy will be overridden. If the policy for +PMD_ORDER is not defined within a valid ``thp_shmem``, its policy will +default to ``never``. + Hugepages in tmpfs/shmem ======================== -You can control hugepage allocation policy in tmpfs with mount option -``huge=``. It can have following values: +Traditionally, tmpfs only supported a single huge page size ("PMD"). Today, +it also supports smaller sizes just like anonymous memory, often referred +to as "multi-size THP" (mTHP). Huge pages of any size are commonly +represented in the kernel as "large folios". + +While there is fine control over the huge page sizes to use for the internal +shmem mount (see below), ordinary tmpfs mounts will make use of all available +huge page sizes without any control over the exact sizes, behaving more like +other file systems. + +tmpfs mounts +------------ + +The THP allocation policy for tmpfs mounts can be adjusted using the mount +option: ``huge=``. It can have following values: always Attempt to allocate huge pages every time we need a new page; @@ -340,24 +382,24 @@ never within_size Only allocate huge page if it will be fully within i_size. - Also respect fadvise()/madvise() hints; + Also respect madvise() hints; advise - Only allocate huge pages if requested with fadvise()/madvise(); + Only allocate huge pages if requested with madvise(); + +Remember, that the kernel may use huge pages of all available sizes, and +that no fine control as for the internal tmpfs mount is available. -The default policy is ``never``. +The default policy in the past was ``never``, but it can now be adjusted +using the kernel parameter ``transparent_hugepage_tmpfs=<policy>``. ``mount -o remount,huge= /mountpoint`` works fine after mount: remounting ``huge=never`` will not attempt to break up huge pages at all, just stop more from being allocated. -There's also sysfs knob to control hugepage allocation policy for internal -shmem mount: /sys/kernel/mm/transparent_hugepage/shmem_enabled. The mount -is used for SysV SHM, memfds, shared anonymous mmaps (of /dev/zero or -MAP_ANONYMOUS), GPU drivers' DRM objects, Ashmem. - -In addition to policies listed above, shmem_enabled allows two further -values: +In addition to policies listed above, the sysfs knob +/sys/kernel/mm/transparent_hugepage/shmem_enabled will affect the +allocation policy of tmpfs mounts, when set to the following values: deny For use in emergencies, to force the huge option off from @@ -365,13 +407,24 @@ deny force Force the huge option on for all - very useful for testing; -Shmem can also use "multi-size THP" (mTHP) by adding a new sysfs knob to -control mTHP allocation: -'/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/shmem_enabled', -and its value for each mTHP is essentially consistent with the global -setting. An 'inherit' option is added to ensure compatibility with these -global settings. Conversely, the options 'force' and 'deny' are dropped, -which are rather testing artifacts from the old ages. +shmem / internal tmpfs +---------------------- +The mount internal tmpfs mount is used for SysV SHM, memfds, shared anonymous +mmaps (of /dev/zero or MAP_ANONYMOUS), GPU drivers' DRM objects, Ashmem. + +To control the THP allocation policy for this internal tmpfs mount, the +sysfs knob /sys/kernel/mm/transparent_hugepage/shmem_enabled and the knobs +per THP size in +'/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/shmem_enabled' +can be used. + +The global knob has the same semantics as the ``huge=`` mount options +for tmpfs mounts, except that the different huge page sizes can be controlled +individually, and will only use the setting of the global knob when the +per-size knob is set to 'inherit'. + +The options 'force' and 'deny' are dropped for the individual sizes, which +are rather testing artifacts from the old ages. always Attempt to allocate <size> huge pages every time we need a new page; @@ -385,10 +438,10 @@ never within_size Only allocate <size> huge page if it will be fully within i_size. - Also respect fadvise()/madvise() hints; + Also respect madvise() hints; advise - Only allocate <size> huge pages if requested with fadvise()/madvise(); + Only allocate <size> huge pages if requested with madvise(); Need of application restart =========================== @@ -413,7 +466,7 @@ AnonHugePmdMapped). The number of file transparent huge pages mapped to userspace is available by reading ShmemPmdMapped and ShmemHugePages fields in ``/proc/meminfo``. To identify what applications are mapping file transparent huge pages, it -is necessary to read ``/proc/PID/smaps`` and count the FileHugeMapped fields +is necessary to read ``/proc/PID/smaps`` and count the FilePmdMapped fields for each mapping. Note that reading the smaps file is expensive and reading it @@ -530,10 +583,28 @@ anon_fault_fallback_charge instead falls back to using huge pages with lower orders or small pages even though the allocation was successful. -swpout - is incremented every time a huge page is swapped out in one +zswpout + is incremented every time a huge page is swapped out to zswap in one piece without splitting. +swpin + is incremented every time a huge page is swapped in from a non-zswap + swap device in one piece. + +swpin_fallback + is incremented if swapin fails to allocate or charge a huge page + and instead falls back to using huge pages with lower orders or + small pages. + +swpin_fallback_charge + is incremented if swapin fails to charge a huge page and instead + falls back to using huge pages with lower orders or small pages + even though the allocation was successful. + +swpout + is incremented every time a huge page is swapped out to a non-zswap + swap device in one piece without splitting. + swpout_fallback is incremented if a huge page has to be split before swapout. Usually because failed to allocate some continuous swap space |