summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
authorTejun Heo <tj@kernel.org>2017-09-25 19:00:19 +0300
committerTejun Heo <tj@kernel.org>2017-09-30 00:30:37 +0300
commit0d5936344f30aba0f6ddb92b030cb6a05168efe6 (patch)
tree5eabe1efd54035cac1594286f60d44f2dc2df786 /Documentation
parenta1f7164c7b8b0d46f63bfb4ca0bb5971c760b921 (diff)
downloadlinux-0d5936344f30aba0f6ddb92b030cb6a05168efe6.tar.xz
sched: Implement interface for cgroup unified hierarchy
There are a couple interface issues which can be addressed in cgroup2 interface. * Stats from cpuacct being reported separately from the cpu stats. * Use of different time units. Writable control knobs use microseconds, some stat fields use nanoseconds while other cpuacct stat fields use centiseconds. * Control knobs which can't be used in the root cgroup still show up in the root. * Control knob names and semantics aren't consistent with other controllers. This patchset implements cpu controller's interface on cgroup2 which adheres to the controller file conventions described in Documentation/cgroups/cgroup-v2.txt. Overall, the following changes are made. * cpuacct is implictly enabled and disabled by cpu and its information is reported through "cpu.stat" which now uses microseconds for all time durations. All time duration fields now have "_usec" appended to them for clarity. Note that cpuacct.usage_percpu is currently not included in "cpu.stat". If this information is actually called for, it will be added later. * "cpu.shares" is replaced with "cpu.weight" and operates on the standard scale defined by CGROUP_WEIGHT_MIN/DFL/MAX (1, 100, 10000). The weight is scaled to scheduler weight so that 100 maps to 1024 and the ratio relationship is preserved - if weight is W and its scaled value is S, W / 100 == S / 1024. While the mapped range is a bit smaller than the orignal scheduler weight range, the dead zones on both sides are relatively small and covers wider range than the nice value mappings. This file doesn't make sense in the root cgroup and isn't created on root. * "cpu.weight.nice" is added. When read, it reads back the nice value which is closest to the current "cpu.weight". When written, it sets "cpu.weight" to the weight value which matches the nice value. This makes it easy to configure cgroups when they're competing against threads in threaded subtrees. * "cpu.cfs_quota_us" and "cpu.cfs_period_us" are replaced by "cpu.max" which contains both quota and period. v4: - Use cgroup2 basic usage stat as the information source instead of cpuacct. v3: - Added "cpu.weight.nice" to allow using nice values when configuring the weight. The feature is requested by PeterZ. - Merge the patch to enable threaded support on cpu and cpuacct. - Dropped the bits about getting rid of cpuacct from patch description as there is a pretty strong case for making cpuacct an implicit controller so that basic cpu usage stats are always available. - Documentation updated accordingly. "cpu.rt.max" section is dropped for now. v2: - cpu_stats_show() was incorrectly using CONFIG_FAIR_GROUP_SCHED for CFS bandwidth stats and also using raw division for u64. Use CONFIG_CFS_BANDWITH and do_div() instead. "cpu.rt.max" is not included yet. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org>
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/cgroup-v2.txt36
1 files changed, 12 insertions, 24 deletions
diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index 3f8216912df0..0bbdc720dd7c 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -902,10 +902,6 @@ Controllers
CPU
---
-.. note::
-
- The interface for the cpu controller hasn't been merged yet
-
The "cpu" controllers regulates distribution of CPU cycles. This
controller implements weight and absolute bandwidth limit models for
normal scheduling policy and absolute bandwidth allocation model for
@@ -935,6 +931,18 @@ All time durations are in microseconds.
The weight in the range [1, 10000].
+ cpu.weight.nice
+ A read-write single value file which exists on non-root
+ cgroups. The default is "0".
+
+ The nice value is in the range [-20, 19].
+
+ This interface file is an alternative interface for
+ "cpu.weight" and allows reading and setting weight using the
+ same values used by nice(2). Because the range is smaller and
+ granularity is coarser for the nice values, the read value is
+ the closest approximation of the current weight.
+
cpu.max
A read-write two value file which exists on non-root cgroups.
The default is "max 100000".
@@ -947,26 +955,6 @@ All time durations are in microseconds.
$PERIOD duration. "max" for $MAX indicates no limit. If only
one number is written, $MAX is updated.
- cpu.rt.max
- .. note::
-
- The semantics of this file is still under discussion and the
- interface hasn't been merged yet
-
- A read-write two value file which exists on all cgroups.
- The default is "0 100000".
-
- The maximum realtime runtime allocation. Over-committing
- configurations are disallowed and process migrations are
- rejected if not enough bandwidth is available. It's in the
- following format::
-
- $MAX $PERIOD
-
- which indicates that the group may consume upto $MAX in each
- $PERIOD duration. If only one number is written, $MAX is
- updated.
-
Memory
------