cgroup: cgroup v2 freezer

Cgroup v1 implements the freezer controller, which provides the ability to stop the workload in a cgroup and temporarily free up some resources (cpu, io, network bandwidth and, potentially, memory) for other tasks. Cgroup v2 lacks this functionality.

This patch implements the freezer for cgroup v2.

The cgroup v2 freezer tries to put tasks into a state similar to jobctl stop. This means that tasks can be killed, ptraced (using PTRACE_SEIZE*), and interrupted. It is possible to attach to a frozen task, get some information (e.g. read registers) and detach. It is also possible to migrate a frozen task to another cgroup.

This distinguishes the cgroup v2 freezer from the cgroup v1 freezer, which mostly tried to imitate the system-wide freezer. Although uninterruptible sleep is fine when all tasks are going to be frozen (the hibernation case), it is not an acceptable state for a subset of the system.

The cgroup v2 freezer does not support freezing kthreads. If a non-root cgroup contains a kthread, the cgroup can still be frozen, but the kthread will remain running, the cgroup will be shown as non-frozen, and the notification will not be delivered.

* PTRACE_ATTACH does not work because non-fatal signal delivery is blocked in the frozen state.

There are also some interface differences between the cgroup v1 and cgroup v2 freezers, which are required to conform to the cgroup v2 interface design principles:
1) There is no separate controller that has to be turned on: the functionality is always available and is represented by the cgroup.freeze and cgroup.events cgroup control files.
2) The desired state is defined by the cgroup.freeze control file. Any hierarchical configuration is allowed.
3) The interface is asynchronous. The actual state is available through the cgroup.events control file ("frozen" field). There are no dedicated transitional states.
4) Any changes to the cgroup hierarchy are allowed (creating new cgroups, removing old cgroups, moving tasks between cgroups), regardless of whether some cgroups are frozen.

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
No-objection-from-me-by: Oleg Nesterov <oleg@redhat.com>
Cc: kernel-team@fb.com
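
The userspace side of the interface described in points 1) to 3) can be sketched roughly as follows. This is an illustration only, not part of the patch. The helper names (parse_frozen, set_freeze, read_frozen) are hypothetical, and the sketch assumes that "1" and "0" are the values written to cgroup.freeze and that cgroup.events holds one "key value" pair per line, as in the cgroup v2 documentation.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse the "frozen" field from the text of a cgroup.events file.
 * Assumed layout: one "key value" pair per line, e.g.
 * "populated 1\nfrozen 0\n".
 * Returns 1 or 0 for the reported state, -1 if the field is absent.
 */
static int parse_frozen(const char *events)
{
	const char *line = events;

	while (line && *line) {
		int value;

		/* Matches only when the line starts with "frozen". */
		if (sscanf(line, "frozen %d", &value) == 1)
			return value ? 1 : 0;
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/*
 * Set the desired state by writing "1" or "0" to
 * <cgroup_dir>/cgroup.freeze. Returns 0 on success, -1 on error.
 */
static int set_freeze(const char *cgroup_dir, int freeze)
{
	char path[4096];
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "%s/cgroup.freeze", cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs(freeze ? "1" : "0", f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Read the actual state from <cgroup_dir>/cgroup.events.
 * Returns 1, 0, or -1 (file unreadable or field missing).
 */
static int read_frozen(const char *cgroup_dir)
{
	char path[4096];
	char buf[1024];
	size_t len;
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "%s/cgroup.events", cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	len = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[len] = '\0';
	return parse_frozen(buf);
}
```

Because the interface is asynchronous, a caller would typically call set_freeze() and then check read_frozen() again later (or wait for a change notification on cgroup.events) rather than assume the cgroup is frozen as soon as the write returns.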
author: Roman Gushchin <guro@fb.com> 2019-04-19 20:03:04 +0300
committer: Tejun Heo <tj@kernel.org> 2019-04-19 21:26:48 +0300
commit: 76f969e8948d82e78e1bc4beb6b9465908e74873 (patch)
tree: 1f5459d94820c5e5ea7293b103e8531d389c15c1 /kernel/fork.c
parent: 4dcabece4c3a9f9522127be12cc12cc120399b2f (diff)
download: linux-76f969e8948d82e78e1bc4beb6b9465908e74873.tar.xz
1 files changed, 2 insertions, 0 deletions
diff --git a/kernel/fork.c b/kernel/fork.c
index 9dcd18aa210b..8097a0cce4db 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1222,7 +1222,9 @@ static int wait_for_vfork_done(struct task_struct *child,
 	int killed;
 
 	freezer_do_not_count();
+	cgroup_enter_frozen();
 	killed = wait_for_completion_killable(vfork);
+	cgroup_leave_frozen(false);
 	freezer_count();
 
 	if (killed) {