From 4ad5a3217a193e933bc41168b000672417486c87 Mon Sep 17 00:00:00 2001
From: Roman Gushchin <guro@fb.com>
Date: Wed, 13 Dec 2017 19:49:03 +0000
Subject: cgroup, docs: document cgroup v2 device controller

Add the corresponding section in cgroup v2 documentation.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: kernel-team@fb.com
Cc: cgroups@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
---
 Documentation/cgroup-v2.txt | 33 +++++++++++++++++++++++++++++----
 1 file changed, 29 insertions(+), 4 deletions(-)
diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index 2cddab7efb20..d6efabb487e3 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -53,10 +53,11 @@ v1 is available under Documentation/cgroup-v1/.
        5-3-2. Writeback
      5-4. PID
        5-4-1. PID Interface Files
-     5-5. RDMA
-       5-5-1. RDMA Interface Files
-     5-6. Misc
-       5-6-1. perf_event
+     5-5. Device
+     5-6. RDMA
+       5-6-1. RDMA Interface Files
+     5-7. Misc
+       5-7-1. perf_event
    6. Namespace
      6-1. Basics
      6-2. The Root and Views
@@ -1429,6 +1430,30 @@ through fork() or clone(). These will return -EAGAIN if the creation
 of a new process would cause a cgroup policy to be violated.
 
 
+Device controller
+-----------------
+
+Device controller manages access to device files. It includes both
+creation of new device files (using mknod), and access to the
+existing device files.
+
+Cgroup v2 device controller has no interface files and is implemented
+on top of cgroup BPF. To control access to device files, a user may
+create bpf programs of the BPF_CGROUP_DEVICE type and attach them
+to cgroups. On an attempt to access a device file, corresponding
+BPF programs will be executed, and depending on the return value
+the attempt will succeed or fail with -EPERM.
+
+A BPF_CGROUP_DEVICE program takes a pointer to the bpf_cgroup_dev_ctx
+structure, which describes the device access attempt: access type
+(mknod/read/write) and device (type, major and minor numbers).
+If the program returns 0, the attempt fails with -EPERM, otherwise
+it succeeds.
+
+An example of BPF_CGROUP_DEVICE program may be found in the kernel
+source tree in the tools/testing/selftests/bpf/dev_cgroup.c file.
+
+
 RDMA
 ----
 
-- 
cgit v1.2.3


From d51b9dae2eb3b7b0c90dd0fd9ccd66d6934e9d8f Mon Sep 17 00:00:00 2001
From: Matt Roper <matthew.d.roper@intel.com>
Date: Fri, 29 Dec 2017 12:01:59 -0800
Subject: Documentation/cgroup-v1: fix outdated programming details

The cgroup-v1 documentation is out of date in a few places:

 * cgroup controllers can no longer be compiled as modules since commit
   3ed80a6 ("cgroup: drop module support"); the functions and fields
   referenced here no longer exist.

 * Controllers need to create of a cgroup_subsys object named
   "<name>_cgrp_subsys" instead of "<name>_subsys" since commit
   073219e ("cgroup: clean up cgroup_subsys names and initialization")

Cc: Tejun Heo <tj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
---
 Documentation/cgroup-v1/cgroups.txt | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/Documentation/cgroup-v1/cgroups.txt b/Documentation/cgroup-v1/cgroups.txt
index 308e5ff7207a..059f7063eea6 100644
--- a/Documentation/cgroup-v1/cgroups.txt
+++ b/Documentation/cgroup-v1/cgroups.txt
@@ -523,12 +523,7 @@ Accessing a task's cgroup pointer may be done in the following ways:
 Each subsystem should:
 
 - add an entry in linux/cgroup_subsys.h
-- define a cgroup_subsys object called <name>_subsys
-
-If a subsystem can be compiled as a module, it should also have in its
-module initcall a call to cgroup_load_subsys(), and in its exitcall a
-call to cgroup_unload_subsys(). It should also set its_subsys.module =
-THIS_MODULE in its .c file.
+- define a cgroup_subsys object called <name>_cgrp_subsys
 
 Each subsystem may export the following methods. The only mandatory
 methods are css_alloc/free. Any others that are null are presumed to
-- 
cgit v1.2.3


From 392536b731cfe82eea414f4b09c128ef37cd477e Mon Sep 17 00:00:00 2001
From: Matt Roper <matthew.d.roper@intel.com>
Date: Fri, 29 Dec 2017 12:02:00 -0800
Subject: cgroup: Update documentation reference

The cgroup_subsys structure references a documentation file that has been
renamed after the v1/v2 split.  Since the v2 documentation doesn't
currently contain any information on kernel interfaces for controllers,
point the user to the v1 docs.

Cc: Tejun Heo <tj@kernel.org>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/cgroup-defs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 8b7fd8eeccee..9f242b876fde 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -561,7 +561,7 @@ struct cftype {
 
 /*
  * Control Group subsystem type.
- * See Documentation/cgroups/cgroups.txt for details
+ * See Documentation/cgroup-v1/cgroups.txt for details
  */
 struct cgroup_subsys {
 	struct cgroup_subsys_state *(*css_alloc)(struct cgroup_subsys_state *parent_css);
-- 
cgit v1.2.3


From 2877cbe6501e20dd076c7677f768d50bcc87267b Mon Sep 17 00:00:00 2001
From: Vladimir Rutsky <rutsky@google.com>
Date: Tue, 2 Jan 2018 17:27:41 +0100
Subject: cgroup-v2.txt: fix typos

Signed-off-by: Vladimir Rutsky <rutsky@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
---
 Documentation/cgroup-v2.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index d6efabb487e3..c783bbb9d0fb 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -280,7 +280,7 @@ thread mode, the following conditions must be met.
   exempt from this requirement.
 
 Topology-wise, a cgroup can be in an invalid state.  Please consider
-the following toplogy::
+the following topology::
 
   A (threaded domain) - B (threaded) - C (domain, just created)
 
@@ -1064,10 +1064,10 @@ PAGE_SIZE multiple when read back.
 		reached the limit and allocation was about to fail.
 
 		Depending on context result could be invocation of OOM
-		killer and retrying allocation or failing alloction.
+		killer and retrying allocation or failing allocation.
 
 		Failed allocation in its turn could be returned into
-		userspace as -ENOMEM or siletly ignored in cases like
+		userspace as -ENOMEM or silently ignored in cases like
 		disk readahead.  For now OOM in memory cgroup kills
 		tasks iff shortage has happened inside page fault.
 
@@ -1192,7 +1192,7 @@ PAGE_SIZE multiple when read back.
 	cgroups.  The default is "max".
 
 	Swap usage hard limit.  If a cgroup's swap usage reaches this
-	limit, anonymous meomry of the cgroup will not be swapped out.
+	limit, anonymous memory of the cgroup will not be swapped out.
 
 
 Usage Guidelines
-- 
cgit v1.2.3


From c4e0842bf141178fd5bc0cfe364609ec782fb384 Mon Sep 17 00:00:00 2001
From: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
Date: Wed, 10 Jan 2018 23:33:19 +0100
Subject: cgroup, docs: document the root cgroup behavior of cpu and io
 controllers

Currently, cgroups v2 documentation contains only a generic remark that
"How resource consumption in the root cgroup is governed is up to each
controller", which isn't really telling users much, who need to dig in the
code and / or commit messages to learn the exact behavior.

In cgroups v1 at least the blkio controller had its operation with respect
to competition between child threads and child cgroups documented in
blkio-controller.txt, with references to cfq-iosched.txt.
Also, cgroups v2 documentation describes v1 behavior of both cpu and
blkio controllers in an "Issues with v1" section.

Let's document this behavior also for cgroups v2 to make life easier for
users.

Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Signed-off-by: Tejun Heo <tj@kernel.org>
---
 Documentation/cgroup-v2.txt | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index c783bbb9d0fb..74cdeaed9f7a 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -58,6 +58,9 @@ v1 is available under Documentation/cgroup-v1/.
        5-6-1. RDMA Interface Files
      5-7. Misc
        5-7-1. perf_event
+     5-N. Non-normative information
+       5-N-1. CPU controller root cgroup process behaviour
+       5-N-2. IO controller root cgroup process behaviour
    6. Namespace
      6-1. Basics
      6-2. The Root and Views
@@ -421,7 +424,9 @@ The root cgroup is exempt from this restriction.  Root contains
 processes and anonymous resource consumption which can't be associated
 with any other cgroups and requires special treatment from most
 controllers.  How resource consumption in the root cgroup is governed
-is up to each controller.
+is up to each controller (for more information on this topic please
+refer to the Non-normative information section in the Controllers
+chapter).
 
 Note that the restriction doesn't get in the way if there is no
 enabled controller in the cgroup's "cgroup.subtree_control".  This is
@@ -1506,6 +1511,35 @@ always be filtered by cgroup v2 path.  The controller can still be
 moved to a legacy hierarchy after v2 hierarchy is populated.
 
 
+Non-normative information
+-------------------------
+
+This section contains information that isn't considered to be a part of
+the stable kernel API and so is subject to change.
+
+
+CPU controller root cgroup process behaviour
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When distributing CPU cycles in the root cgroup each thread in this
+cgroup is treated as if it was hosted in a separate child cgroup of the
+root cgroup. This child cgroup weight is dependent on its thread nice
+level.
+
+For details of this mapping see sched_prio_to_weight array in
+kernel/sched/core.c file (values from this array should be scaled
+appropriately so the neutral - nice 0 - value is 100 instead of 1024).
+
+
+IO controller root cgroup process behaviour
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Root cgroup processes are hosted in an implicit leaf child node.
+When distributing IO resources this implicit child node is taken into
+account as if it was a normal child cgroup of the root cgroup with a
+weight value of 200.
+
+
 Namespace
 =========
 
-- 
cgit v1.2.3


From 08a77676f9c5fc69a681ccd2cd8140e65dcb26c7 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Tue, 9 Jan 2018 07:21:15 -0800
Subject: string: drop __must_check from strscpy() and restore strscpy() usages
 in cgroup
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

e7fd37ba1217 ("cgroup: avoid copying strings longer than the buffers")
converted possibly unsafe strncpy() usages in cgroup to strscpy().
However, although the callsites are completely fine with truncated
copied, because strscpy() is marked __must_check, it led to the
following warnings.

  kernel/cgroup/cgroup.c: In function ‘cgroup_file_name’:
  kernel/cgroup/cgroup.c:1400:10: warning: ignoring return value of ‘strscpy’, declared with attribute warn_unused_result [-Wunused-result]
     strscpy(buf, cft->name, CGROUP_FILE_NAME_MAX);
	       ^

To avoid the warnings, 50034ed49645 ("cgroup: use strlcpy() instead of
strscpy() to avoid spurious warning") switched them to strlcpy().

strlcpy() is worse than strlcpy() because it unconditionally runs
strlen() on the source string, and the only reason we switched to
strlcpy() here was because it was lacking __must_check, which doesn't
reflect any material differences between the two function.  It's just
that someone added __must_check to strscpy() and not to strlcpy().

These basic string copy operations are used in variety of ways, and
one of not-so-uncommon use cases is safely handling truncated copies,
where the caller naturally doesn't care about the return value.  The
__must_check doesn't match the actual use cases and forces users to
opt for inferior variants which lack __must_check by happenstance or
spread ugly (void) casts.

Remove __must_check from strscpy() and restore strscpy() usages in
cgroup.

Signed-off-by: Tejun Heo <tj@kernel.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
---
 include/linux/string.h | 2 +-
 kernel/cgroup/cgroup.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/string.h b/include/linux/string.h
index 410ecf17de3c..dfdf8afeac64 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -28,7 +28,7 @@ extern char * strncpy(char *,const char *, __kernel_size_t);
 size_t strlcpy(char *, const char *, size_t);
 #endif
 #ifndef __HAVE_ARCH_STRSCPY
-ssize_t __must_check strscpy(char *, const char *, size_t);
+ssize_t strscpy(char *, const char *, size_t);
 #endif
 #ifndef __HAVE_ARCH_STRCAT
 extern char * strcat(char *, const char *);
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 7e4c44538119..8cda3bc3ae22 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1397,7 +1397,7 @@ static char *cgroup_file_name(struct cgroup *cgrp, const struct cftype *cft,
 			 cgroup_on_dfl(cgrp) ? ss->name : ss->legacy_name,
 			 cft->name);
 	else
-		strlcpy(buf, cft->name, CGROUP_FILE_NAME_MAX);
+		strscpy(buf, cft->name, CGROUP_FILE_NAME_MAX);
 	return buf;
 }
 
@@ -1864,9 +1864,9 @@ void init_cgroup_root(struct cgroup_root *root, struct cgroup_sb_opts *opts)
 
 	root->flags = opts->flags;
 	if (opts->release_agent)
-		strlcpy(root->release_agent_path, opts->release_agent, PATH_MAX);
+		strscpy(root->release_agent_path, opts->release_agent, PATH_MAX);
 	if (opts->name)
-		strlcpy(root->name, opts->name, MAX_CGROUP_ROOT_NAMELEN);
+		strscpy(root->name, opts->name, MAX_CGROUP_ROOT_NAMELEN);
 	if (opts->cpuset_clone_children)
 		set_bit(CGRP_CPUSET_CLONE_CHILDREN, &root->cgrp.flags);
 }
-- 
cgit v1.2.3


From 03eac8b22194becd77d253fc546429ebb544aebf Mon Sep 17 00:00:00 2001
From: Florian Schmidt <florian.schmidt@neclab.eu>
Date: Tue, 30 Jan 2018 17:42:13 +0100
Subject: Documentation: Fix 'file_mapped' -> 'mapped_file'

There is no entry file_mapped in the memory.stat file. This looks like a
simple word flip that's gone unnoticed since 2010 (dc10e281f5fc,
memcg: update documentation).

Signed-off-by: Florian Schmidt <florian.schmidt@neclab.eu>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
---
 Documentation/cgroup-v1/memory.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/cgroup-v1/memory.txt b/Documentation/cgroup-v1/memory.txt
index cefb63639070..a4af2e124e24 100644
--- a/Documentation/cgroup-v1/memory.txt
+++ b/Documentation/cgroup-v1/memory.txt
@@ -524,9 +524,9 @@ Note:
 	Only anonymous and swap cache memory is listed as part of 'rss' stat.
 	This should not be confused with the true 'resident set size' or the
 	amount of physical memory used by the cgroup.
-	'rss + file_mapped" will give you resident set size of cgroup.
+	'rss + mapped_file" will give you resident set size of cgroup.
 	(Note: file and shmem may be shared among other cgroups. In that case,
-	 file_mapped is accounted only when the memory cgroup is owner of page
+	 mapped_file is accounted only when the memory cgroup is owner of page
 	 cache.)
 
 5.3 swappiness
-- 
cgit v1.2.3