<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/security/selinux, branch v5.9.10</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.9.10</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.9.10'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2020-11-18T18:22:28+00:00</updated>
<entry>
<title>selinux: Fix error return code in sel_ib_pkey_sid_slow()</title>
<updated>2020-11-18T18:22:28+00:00</updated>
<author>
<name>Chen Zhou</name>
<email>chenzhou10@huawei.com</email>
</author>
<published>2020-11-12T13:53:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=23f880dc21bdfc20b50b967cef03dabd873329c2'/>
<id>urn:sha1:23f880dc21bdfc20b50b967cef03dabd873329c2</id>
<content type='text'>
commit c350f8bea271782e2733419bd2ab9bf4ec2051ef upstream.

Fix to return a negative error code from the error handling case
instead of 0 in function sel_ib_pkey_sid_slow(), as done elsewhere
in this function.

Cc: stable@vger.kernel.org
Fixes: 409dcf31538a ("selinux: Add a cache for quicker retreival of PKey SIDs")
Reported-by: Hulk Robot &lt;hulkci@huawei.com&gt;
Signed-off-by: Chen Zhou &lt;chenzhou10@huawei.com&gt;
Signed-off-by: Paul Moore &lt;paul@paul-moore.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>selinux: access policycaps with READ_ONCE/WRITE_ONCE</title>
<updated>2020-11-05T10:51:21+00:00</updated>
<author>
<name>Stephen Smalley</name>
<email>stephen.smalley.work@gmail.com</email>
</author>
<published>2020-09-10T14:28:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4c210f8a0907d85687a9732f67f54e19d51847d9'/>
<id>urn:sha1:4c210f8a0907d85687a9732f67f54e19d51847d9</id>
<content type='text'>
[ Upstream commit e8ba53d0023a76ba0f50e6ee3e6288c5442f9d33 ]

Use READ_ONCE/WRITE_ONCE for all accesses to the
selinux_state.policycaps booleans to prevent compiler
mischief.

Signed-off-by: Stephen Smalley &lt;stephen.smalley.work@gmail.com&gt;
Signed-off-by: Paul Moore &lt;paul@paul-moore.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>treewide: Use fallthrough pseudo-keyword</title>
<updated>2020-08-23T22:36:59+00:00</updated>
<author>
<name>Gustavo A. R. Silva</name>
<email>gustavoars@kernel.org</email>
</author>
<published>2020-08-23T22:36:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=df561f6688fef775baa341a0f5d960becd248b11'/>
<id>urn:sha1:df561f6688fef775baa341a0f5d960becd248b11</id>
<content type='text'>
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'cap-checkpoint-restore-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux</title>
<updated>2020-08-04T22:02:07+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-08-04T22:02:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=74858abbb1032222f922487fd1a24513bbed80f9'/>
<id>urn:sha1:74858abbb1032222f922487fd1a24513bbed80f9</id>
<content type='text'>
Pull checkpoint-restore updates from Christian Brauner:
 "This enables unprivileged checkpoint/restore of processes.

  Given that this work has been going on for quite some time the first
  sentence in this summary is hopefully more exciting than the actual
  final code changes required. Unprivileged checkpoint/restore has seen
  a frequent increase in interest over the last two years and has thus
  been one of the main topics for the combined containers &amp;
  checkpoint/restore microconference since at least 2018 (cf. [1]).

  Here are just the three most frequent use-cases that were brought forward:

   - The JVM developers are integrating checkpoint/restore into a Java
     VM to significantly decrease the startup time.

   - In high-performance computing environment a resource manager will
     typically be distributing jobs where users are always running as
     non-root. Long-running and "large" processes with significant
     startup times are supposed to be checkpointed and restored with
     CRIU.

   - Container migration as a non-root user.

  In all of these scenarios it is either desirable or required to run
  without CAP_SYS_ADMIN. The userspace implementation of
  checkpoint/restore CRIU already has the pull request for supporting
  unprivileged checkpoint/restore up (cf. [2]).

  To enable unprivileged checkpoint/restore a new dedicated capability
  CAP_CHECKPOINT_RESTORE is introduced. This solution has last been
  discussed in 2019 in a talk by Google at Linux Plumbers (cf. [1]
  "Update on Task Migration at Google Using CRIU") with Adrian and
  Nicolas providing the implementation now over the last months. In
  essence, this allows the CRIU binary to be installed with the
  CAP_CHECKPOINT_RESTORE vfs capability set thereby enabling
  unprivileged users to restore processes.

  To make this possible the following permissions are altered:

   - Selecting a specific PID via clone3() set_tid relaxed from userns
     CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE.

   - Selecting a specific PID via /proc/sys/kernel/ns_last_pid relaxed
     from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE.

   - Accessing /proc/pid/map_files relaxed from init userns
     CAP_SYS_ADMIN to init userns CAP_CHECKPOINT_RESTORE.

   - Changing /proc/self/exe from userns CAP_SYS_ADMIN to userns
     CAP_CHECKPOINT_RESTORE.

  Of these four changes the /proc/self/exe change deserves a few words
  because the reasoning behind even restricting /proc/self/exe changes
  in the first place is just full of historical quirks and tracking this
  down was a questionable version of fun that I'd like to spare others.

  In short, it is trivial to change /proc/self/exe as an unprivileged
  user, i.e. without userns CAP_SYS_ADMIN right now. Either via ptrace()
  or by simply intercepting the elf loader in userspace during exec.
  Nicolas was nice enough to even provide a POC for the latter (cf. [3])
  to illustrate this fact.

  The original patchset which introduced PR_SET_MM_MAP had no
  permissions around changing the exe link. They too argued that it is
  trivial to spoof the exe link already which is true. The argument
  brought up against this was that the Tomoyo LSM uses the exe link in
  tomoyo_manager() to detect whether the calling process is a policy
  manager. This caused changing the exe links to be guarded by userns
  CAP_SYS_ADMIN.

  All in all this rather seems like a "better guard it with something
  rather than nothing" argument which imho doesn't qualify as a great
  security policy. Again, because spoofing the exe link is possible for
  the calling process so even if this were security relevant it was
  broken back then and would be broken today. So technically, dropping
  all permissions around changing the exe link would probably be
  possible and would send a clearer message to any userspace that relies
  on /proc/self/exe for security reasons that they should stop doing
  this but for now we're only relaxing the exe link permissions from
  userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE.

  There's a final uapi change in here. Changing the exe link used to
  accidently return EINVAL when the caller lacked the necessary
  permissions instead of the more correct EPERM. This pr contains a
  commit fixing this. I assume that userspace won't notice or care and
  if they do I will revert this commit. But since we are changing the
  permissions anyway it seems like a good opportunity to try this fix.

  With these changes merged unprivileged checkpoint/restore will be
  possible and has already been tested by various users"

[1] LPC 2018
     1. "Task Migration at Google Using CRIU"
        https://www.youtube.com/watch?v=yI_1cuhoDgA&amp;t=12095
     2. "Securely Migrating Untrusted Workloads with CRIU"
        https://www.youtube.com/watch?v=yI_1cuhoDgA&amp;t=14400
     LPC 2019
     1. "CRIU and the PID dance"
         https://www.youtube.com/watch?v=LN2CUgp8deo&amp;list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&amp;index=9&amp;t=2m48s
     2. "Update on Task Migration at Google Using CRIU"
        https://www.youtube.com/watch?v=LN2CUgp8deo&amp;list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&amp;index=9&amp;t=1h2m8s

[2] https://github.com/checkpoint-restore/criu/pull/1155

[3] https://github.com/nviennot/run_as_exe

* tag 'cap-checkpoint-restore-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  selftests: add clone3() CAP_CHECKPOINT_RESTORE test
  prctl: exe link permission error changed from -EINVAL to -EPERM
  prctl: Allow local CAP_CHECKPOINT_RESTORE to change /proc/self/exe
  proc: allow access in init userns for map_files with CAP_CHECKPOINT_RESTORE
  pid_namespace: use checkpoint_restore_ns_capable() for ns_last_pid
  pid: use checkpoint_restore_ns_capable() for set_tid
  capabilities: Introduce CAP_CHECKPOINT_RESTORE
</content>
</entry>
<entry>
<title>Merge tag 'selinux-pr-20200803' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux</title>
<updated>2020-08-04T21:18:01+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-08-04T21:18:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=49e917deeb81e263bcdb4b20e61ca18111995ffe'/>
<id>urn:sha1:49e917deeb81e263bcdb4b20e61ca18111995ffe</id>
<content type='text'>
Pull selinux updates from Paul Moore:
 "Beyond the usual smattering of bug fixes, we've got three small
  improvements worth highlighting:

   - improved SELinux policy symbol table performance due to a reworking
     of the insert and search functions

   - allow reading of SELinux labels before the policy is loaded,
     allowing for some more "exotic" initramfs approaches

   - improved checking an error reporting about process
     class/permissions during SELinux policy load"

* tag 'selinux-pr-20200803' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: complete the inlining of hashtab functions
  selinux: prepare for inlining of hashtab functions
  selinux: specialize symtab insert and search functions
  selinux: Fix spelling mistakes in the comments
  selinux: fixed a checkpatch warning with the sizeof macro
  selinux: log error messages on required process class / permissions
  scripts/selinux/mdp: fix initial SID handling
  selinux: allow reading labels before policy is loaded
</content>
</entry>
<entry>
<title>capabilities: Introduce CAP_CHECKPOINT_RESTORE</title>
<updated>2020-07-19T18:14:42+00:00</updated>
<author>
<name>Adrian Reber</name>
<email>areber@redhat.com</email>
</author>
<published>2020-07-19T10:04:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=124ea650d3072b005457faed69909221c2905a1f'/>
<id>urn:sha1:124ea650d3072b005457faed69909221c2905a1f</id>
<content type='text'>
This patch introduces CAP_CHECKPOINT_RESTORE, a new capability facilitating
checkpoint/restore for non-root users.

Over the last years, The CRIU (Checkpoint/Restore In Userspace) team has
been asked numerous times if it is possible to checkpoint/restore a
process as non-root. The answer usually was: 'almost'.

The main blocker to restore a process as non-root was to control the PID
of the restored process. This feature available via the clone3 system
call, or via /proc/sys/kernel/ns_last_pid is unfortunately guarded by
CAP_SYS_ADMIN.

In the past two years, requests for non-root checkpoint/restore have
increased due to the following use cases:
* Checkpoint/Restore in an HPC environment in combination with a
  resource manager distributing jobs where users are always running as
  non-root. There is a desire to provide a way to checkpoint and
  restore long running jobs.
* Container migration as non-root
* We have been in contact with JVM developers who are integrating
  CRIU into a Java VM to decrease the startup time. These
  checkpoint/restore applications are not meant to be running with
  CAP_SYS_ADMIN.

We have seen the following workarounds:
* Use a setuid wrapper around CRIU:
  See https://github.com/FredHutch/slurm-examples/blob/master/checkpointer/lib/checkpointer/checkpointer-suid.c
* Use a setuid helper that writes to ns_last_pid.
  Unfortunately, this helper delegation technique is impossible to use
  with clone3, and is thus prone to races.
  See https://github.com/twosigma/set_ns_last_pid
* Cycle through PIDs with fork() until the desired PID is reached:
  This has been demonstrated to work with cycling rates of 100,000 PIDs/s
  See https://github.com/twosigma/set_ns_last_pid
* Patch out the CAP_SYS_ADMIN check from the kernel
* Run the desired application in a new user and PID namespace to provide
  a local CAP_SYS_ADMIN for controlling PIDs. This technique has limited
  use in typical container environments (e.g., Kubernetes) as /proc is
  typically protected with read-only layers (e.g., /proc/sys) for
  hardening purposes. Read-only layers prevent additional /proc mounts
  (due to proc's SB_I_USERNS_VISIBLE property), making the use of new
  PID namespaces limited as certain applications need access to /proc
  matching their PID namespace.

The introduced capability allows to:
* Control PIDs when the current user is CAP_CHECKPOINT_RESTORE capable
  for the corresponding PID namespace via ns_last_pid/clone3.
* Open files in /proc/pid/map_files when the current user is
  CAP_CHECKPOINT_RESTORE capable in the root namespace, useful for
  recovering files that are unreachable via the file system such as
  deleted files, or memfd files.

See corresponding selftest for an example with clone3().

Signed-off-by: Adrian Reber &lt;areber@redhat.com&gt;
Signed-off-by: Nicolas Viennot &lt;Nicolas.Viennot@twosigma.com&gt;
Reviewed-by: Serge Hallyn &lt;serge@hallyn.com&gt;
Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Link: https://lore.kernel.org/r/20200719100418.2112740-2-areber@redhat.com
Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
</content>
</entry>
<entry>
<title>selinux: complete the inlining of hashtab functions</title>
<updated>2020-07-09T23:08:16+00:00</updated>
<author>
<name>Ondrej Mosnacek</name>
<email>omosnace@redhat.com</email>
</author>
<published>2020-07-09T19:19:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=54b27f9287a7b3dfc85549f01fc9d292c92c68b9'/>
<id>urn:sha1:54b27f9287a7b3dfc85549f01fc9d292c92c68b9</id>
<content type='text'>
Move (most of) the definitions of hashtab_search() and hashtab_insert()
to the header file. In combination with the previous patch, this avoids
calling the callbacks indirectly by function pointers and allows for
better optimization, leading to a drastic performance improvement of
these operations.

With this patch, I measured a speed up in the following areas (measured
on x86_64 F32 VM with 4 CPUs):
  1. Policy load (`load_policy`) - takes ~150 ms instead of ~230 ms.
  2. `chcon -R unconfined_u:object_r:user_tmp_t:s0:c381,c519 /tmp/linux-src`
     where /tmp/linux-src is an extracted linux-5.7 source tarball -
     takes ~522 ms instead of ~576 ms. This is because of many
     symtab_search() calls in string_to_context_struct() when there are
     many categories specified in the context.
  3. `stress-ng --msg 1 --msg-ops 10000000` - takes 12.41 s instead of
     13.95 s (consumes 18.6 s of kernel CPU time instead of 21.6 s).
     This is thanks to security_transition_sid() being ~43% faster after
     this patch.

Signed-off-by: Ondrej Mosnacek &lt;omosnace@redhat.com&gt;
Acked-by: Stephen Smalley &lt;stephen.smalley.work@gmail.com&gt;
Signed-off-by: Paul Moore &lt;paul@paul-moore.com&gt;
</content>
</entry>
<entry>
<title>selinux: prepare for inlining of hashtab functions</title>
<updated>2020-07-09T23:05:36+00:00</updated>
<author>
<name>Ondrej Mosnacek</name>
<email>omosnace@redhat.com</email>
</author>
<published>2020-07-09T19:19:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=24def7bb92c19337cee26d506f87dc4eeeba7a19'/>
<id>urn:sha1:24def7bb92c19337cee26d506f87dc4eeeba7a19</id>
<content type='text'>
Refactor searching and inserting into hashtabs to pave the way for
converting hashtab_search() and hashtab_insert() to inline functions in
the next patch. This will avoid indirect calls and allow the compiler to
better optimize individual callers, leading to a significant performance
improvement.

In order to avoid the indirect calls, the key hashing and comparison
callbacks need to be extracted from the hashtab struct and passed
directly to hashtab_search()/_insert() by the callers so that the
callback address is always known at compile time. The kernel's
rhashtable library (&lt;linux/rhashtable*.h&gt;) does the same thing.

This of course makes the hashtab functions slightly easier to misuse by
passing a wrong callback set, but unfortunately there is no better way
to implement a hash table that is both generic and efficient in C. This
patch tries to somewhat mitigate this by only calling the hashtab
functions in the same file where the corresponding callbacks are
defined (wrapping them into more specialized functions as needed).

Note that this patch doesn't bring any benefit without also moving the
definitions of hashtab_search() and -_insert() to the header file, which
is done in a follow-up patch for easier review of the hashtab.c changes
in this patch.

Signed-off-by: Ondrej Mosnacek &lt;omosnace@redhat.com&gt;
Acked-by: Stephen Smalley &lt;stephen.smalley.work@gmail.com&gt;
Signed-off-by: Paul Moore &lt;paul@paul-moore.com&gt;
</content>
</entry>
<entry>
<title>selinux: specialize symtab insert and search functions</title>
<updated>2020-07-09T00:21:43+00:00</updated>
<author>
<name>Ondrej Mosnacek</name>
<email>omosnace@redhat.com</email>
</author>
<published>2020-07-08T11:24:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=237389e3015e0f4ceac7cf00c70a59746150561d'/>
<id>urn:sha1:237389e3015e0f4ceac7cf00c70a59746150561d</id>
<content type='text'>
This encapsulates symtab a little better and will help with further
refactoring later.

Signed-off-by: Ondrej Mosnacek &lt;omosnace@redhat.com&gt;
Acked-by: Stephen Smalley &lt;stephen.smalley.work@gmail.com&gt;
Signed-off-by: Paul Moore &lt;paul@paul-moore.com&gt;
</content>
</entry>
<entry>
<title>selinux: Fix spelling mistakes in the comments</title>
<updated>2020-07-08T16:15:52+00:00</updated>
<author>
<name>lihao</name>
<email>fly.lihao@huawei.com</email>
</author>
<published>2020-07-07T04:00:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2c3d8dfecedd7f5e24e09730a992b59b1c6f956e'/>
<id>urn:sha1:2c3d8dfecedd7f5e24e09730a992b59b1c6f956e</id>
<content type='text'>
Fix spelling mistakes in the comments
    quering==&gt;querying

Signed-off-by: lihao &lt;fly.lihao@huawei.com&gt;
Signed-off-by: Paul Moore &lt;paul@paul-moore.com&gt;
</content>
</entry>
</feed>
