<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/user_namespace.h, branch v6.6.132</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-11-14T12:19:39+00:00</updated>
<entry>
<title>signal: restore the override_rlimit logic</title>
<updated>2024-11-14T12:19:39+00:00</updated>
<author>
<name>Roman Gushchin</name>
<email>roman.gushchin@linux.dev</email>
</author>
<published>2024-11-04T19:54:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4877d9b2a2ebad3ae240127aaa4cb8258b145cf7'/>
<id>urn:sha1:4877d9b2a2ebad3ae240127aaa4cb8258b145cf7</id>
<content type='text'>
commit 9e05e5c7ee8758141d2db7e8fea2cab34500c6ed upstream.

Prior to commit d64696905554 ("Reimplement RLIMIT_SIGPENDING on top of
ucounts") UCOUNT_RLIMIT_SIGPENDING rlimit was not enforced for a class of
signals.  However now it's enforced unconditionally, even if
override_rlimit is set.  This behavior change caused production issues.

For example, if the limit is reached and a process receives a SIGSEGV
signal, sigqueue_alloc fails to allocate the necessary resources for the
signal delivery, preventing the signal from being delivered with siginfo.
This prevents the process from correctly identifying the fault address and
handling the error.  From the user-space perspective, applications are
unaware that the limit has been reached and that the siginfo is
effectively 'corrupted'.  This can lead to unpredictable behavior and
crashes, as we observed with java applications.

Fix this by passing override_rlimit into inc_rlimit_get_ucounts() and skip
the comparison to max there if override_rlimit is set.  This effectively
restores the old behavior.

Link: https://lkml.kernel.org/r/20241104195419.3962584-1-roman.gushchin@linux.dev
Fixes: d64696905554 ("Reimplement RLIMIT_SIGPENDING on top of ucounts")
Signed-off-by: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Co-developed-by: Andrei Vagin &lt;avagin@google.com&gt;
Signed-off-by: Andrei Vagin &lt;avagin@google.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ucounts: Split rlimit and ucount values and max values</title>
<updated>2022-05-18T23:24:57+00:00</updated>
<author>
<name>Alexey Gladkov</name>
<email>legion@kernel.org</email>
</author>
<published>2022-05-18T17:17:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=de399236e240743ad2dd10d719c37b97ddf31996'/>
<id>urn:sha1:de399236e240743ad2dd10d719c37b97ddf31996</id>
<content type='text'>
Since the semantics of maximum rlimit values are different, it would be
better not to mix ucount and rlimit values. This will prevent the error
of using inc_count/dec_ucount for rlimit parameters.

This patch also renames the functions to emphasize the lack of
connection between rlimit and ucount.

v3:
- Fix BUG:KASAN:use-after-free_in_dec_ucount.

v2:
- Fix the array-index-out-of-bounds that was found by the lkp project.

Reported-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Signed-off-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Link: https://lkml.kernel.org/r/20220518171730.l65lmnnjtnxnftpq@example.org
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>ucounts: Fix signal ucount refcounting</title>
<updated>2021-10-18T21:02:30+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2021-10-16T20:59:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=15bc01effefe97757ef02ca09e9d1b927ab22725'/>
<id>urn:sha1:15bc01effefe97757ef02ca09e9d1b927ab22725</id>
<content type='text'>
In commit fda31c50292a ("signal: avoid double atomic counter
increments for user accounting") Linus made a clever optimization to
how rlimits and the struct user_struct.  Unfortunately that
optimization does not work in the obvious way when moved to nested
rlimits.  The problem is that the last decrement of the per user
namespace per user sigpending counter might also be the last decrement
of the sigpending counter in the parent user namespace as well.  Which
means that simply freeing the leaf ucount in __free_sigqueue is not
enough.

Maintain the optimization and handle the tricky cases by introducing
inc_rlimit_get_ucounts and dec_rlimit_put_ucounts.

By moving the entire optimization into functions that perform all of
the work it becomes possible to ensure that every level is handled
properly.

The new function inc_rlimit_get_ucounts returns 0 on failure to
increment the ucount.  This is different than inc_rlimit_ucounts which
increments the ucounts and returns LONG_MAX if the ucount counter has
exceeded it's maximum or it wrapped (to indicate the counter needs to
decremented).

I wish we had a single user to account all pending signals to across
all of the threads of a process so this complexity was not necessary

Cc: stable@vger.kernel.org
Fixes: d64696905554 ("Reimplement RLIMIT_SIGPENDING on top of ucounts")
v1: https://lkml.kernel.org/r/87mtnavszx.fsf_-_@disp2133
Link: https://lkml.kernel.org/r/87fssytizw.fsf_-_@disp2133
Reviewed-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Tested-by: Rune Kleveland &lt;rune.kleveland@infomedia.dk&gt;
Tested-by: Yu Zhao &lt;yuzhao@google.com&gt;
Tested-by: Jordan Glover &lt;Golden_Miller83@protonmail.ch&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace</title>
<updated>2021-06-29T03:39:26+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-06-29T03:39:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c54b245d011855ea91c5beff07f1db74143ce614'/>
<id>urn:sha1:c54b245d011855ea91c5beff07f1db74143ce614</id>
<content type='text'>
Pull user namespace rlimit handling update from Eric Biederman:
 "This is the work mainly by Alexey Gladkov to limit rlimits to the
  rlimits of the user that created a user namespace, and to allow users
  to have stricter limits on the resources created within a user
  namespace."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  cred: add missing return error code when set_cred_ucounts() failed
  ucounts: Silence warning in dec_rlimit_ucounts
  ucounts: Set ucount_max to the largest positive value the type can hold
  kselftests: Add test to check for rlimit changes in different user namespaces
  Reimplement RLIMIT_MEMLOCK on top of ucounts
  Reimplement RLIMIT_SIGPENDING on top of ucounts
  Reimplement RLIMIT_MSGQUEUE on top of ucounts
  Reimplement RLIMIT_NPROC on top of ucounts
  Use atomic_t for ucounts reference counting
  Add a reference to ucounts for each cred
  Increase size of ucounts to atomic_long_t
</content>
</entry>
<entry>
<title>ucounts: Set ucount_max to the largest positive value the type can hold</title>
<updated>2021-04-30T19:14:03+00:00</updated>
<author>
<name>Alexey Gladkov</name>
<email>legion@kernel.org</email>
</author>
<published>2021-04-22T12:27:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c1ada3dc7219b02b3467aa906c2f5f8b098578d1'/>
<id>urn:sha1:c1ada3dc7219b02b3467aa906c2f5f8b098578d1</id>
<content type='text'>
The ns-&gt;ucount_max[] is signed long which is less than the rlimit size.
We have to protect ucount_max[] from overflow and only use the largest
value that we can hold.

On 32bit using "long" instead of "unsigned long" to hold the counts has
the downside that RLIMIT_MSGQUEUE and RLIMIT_MEMLOCK are limited to 2GiB
instead of 4GiB. I don't think anyone cares but it should be mentioned
in case someone does.

The RLIMIT_NPROC and RLIMIT_SIGPENDING used atomic_t so their maximum
hasn't changed.

Signed-off-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Link: https://lkml.kernel.org/r/1825a5dfa18bc5a570e79feb05e2bd07fd57e7e3.1619094428.git.legion@kernel.org
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>Reimplement RLIMIT_MEMLOCK on top of ucounts</title>
<updated>2021-04-30T19:14:02+00:00</updated>
<author>
<name>Alexey Gladkov</name>
<email>legion@kernel.org</email>
</author>
<published>2021-04-22T12:27:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d7c9e99aee48e6bc0b427f3e3c658a6aba15001e'/>
<id>urn:sha1:d7c9e99aee48e6bc0b427f3e3c658a6aba15001e</id>
<content type='text'>
The rlimit counter is tied to uid in the user_namespace. This allows
rlimit values to be specified in userns even if they are already
globally exceeded by the user. However, the value of the previous
user_namespaces cannot be exceeded.

Changelog

v11:
* Fix issue found by lkp robot.

v8:
* Fix issues found by lkp-tests project.

v7:
* Keep only ucounts for RLIMIT_MEMLOCK checks instead of struct cred.

v6:
* Fix bug in hugetlb_file_setup() detected by trinity.

Reported-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Link: https://lkml.kernel.org/r/970d50c70c71bfd4496e0e8d2a0a32feebebb350.1619094428.git.legion@kernel.org
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>Reimplement RLIMIT_SIGPENDING on top of ucounts</title>
<updated>2021-04-30T19:14:02+00:00</updated>
<author>
<name>Alexey Gladkov</name>
<email>legion@kernel.org</email>
</author>
<published>2021-04-22T12:27:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d64696905554e919321e31afc210606653b8f6a4'/>
<id>urn:sha1:d64696905554e919321e31afc210606653b8f6a4</id>
<content type='text'>
The rlimit counter is tied to uid in the user_namespace. This allows
rlimit values to be specified in userns even if they are already
globally exceeded by the user. However, the value of the previous
user_namespaces cannot be exceeded.

Changelog

v11:
* Revert most of changes to fix performance issues.

v10:
* Fix memory leak on get_ucounts failure.

Signed-off-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Link: https://lkml.kernel.org/r/df9d7764dddd50f28616b7840de74ec0f81711a8.1619094428.git.legion@kernel.org
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>Reimplement RLIMIT_MSGQUEUE on top of ucounts</title>
<updated>2021-04-30T19:14:01+00:00</updated>
<author>
<name>Alexey Gladkov</name>
<email>legion@kernel.org</email>
</author>
<published>2021-04-22T12:27:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6e52a9f0532f912af37bab4caf18b57d1b9845f4'/>
<id>urn:sha1:6e52a9f0532f912af37bab4caf18b57d1b9845f4</id>
<content type='text'>
The rlimit counter is tied to uid in the user_namespace. This allows
rlimit values to be specified in userns even if they are already
globally exceeded by the user. However, the value of the previous
user_namespaces cannot be exceeded.

Signed-off-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Link: https://lkml.kernel.org/r/2531f42f7884bbfee56a978040b3e0d25cdf6cde.1619094428.git.legion@kernel.org
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>Reimplement RLIMIT_NPROC on top of ucounts</title>
<updated>2021-04-30T19:14:01+00:00</updated>
<author>
<name>Alexey Gladkov</name>
<email>legion@kernel.org</email>
</author>
<published>2021-04-22T12:27:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=21d1c5e386bc751f1953b371d72cd5b7d9c9e270'/>
<id>urn:sha1:21d1c5e386bc751f1953b371d72cd5b7d9c9e270</id>
<content type='text'>
The rlimit counter is tied to uid in the user_namespace. This allows
rlimit values to be specified in userns even if they are already
globally exceeded by the user. However, the value of the previous
user_namespaces cannot be exceeded.

To illustrate the impact of rlimits, let's say there is a program that
does not fork. Some service-A wants to run this program as user X in
multiple containers. Since the program never fork the service wants to
set RLIMIT_NPROC=1.

service-A
 \- program (uid=1000, container1, rlimit_nproc=1)
 \- program (uid=1000, container2, rlimit_nproc=1)

The service-A sets RLIMIT_NPROC=1 and runs the program in container1.
When the service-A tries to run a program with RLIMIT_NPROC=1 in
container2 it fails since user X already has one running process.

We cannot use existing inc_ucounts / dec_ucounts because they do not
allow us to exceed the maximum for the counter. Some rlimits can be
overlimited by root or if the user has the appropriate capability.

Changelog

v11:
* Change inc_rlimit_ucounts() which now returns top value of ucounts.
* Drop inc_rlimit_ucounts_and_test() because the return code of
  inc_rlimit_ucounts() can be checked.

Signed-off-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Link: https://lkml.kernel.org/r/c5286a8aa16d2d698c222f7532f3d735c82bc6bc.1619094428.git.legion@kernel.org
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>Use atomic_t for ucounts reference counting</title>
<updated>2021-04-30T19:14:01+00:00</updated>
<author>
<name>Alexey Gladkov</name>
<email>legion@kernel.org</email>
</author>
<published>2021-04-22T12:27:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b6c336528926ef73b0f70260f2636de2c3b94c14'/>
<id>urn:sha1:b6c336528926ef73b0f70260f2636de2c3b94c14</id>
<content type='text'>
The current implementation of the ucounts reference counter requires the
use of spin_lock. We're going to use get_ucounts() in more performance
critical areas like a handling of RLIMIT_SIGPENDING.

Now we need to use spin_lock only if we want to change the hashtable.

v10:
* Always try to put ucounts in case we cannot increase ucounts-&gt;count.
  This will allow to cover the case when all consumers will return
  ucounts at once.

v9:
* Use a negative value to check that the ucounts-&gt;count is close to
  overflow.

Signed-off-by: Alexey Gladkov &lt;legion@kernel.org&gt;
Link: https://lkml.kernel.org/r/94d1dbecab060a6b116b0a2d1accd8ca1bbb4f5f.1619094428.git.legion@kernel.org
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
</feed>
