<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/pid.h, branch v6.6.132</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2023-06-30T16:04:01+00:00</updated>
<entry>
<title>pid: Replace struct pid 1-element array with flex-array</title>
<updated>2023-06-30T16:04:01+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2023-06-30T07:46:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b69f0aeb068980af983d399deafc7477cec8bc04'/>
<id>urn:sha1:b69f0aeb068980af983d399deafc7477cec8bc04</id>
<content type='text'>
For pid namespaces, struct pid uses a dynamically sized array member,
"numbers".  This was implemented using the ancient 1-element fake
flexible array, which has been deprecated for decades.

Replace it with a C99 flexible array, refactor the array size
calculations to use struct_size(), and address elements via indexes.
Note that the static initializer (which defines a single element) works
as-is, and requires no special handling.

Without this, CONFIG_UBSAN_BOUNDS (and potentially
CONFIG_FORTIFY_SOURCE) will trigger bounds checks:

  https://lore.kernel.org/lkml/20230517-bushaltestelle-super-e223978c1ba6@brauner

Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jeff Xu &lt;jeffxu@google.com&gt;
Cc: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
Cc: Daniel Verkamp &lt;dverkamp@chromium.org&gt;
Cc: "Paul E. McKenney" &lt;paulmck@kernel.org&gt;
Cc: Jeff Xu &lt;jeffxu@google.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
Cc: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Reported-by: syzbot+ac3b41786a2d0565b6d5@syzkaller.appspotmail.com
[brauner: dropped unrelated changes and remove 0 with NULL cast]
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>pid: add pidfd_prepare()</title>
<updated>2023-04-03T09:16:56+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2023-03-27T18:22:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6ae930d9dbf2d093157be33428538c91966d8a9f'/>
<id>urn:sha1:6ae930d9dbf2d093157be33428538c91966d8a9f</id>
<content type='text'>
Add a new helper that allows to reserve a pidfd and allocates a new
pidfd file that stashes the provided struct pid. This will allow us to
remove places that either open code this function or that call
pidfd_create() but then have to call close_fd() because there are still
failure points after pidfd_create() has been called.

Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Message-Id: &lt;20230327-pidfd-file-api-v1-1-5c0e9a3158e4@kernel.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>pid: add pidfd_get_task() helper</title>
<updated>2021-10-14T11:29:18+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>christian.brauner@ubuntu.com</email>
</author>
<published>2021-10-11T13:32:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e9bdcdbf6936dd1fbf419e00222dc9038b7812c2'/>
<id>urn:sha1:e9bdcdbf6936dd1fbf419e00222dc9038b7812c2</id>
<content type='text'>
The number of system calls making use of pidfds is constantly
increasing. Some of those new system calls duplicate the code to turn a
pidfd into task_struct it refers to. Give them a simple helper for this.

Link: https://lore.kernel.org/r/20211004125050.1153693-2-christian.brauner@ubuntu.com
Link: https://lore.kernel.org/r/20211011133245.1703103-2-brauner@kernel.org
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Matthew Bobrowski &lt;repnop@google.com&gt;
Cc: Alexander Duyck &lt;alexander.h.duyck@linux.intel.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Reviewed-by: Matthew Bobrowski &lt;repnop@google.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
</content>
</entry>
<entry>
<title>kernel/pid.c: remove static qualifier from pidfd_create()</title>
<updated>2021-08-10T10:53:04+00:00</updated>
<author>
<name>Matthew Bobrowski</name>
<email>repnop@google.com</email>
</author>
<published>2021-08-08T05:24:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c576e0fcd6188d0edb50b0fb83f853433ef4819b'/>
<id>urn:sha1:c576e0fcd6188d0edb50b0fb83f853433ef4819b</id>
<content type='text'>
With the idea of returning pidfds from the fanotify API, we need to
expose a mechanism for creating pidfds. We drop the static qualifier
from pidfd_create() and add its declaration to linux/pid.h so that the
pidfd_create() helper can be called from other kernel subsystems
i.e. fanotify.

Link: https://lore.kernel.org/r/0c68653ec32f1b7143301f0231f7ed14062fd82b.1628398044.git.repnop@google.com
Signed-off-by: Matthew Bobrowski &lt;repnop@google.com&gt;
Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>pid: move pidfd_get_pid() to pid.c</title>
<updated>2020-10-18T16:27:10+00:00</updated>
<author>
<name>Minchan Kim</name>
<email>minchan@kernel.org</email>
</author>
<published>2020-10-17T23:14:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1aa92cd31c1c032ddfed27e79d646bbb429e9b52'/>
<id>urn:sha1:1aa92cd31c1c032ddfed27e79d646bbb429e9b52</id>
<content type='text'>
process_madvise syscall needs pidfd_get_pid function to translate pidfd to
pid so this patch move the function to kernel/pid.c.

Suggested-by: Alexander Duyck &lt;alexander.h.duyck@linux.intel.com&gt;
Signed-off-by: Minchan Kim &lt;minchan@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Reviewed-by: Alexander Duyck &lt;alexander.h.duyck@linux.intel.com&gt;
Reviewed-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Brian Geffon &lt;bgeffon@google.com&gt;
Cc: Daniel Colascione &lt;dancol@google.com&gt;
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: John Dias &lt;joaodias@google.com&gt;
Cc: Kirill Tkhai &lt;ktkhai@virtuozzo.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Oleksandr Natalenko &lt;oleksandr@redhat.com&gt;
Cc: Sandeep Patil &lt;sspatil@google.com&gt;
Cc: SeongJae Park &lt;sj38.park@gmail.com&gt;
Cc: SeongJae Park &lt;sjpark@amazon.de&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Sonny Rao &lt;sonnyrao@google.com&gt;
Cc: Tim Murray &lt;timmurray@google.com&gt;
Cc: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Cc: Florian Weimer &lt;fw@deneb.enyo.de&gt;
Cc: &lt;linux-man@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/20200302193630.68771-5-minchan@kernel.org
Link: http://lkml.kernel.org/r/20200622192900.22757-3-minchan@kernel.org
Link: https://lkml.kernel.org/r/20200901000633.1920247-3-minchan@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace</title>
<updated>2020-06-04T20:54:34+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-06-04T20:54:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9ff7258575d5fee011649d20cc56de720a395191'/>
<id>urn:sha1:9ff7258575d5fee011649d20cc56de720a395191</id>
<content type='text'>
Pull proc updates from Eric Biederman:
 "This has four sets of changes:

   - modernize proc to support multiple private instances

   - ensure we see the exit of each process tid exactly

   - remove has_group_leader_pid

   - use pids not tasks in posix-cpu-timers lookup

  Alexey updated proc so each mount of proc uses a new superblock. This
  allows people to actually use mount options with proc with no fear of
  messing up another mount of proc. Given the kernel's internal mounts
  of proc for things like uml this was a real problem, and resulted in
  Android's hidepid mount options being ignored and introducing security
  issues.

  The rest of the changes are small cleanups and fixes that came out of
  my work to allow this change to proc. In essence it is swapping the
  pids in de_thread during exec which removes a special case the code
  had to handle. Then updating the code to stop handling that special
  case"

* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  proc: proc_pid_ns takes super_block as an argument
  remove the no longer needed pid_alive() check in __task_pid_nr_ns()
  posix-cpu-timers: Replace __get_task_for_clock with pid_for_clock
  posix-cpu-timers: Replace cpu_timer_pid_type with clock_pid_type
  posix-cpu-timers: Extend rcu_read_lock removing task_struct references
  signal: Remove has_group_leader_pid
  exec: Remove BUG_ON(has_group_leader_pid)
  posix-cpu-timer:  Unify the now redundant code in lookup_task
  posix-cpu-timer: Tidy up group_leader logic in lookup_task
  proc: Ensure we see the exit of each process tid exactly once
  rculist: Add hlists_swap_heads_rcu
  proc: Use PIDTYPE_TGID in next_tgid
  Use proc_pid_ns() to get pid_namespace from the proc superblock
  proc: use named enums for better readability
  proc: use human-readable values for hidepid
  docs: proc: add documentation for "hidepid=4" and "subset=pid" options and new mount behavior
  proc: add option to mount only a pids subset
  proc: instantiate only pids that we can ptrace on 'hidepid=4' mount option
  proc: allow to mount many instances of proc in one pid namespace
  proc: rename struct proc_fs_info to proc_fs_opts
</content>
</entry>
<entry>
<title>proc: Ensure we see the exit of each process tid exactly once</title>
<updated>2020-04-28T21:13:58+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2020-04-19T11:35:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6b03d1304a32dc8450c7516000a0fe07bef7c446'/>
<id>urn:sha1:6b03d1304a32dc8450c7516000a0fe07bef7c446</id>
<content type='text'>
When the thread group leader changes during exec and the old leaders
thread is reaped proc_flush_pid will flush the dentries for the entire
process because the leader still has it's original pid.

Fix this by exchanging the pids in an rcu safe manner,
and wrapping the code to do that up in a helper exchange_tids.

When I removed switch_exec_pids and introduced this behavior
in d73d65293e3e ("[PATCH] pidhash: kill switch_exec_pids") there
really was nothing that cared as flushing happened with
the cached dentry and de_thread flushed both of them on exec.

This lack of fully exchanging pids became a problem a few months later
when I introduced 48e6484d4902 ("[PATCH] proc: Rewrite the proc dentry
flush on exit optimization").  Which overlooked the de_thread case
was no longer swapping pids, and I was looking up proc dentries
by task-&gt;pid.

The current behavior isn't properly a bug as everything in proc will
continue to work correctly just a little bit less efficiently.  Fix
this just so there are no little surprise corner cases waiting to bite
people.

-- Oleg points out this could be an issue in next_tgid in proc where
   has_group_leader_pid is called, and reording some of the assignments
   should fix that.

-- Oleg points out this will break the 10 year old hack in __exit_signal.c
&gt;	/*
&gt;	 * This can only happen if the caller is de_thread().
&gt;	 * FIXME: this is the temporary hack, we should teach
&gt;	 * posix-cpu-timers to handle this case correctly.
&gt;	 */
&gt;	if (unlikely(has_group_leader_pid(tsk)))
&gt;		posix_cpu_timers_exit_group(tsk);

The code in next_tgid has been changed to use PIDTYPE_TGID,
and the posix cpu timers code has been fixed so it does not
need the 10 year old hack, so this should be safe to merge
now.

Link: https://lore.kernel.org/lkml/87h7x3ajll.fsf_-_@x220.int.ebiederm.org/
Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Fixes: 48e6484d4902 ("[PATCH] proc: Rewrite the proc dentry flush on exit optimization").
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>sysctl: remove all extern declaration from sysctl.c</title>
<updated>2020-04-27T06:06:53+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-04-24T06:43:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2374c09b1c8a883bb9b4b2fc3756703eeb618f4a'/>
<id>urn:sha1:2374c09b1c8a883bb9b4b2fc3756703eeb618f4a</id>
<content type='text'>
Extern declarations in .c files are a bad style and can lead to
mismatches.  Use existing definitions in headers where they exist,
and otherwise move the external declarations to suitable header
files.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>proc: Use a dedicated lock in struct pid</title>
<updated>2020-04-09T17:15:35+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2020-04-07T14:43:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=63f818f46af9f8b3f17b9695501e8d08959feb60'/>
<id>urn:sha1:63f818f46af9f8b3f17b9695501e8d08959feb60</id>
<content type='text'>
syzbot wrote:
&gt; ========================================================
&gt; WARNING: possible irq lock inversion dependency detected
&gt; 5.6.0-syzkaller #0 Not tainted
&gt; --------------------------------------------------------
&gt; swapper/1/0 just changed the state of lock:
&gt; ffffffff898090d8 (tasklist_lock){.+.?}-{2:2}, at: send_sigurg+0x9f/0x320 fs/fcntl.c:840
&gt; but this lock took another, SOFTIRQ-unsafe lock in the past:
&gt;  (&amp;pid-&gt;wait_pidfd){+.+.}-{2:2}
&gt;
&gt;
&gt; and interrupts could create inverse lock ordering between them.
&gt;
&gt;
&gt; other info that might help us debug this:
&gt;  Possible interrupt unsafe locking scenario:
&gt;
&gt;        CPU0                    CPU1
&gt;        ----                    ----
&gt;   lock(&amp;pid-&gt;wait_pidfd);
&gt;                                local_irq_disable();
&gt;                                lock(tasklist_lock);
&gt;                                lock(&amp;pid-&gt;wait_pidfd);
&gt;   &lt;Interrupt&gt;
&gt;     lock(tasklist_lock);
&gt;
&gt;  *** DEADLOCK ***
&gt;
&gt; 4 locks held by swapper/1/0:

The problem is that because wait_pidfd.lock is taken under the tasklist
lock.  It must always be taken with irqs disabled as tasklist_lock can be
taken from interrupt context and if wait_pidfd.lock was already taken this
would create a lock order inversion.

Oleg suggested just disabling irqs where I have added extra calls to
wait_pidfd.lock.  That should be safe and I think the code will eventually
do that.  It was rightly pointed out by Christian that sharing the
wait_pidfd.lock was a premature optimization.

It is also true that my pre-merge window testing was insufficient.  So
remove the premature optimization and give struct pid a dedicated lock of
it's own for struct pid things.  I have verified that lockdep sees all 3
paths where we take the new pid-&gt;lock and lockdep does not complain.

It is my current day dream that one day pid-&gt;lock can be used to guard the
task lists as well and then the tasklist_lock won't need to be held to
deliver signals.  That will require taking pid-&gt;lock with irqs disabled.

Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Link: https://lore.kernel.org/lkml/00000000000011d66805a25cd73f@google.com/
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Reported-by: syzbot+343f75cdeea091340956@syzkaller.appspotmail.com
Reported-by: syzbot+832aabf700bc3ec920b9@syzkaller.appspotmail.com
Reported-by: syzbot+f675f964019f884dbd0f@syzkaller.appspotmail.com
Reported-by: syzbot+a9fb1457d720a55d6dc5@syzkaller.appspotmail.com
Fixes: 7bc3e6e55acf ("proc: Use a list of inodes to flush from proc")
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>proc: Use a list of inodes to flush from proc</title>
<updated>2020-02-24T16:14:44+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2020-02-20T00:22:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7bc3e6e55acf065500a24621f3b313e7e5998acf'/>
<id>urn:sha1:7bc3e6e55acf065500a24621f3b313e7e5998acf</id>
<content type='text'>
Rework the flushing of proc to use a list of directory inodes that
need to be flushed.

The list is kept on struct pid not on struct task_struct, as there is
a fixed connection between proc inodes and pids but at least for the
case of de_thread the pid of a task_struct changes.

This removes the dependency on proc_mnt which allows for different
mounts of proc having different mount options even in the same pid
namespace and this allows for the removal of proc_mnt which will
trivially the first mount of proc to honor it's mount options.

This flushing remains an optimization.  The functions
pid_delete_dentry and pid_revalidate ensure that ordinary dcache
management will not attempt to use dentries past the point their
respective task has died.  When unused the shrinker will
eventually be able to remove these dentries.

There is a case in de_thread where proc_flush_pid can be
called early for a given pid.  Which winds up being
safe (if suboptimal) as this is just an optiimization.

Only pid directories are put on the list as the other
per pid files are children of those directories and
d_invalidate on the directory will get them as well.

So that the pid can be used during flushing it's reference count is
taken in release_task and dropped in proc_flush_pid.  Further the call
of proc_flush_pid is moved after the tasklist_lock is released in
release_task so that it is certain that the pid has already been
unhashed when flushing it taking place.  This removes a small race
where a dentry could recreated.

As struct pid is supposed to be small and I need a per pid lock
I reuse the only lock that currently exists in struct pid the
the wait_pidfd.lock.

The net result is that this adds all of this functionality
with just a little extra list management overhead and
a single extra pointer in struct pid.

v2: Initialize pid-&gt;inodes.  I somehow failed to get that
    initialization into the initial version of the patch.  A boot
    failure was reported by "kernel test robot &lt;lkp@intel.com&gt;", and
    failure to initialize that pid-&gt;inodes matches all of the reported
    symptoms.

Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
</feed>
