<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/exec.c, branch linux-6.9.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-6.9.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-6.9.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-05-30T07:44:56+00:00</updated>
<entry>
<title>mm/ksm: fix ksm exec support for prctl</title>
<updated>2024-05-30T07:44:56+00:00</updated>
<author>
<name>Jinjiang Tu</name>
<email>tujinjiang@huawei.com</email>
</author>
<published>2024-03-28T11:10:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a08d94502b25ecd181338909a56322cf58197c70'/>
<id>urn:sha1:a08d94502b25ecd181338909a56322cf58197c70</id>
<content type='text'>
[ Upstream commit 3a9e567ca45fb5280065283d10d9a11f0db61d2b ]

Patch series "mm/ksm: fix ksm exec support for prctl", v4.

commit 3c6f33b7273a ("mm/ksm: support fork/exec for prctl") inherits
MMF_VM_MERGE_ANY flag when a task calls execve().  However, it doesn't
create the mm_slot, so ksmd will not try to scan this task.  The first
patch fixes the issue.

The second patch refactors to prepare for the third patch.  The third
patch extends the selftests of ksm to verfity the deduplication really
happens after fork/exec inherits ths KSM setting.

This patch (of 3):

commit 3c6f33b7273a ("mm/ksm: support fork/exec for prctl") inherits
MMF_VM_MERGE_ANY flag when a task calls execve().  Howerver, it doesn't
create the mm_slot, so ksmd will not try to scan this task.

To fix it, allocate and add the mm_slot to ksm_mm_head in __bprm_mm_init()
when the mm has MMF_VM_MERGE_ANY flag.

Link: https://lkml.kernel.org/r/20240328111010.1502191-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20240328111010.1502191-2-tujinjiang@huawei.com
Fixes: 3c6f33b7273a ("mm/ksm: support fork/exec for prctl")
Signed-off-by: Jinjiang Tu &lt;tujinjiang@huawei.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Kefeng Wang &lt;wangkefeng.wang@huawei.com&gt;
Cc: Nanyong Sun &lt;sunnanyong@huawei.com&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Stefan Roesch &lt;shr@devkernel.io&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'execve-v6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux</title>
<updated>2024-03-27T16:57:30+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-03-27T16:57:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f4a432914af728be2c149934295f337351aa774c'/>
<id>urn:sha1:f4a432914af728be2c149934295f337351aa774c</id>
<content type='text'>
Pull execve fixes from Kees Cook:

 - Fix selftests to conform to the TAP output format (Muhammad Usama
   Anjum)

 - Fix NOMMU linux_binprm::exec pointer in auxv (Max Filippov)

 - Replace deprecated strncpy usage (Justin Stitt)

 - Replace another /bin/sh instance in selftests

* tag 'execve-v6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  binfmt: replace deprecated strncpy
  exec: Fix NOMMU linux_binprm::exec in transfer_args_to_stack()
  selftests/exec: Convert remaining /bin/sh to /bin/bash
  selftests/exec: execveat: Improve debug reporting
  selftests/exec: recursion-depth: conform test to TAP format output
  selftests/exec: load_address: conform test to TAP format output
  selftests/exec: binfmt_script: Add the overall result line according to TAP
</content>
</entry>
<entry>
<title>exec: Fix NOMMU linux_binprm::exec in transfer_args_to_stack()</title>
<updated>2024-03-21T17:05:47+00:00</updated>
<author>
<name>Max Filippov</name>
<email>jcmvbkbc@gmail.com</email>
</author>
<published>2024-03-20T18:26:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2aea94ac14d1e0a8ae9e34febebe208213ba72f7'/>
<id>urn:sha1:2aea94ac14d1e0a8ae9e34febebe208213ba72f7</id>
<content type='text'>
In NOMMU kernel the value of linux_binprm::p is the offset inside the
temporary program arguments array maintained in separate pages in the
linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
thus must be adjusted when that array is copied to the user stack.
Without that adjustment the value passed by the NOMMU kernel to the ELF
program in the AT_EXECFN entry of the aux array doesn't make any sense
and it may break programs that try to access memory pointed to by that
entry.

Adjust linux_binprm::exec before the successful return from the
transfer_args_to_stack().

Cc: &lt;stable@vger.kernel.org&gt;
Fixes: b6a2fea39318 ("mm: variable length argument support")
Fixes: 5edc2a5123a7 ("binfmt_elf_fdpic: wire up AT_EXECFD, AT_EXECFN, AT_SECURE")
Signed-off-by: Max Filippov &lt;jcmvbkbc@gmail.com&gt;
Link: https://lore.kernel.org/r/20240320182607.1472887-1-jcmvbkbc@gmail.com
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'execve-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux</title>
<updated>2024-03-12T21:45:12+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-03-12T21:45:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b32273ee89a866b01b316b9a8de407efde01090c'/>
<id>urn:sha1:b32273ee89a866b01b316b9a8de407efde01090c</id>
<content type='text'>
Pull execve updates from Kees Cook:

 - Drop needless error path code in remove_arg_zero() (Li kunyu, Kees
   Cook)

 - binfmt_elf_efpic: Don't use missing interpreter's properties (Max
   Filippov)

 - Use /bin/bash for execveat selftests

* tag 'execve-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  exec: Simplify remove_arg_zero() error path
  selftests/exec: Perform script checks with /bin/bash
  exec: Delete unnecessary statements in remove_arg_zero()
  fs: binfmt_elf_efpic: don't use missing interpreter's properties
</content>
</entry>
<entry>
<title>Merge tag 'vfs-6.9.pidfd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs</title>
<updated>2024-03-11T17:21:06+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-03-11T17:21:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b5683a37c881e2e08065f1670086e281430ee19f'/>
<id>urn:sha1:b5683a37c881e2e08065f1670086e281430ee19f</id>
<content type='text'>
Pull pdfd updates from Christian Brauner:

 - Until now pidfds could only be created for thread-group leaders but
   not for threads. There was no technical reason for this. We simply
   had no users that needed support for this. Now we do have users that
   need support for this.

   This introduces a new PIDFD_THREAD flag for pidfd_open(). If that
   flag is set pidfd_open() creates a pidfd that refers to a specific
   thread.

   In addition, we now allow clone() and clone3() to be called with
   CLONE_PIDFD | CLONE_THREAD which wasn't possible before.

   A pidfd that refers to an individual thread differs from a pidfd that
   refers to a thread-group leader:

    (1) Pidfds are pollable. A task may poll a pidfd and get notified
        when the task has exited.

        For thread-group leader pidfds the polling task is woken if the
        thread-group is empty. In other words, if the thread-group
        leader task exits when there are still threads alive in its
        thread-group the polling task will not be woken when the
        thread-group leader exits but rather when the last thread in the
        thread-group exits.

        For thread-specific pidfds the polling task is woken if the
        thread exits.

    (2) Passing a thread-group leader pidfd to pidfd_send_signal() will
        generate thread-group directed signals like kill(2) does.

        Passing a thread-specific pidfd to pidfd_send_signal() will
        generate thread-specific signals like tgkill(2) does.

        The default scope of the signal is thus determined by the type
        of the pidfd.

        Since use-cases exist where the default scope of the provided
        pidfd needs to be overriden the following flags are added to
        pidfd_send_signal():

         - PIDFD_SIGNAL_THREAD
           Send a thread-specific signal.

         - PIDFD_SIGNAL_THREAD_GROUP
           Send a thread-group directed signal.

         - PIDFD_SIGNAL_PROCESS_GROUP
           Send a process-group directed signal.

        The scope change will only work if the struct pid is actually
        used for this scope.

        For example, in order to send a thread-group directed signal the
        provided pidfd must be used as a thread-group leader and
        similarly for PIDFD_SIGNAL_PROCESS_GROUP the struct pid must be
        used as a process group leader.

 - Move pidfds from the anonymous inode infrastructure to a tiny pseudo
   filesystem. This will unblock further work that we weren't able to do
   simply because of the very justified limitations of anonymous inodes.
   Moving pidfds to a tiny pseudo filesystem allows for statx on pidfds
   to become useful for the first time. They can now be compared by
   inode number which are unique for the system lifetime.

   Instead of stashing struct pid in file-&gt;private_data we can now stash
   it in inode-&gt;i_private. This makes it possible to introduce concepts
   that operate on a process once all file descriptors have been closed.
   A concrete example is kill-on-last-close. Another side-effect is that
   file-&gt;private_data is now freed up for per-file options for pidfds.

   Now, each struct pid will refer to a different inode but the same
   struct pid will refer to the same inode if it's opened multiple
   times. In contrast to now where each struct pid refers to the same
   inode.

   The tiny pseudo filesystem is not visible anywhere in userspace
   exactly like e.g., pipefs and sockfs. There's no lookup, there's no
   complex inode operations, nothing. Dentries and inodes are always
   deleted when the last pidfd is closed.

   We allocate a new inode and dentry for each struct pid and we reuse
   that inode and dentry for all pidfds that refer to the same struct
   pid. The code is entirely optional and fairly small. If it's not
   selected we fallback to anonymous inodes. Heavily inspired by nsfs.

   The dentry and inode allocation mechanism is moved into generic
   infrastructure that is now shared between nsfs and pidfs. The
   path_from_stashed() helper must be provided with a stashing location,
   an inode number, a mount, and the private data that is supposed to be
   used and it will provide a path that can be passed to dentry_open().

   The helper will try retrieve an existing dentry from the provided
   stashing location. If a valid dentry is found it is reused. If not a
   new one is allocated and we try to stash it in the provided location.
   If this fails we retry until we either find an existing dentry or the
   newly allocated dentry could be stashed. Subsequent openers of the
   same namespace or task are then able to reuse it.

 - Currently it is only possible to get notified when a task has exited,
   i.e., become a zombie and userspace gets notified with EPOLLIN. We
   now also support waiting until the task has been reaped, notifying
   userspace with EPOLLHUP.

 - Ensure that ESRCH is reported for getfd if a task is exiting instead
   of the confusing EBADF.

 - Various smaller cleanups to pidfd functions.

* tag 'vfs-6.9.pidfd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits)
  libfs: improve path_from_stashed()
  libfs: add stashed_dentry_prune()
  libfs: improve path_from_stashed() helper
  pidfs: convert to path_from_stashed() helper
  nsfs: convert to path_from_stashed() helper
  libfs: add path_from_stashed()
  pidfd: add pidfs
  pidfd: move struct pidfd_fops
  pidfd: allow to override signal scope in pidfd_send_signal()
  pidfd: change pidfd_send_signal() to respect PIDFD_THREAD
  signal: fill in si_code in prepare_kill_siginfo()
  selftests: add ESRCH tests for pidfd_getfd()
  pidfd: getfd should always report ESRCH if a task is exiting
  pidfd: clone: allow CLONE_THREAD | CLONE_PIDFD together
  pidfd: exit: kill the no longer used thread_group_exited()
  pidfd: change do_notify_pidfd() to use __wake_up(poll_to_key(EPOLLIN))
  pid: kill the obsolete PIDTYPE_PID code in transfer_pid()
  pidfd: kill the no longer needed do_notify_pidfd() in de_thread()
  pidfd_poll: report POLLHUP when pid_task() == NULL
  pidfd: implement PIDFD_THREAD flag for pidfd_open()
  ...
</content>
</entry>
<entry>
<title>exec: Simplify remove_arg_zero() error path</title>
<updated>2024-03-09T21:46:30+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2024-03-09T21:46:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=725d50261285ccf02501f2a1a6d10b31ce014597'/>
<id>urn:sha1:725d50261285ccf02501f2a1a6d10b31ce014597</id>
<content type='text'>
We don't need the "out" label any more, so remove "ret" and return
directly on error.

Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
---
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: linux-mm@kvack.org
Cc: linux-fsdevel@vger.kernel.org
</content>
</entry>
<entry>
<title>exec: Delete unnecessary statements in remove_arg_zero()</title>
<updated>2024-02-24T01:07:31+00:00</updated>
<author>
<name>Li kunyu</name>
<email>kunyu@nfschina.com</email>
</author>
<published>2024-02-20T05:24:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d3f0d7bbaefd3465e95da2500640732a5cc8bedf'/>
<id>urn:sha1:d3f0d7bbaefd3465e95da2500640732a5cc8bedf</id>
<content type='text'>
'ret=0; ' In actual operation, the ret was not modified, so this
sentence can be removed.

Signed-off-by: Li kunyu &lt;kunyu@nfschina.com&gt;
Link: https://lore.kernel.org/r/20240220052426.62018-1-kunyu@nfschina.com
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>pidfd: kill the no longer needed do_notify_pidfd() in de_thread()</title>
<updated>2024-02-02T13:57:53+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2024-02-02T13:12:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=90f92b68c9869913753f8bc1d87b7762a5f36873'/>
<id>urn:sha1:90f92b68c9869913753f8bc1d87b7762a5f36873</id>
<content type='text'>
Now that __change_pid() does wake_up_all(&amp;pid-&gt;wait_pidfd) we can kill
do_notify_pidfd(leader) in de_thread(), it calls release_task(leader)
right after that and this implies detach_pid(leader, PIDTYPE_PID).

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Link: https://lore.kernel.org/r/20240202131248.GA26022@redhat.com
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>pidfd: implement PIDFD_THREAD flag for pidfd_open()</title>
<updated>2024-02-02T12:12:28+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2024-01-31T13:26:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=64bef697d33b75fc06c5789b3f8108680271529f'/>
<id>urn:sha1:64bef697d33b75fc06c5789b3f8108680271529f</id>
<content type='text'>
With this flag:

	- pidfd_open() doesn't require that the target task must be
	  a thread-group leader

	- pidfd_poll() succeeds when the task exits and becomes a
	  zombie (iow, passes exit_notify()), even if it is a leader
	  and thread-group is not empty.

	  This means that the behaviour of pidfd_poll(PIDFD_THREAD,
	  pid-of-group-leader) is not well defined if it races with
	  exec() from its sub-thread; pidfd_poll() can succeed or not
	  depending on whether pidfd_task_exited() is called before
	  or after exchange_tids().

	  Perhaps we can improve this behaviour later, pidfd_poll()
	  can probably take sig-&gt;group_exec_task into account. But
	  this doesn't really differ from the case when the leader
	  exits before other threads (so pidfd_poll() succeeds) and
	  then another thread execs and pidfd_poll() will block again.

thread_group_exited() is no longer used, perhaps it can die.

Co-developed-by: Tycho Andersen &lt;tycho@tycho.pizza&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Link: https://lore.kernel.org/r/20240131132602.GA23641@redhat.com
Tested-by: Tycho Andersen &lt;tandersen@netflix.com&gt;
Reviewed-by: Tycho Andersen &lt;tandersen@netflix.com&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'execve-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux</title>
<updated>2024-01-24T21:32:29+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-01-24T21:32:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cf10015a24f36a82370151a88cb8610c8779e927'/>
<id>urn:sha1:cf10015a24f36a82370151a88cb8610c8779e927</id>
<content type='text'>
Pull execve fixes from Kees Cook:

 - Fix error handling in begin_new_exec() (Bernd Edlinger)

 - MAINTAINERS: specifically mention ELF (Alexey Dobriyan)

 - Various cleanups related to earlier open() (Askar Safin, Kees Cook)

* tag 'execve-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  exec: Distinguish in_execve from in_exec
  exec: Fix error handling in begin_new_exec()
  exec: Add do_close_execat() helper
  exec: remove useless comment
  ELF, MAINTAINERS: specifically mention ELF
</content>
</entry>
</feed>
