| author | Christian Brauner <brauner@kernel.org> | 2025-02-04 12:27:36 +0300 | 
|---|---|---|
| committer | Christian Brauner <brauner@kernel.org> | 2025-02-05 17:14:56 +0300 | 
| commit | b1e809e7f64ad47dd232ff072d8ef59c1fe414c5 (patch) | |
| tree | f39bf585e93aa1a7c6accf07e61b1fe31c2cfb87 /tools/perf/scripts/python/stackcollapse.py | |
| parent | 2014c95afecee3e76ca4a56956a936e23283f05b (diff) | |
| parent | 62d648c4429a5edcfcfae03da15d2268d66e1a78 (diff) | |
| download | linux-b1e809e7f64ad47dd232ff072d8ef59c1fe414c5.tar.xz | |
Merge patch series "introduce PIDFD_SELF* sentinels"
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> says:
If you wish to utilise a pidfd interface to refer to the current process or
thread then there is a lot of ceremony involved, looking something like:
        pid_t pid = getpid();
        int pidfd = pidfd_open(pid, PIDFD_THREAD);
        if (pidfd < 0) {
                ... error handling ...
        }
        if (process_madvise(pidfd, iovec, 8, MADV_GUARD_INSTALL, 0)) {
                ... cleanup pidfd ...
                ... error handling ...
        }
        ...
        ... cleanup pidfd ...
This adds unnecessary overhead and system calls and complicates error
handling. In addition, pidfd_open() is subject to RLIMIT_NOFILE (i.e. the
per-process limit on the number of open file descriptors), so the call may
fail spuriously on this basis.
Rather than doing this we can make use of sentinels for this purpose which can
be passed as the pidfd instead. This looks like:
        if (process_madvise(PIDFD_SELF, iovec, 8, MADV_GUARD_INSTALL, 0)) {
                ... error handling ...
        }
This avoids all of the aforementioned issues. This series introduces such
sentinels.
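Since older kernels do not know the sentinels, userland may want to probe for
them at runtime. The following is a minimal sketch, not part of the series: it
sends signal 0 (a pure existence and permission check) to PIDFD_SELF_PROCESS via
pidfd_send_signal(). The helper name pidfd_self_supported() is hypothetical, and
the fallback values for the sentinel and the syscall number are assumptions that
should be checked against the installed <linux/pidfd.h> and syscall tables.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* ASSUMPTION: fallback values for toolchains whose headers predate this
 * series; verify them against include/uapi/linux/pidfd.h before relying
 * on them. */
#ifndef PIDFD_SELF_THREAD_GROUP
#define PIDFD_SELF_THREAD_GROUP (-10001)
#endif
#ifndef PIDFD_SELF_PROCESS
#define PIDFD_SELF_PROCESS PIDFD_SELF_THREAD_GROUP
#endif
/* ASSUMPTION: syscall number on architectures with unified numbering. */
#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424
#endif

/*
 * Hypothetical helper: returns 1 if the running kernel accepts
 * PIDFD_SELF_PROCESS in pidfd_send_signal(), 0 if the sentinel is rejected
 * as a bad file descriptor (EBADF), and -1 on any other error (for example
 * ENOSYS, or EPERM under a seccomp filter).
 */
static int pidfd_self_supported(void)
{
        /* Signal 0 delivers nothing; it only validates the target. */
        if (syscall(SYS_pidfd_send_signal, PIDFD_SELF_PROCESS, 0, NULL, 0U) == 0)
                return 1;
        return errno == EBADF ? 0 : -1;
}
```

A caller could fall back to the getpid() + pidfd_open() path whenever this
returns 0 or -1.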
It is useful to refer both to the current thread from userland's
perspective, for which we use PIDFD_SELF, and to the current process from
userland's perspective, for which we use PIDFD_SELF_PROCESS.
There is unfortunately some confusion between the kernel and userland as to
what constitutes a process - a thread from the userland perspective is a
process in the kernel, and a userland process is a thread group (more
specifically the thread group leader) from the kernel perspective. We
therefore alias things thusly:
* PIDFD_SELF_THREAD aliased by PIDFD_SELF - uses PIDTYPE_PID.
* PIDFD_SELF_THREAD_GROUP aliased by PIDFD_SELF_PROCESS - uses PIDTYPE_TGID.
In all of the kernel code we refer to PIDFD_SELF_THREAD and
PIDFD_SELF_THREAD_GROUP. However we expect users to use PIDFD_SELF and
PIDFD_SELF_PROCESS.
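The aliasing can be sketched in userland C as follows. This is an
illustration under stated assumptions rather than the kernel's code: the
numeric sentinel values are assumed fallbacks to be verified against
<linux/pidfd.h>, and enum self_kind and classify_pidfd() are hypothetical
names standing in for the kernel's choice between PIDTYPE_PID and
PIDTYPE_TGID.

```c
/* ASSUMPTION: fallback sentinel values; check include/uapi/linux/pidfd.h. */
#ifndef PIDFD_SELF_THREAD
#define PIDFD_SELF_THREAD (-10000)
#endif
#ifndef PIDFD_SELF_THREAD_GROUP
#define PIDFD_SELF_THREAD_GROUP (-10001)
#endif

/* Userland-facing aliases, as described in the cover letter. */
#ifndef PIDFD_SELF
#define PIDFD_SELF PIDFD_SELF_THREAD
#endif
#ifndef PIDFD_SELF_PROCESS
#define PIDFD_SELF_PROCESS PIDFD_SELF_THREAD_GROUP
#endif

/* Hypothetical stand-in for the PIDTYPE_PID / PIDTYPE_TGID distinction. */
enum self_kind {
        SELF_NONE,              /* an ordinary file descriptor number */
        SELF_THREAD,            /* current thread, i.e. PIDTYPE_PID */
        SELF_THREAD_GROUP,      /* current thread group, i.e. PIDTYPE_TGID */
};

/* Hypothetical helper: decide what a pidfd argument refers to. */
static enum self_kind classify_pidfd(int pidfd)
{
        switch (pidfd) {
        case PIDFD_SELF_THREAD:
                return SELF_THREAD;
        case PIDFD_SELF_THREAD_GROUP:
                return SELF_THREAD_GROUP;
        default:
                return SELF_NONE;
        }
}
```

With these definitions, classify_pidfd(PIDFD_SELF) yields SELF_THREAD and
classify_pidfd(PIDFD_SELF_PROCESS) yields SELF_THREAD_GROUP, while any
ordinary descriptor number yields SELF_NONE.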
This matters for cases where, for instance, a user unshare()s FDs or does
thread-specific signal handling and where the user would be hugely confused
if the FDs referenced or signal processed referred to the thread group
leader rather than the individual thread.
Another use-case comes from Android. When installing multiple
MADV_GUARD_INSTALL guard pages they considered caching the result of
pidfd_open() for reuse. That however wouldn't work in shared libraries
where users can fork() in between such calls.
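The fork() hazard can be illustrated with a small sketch. It is not taken
from the Android code: it caches a PID rather than a pidfd so that it runs
on any kernel, on the assumption that a cached pidfd behaves analogously
(the child inherits the descriptor, which still refers to the parent). The
helper name cached_pid_is_stale_after_fork() is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if an identity cached before fork() no
 * longer names the calling process in the child, 0 if it still does, and
 * -1 on error.
 */
static int cached_pid_is_stale_after_fork(void)
{
        pid_t cached = getpid();        /* cached by "library" code */
        pid_t child = fork();           /* the application forks later */
        int status;

        if (child < 0)
                return -1;
        if (child == 0) {
                /* In the child the cached value still names the parent. */
                _exit(cached != getpid() ? 1 : 0);
        }
        if (waitpid(child, &status, 0) < 0 || !WIFEXITED(status))
                return -1;
        return WEXITSTATUS(status);
}
```

A sentinel such as PIDFD_SELF sidesteps this, since it is resolved against
the caller at the time of each system call.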
For now we only adjust pidfd_get_task() and the pidfd_send_signal() system
call with specific handling for this, implementing this functionality for
process_madvise(), process_mrelease() (although using it there wouldn't
really make sense) and pidfd_send_signal().
We defer making further changes, as this would require a significant rework
of the pidfd mechanism.
The motivating case here is to support PIDFD_SELF in process_madvise(), so
this suffices for immediate uses. Moving forward, this can be further
expanded to other uses.
* patches from https://lore.kernel.org/r/cover.1738268370.git.lorenzo.stoakes@oracle.com:
  selftests/mm: use PIDFD_SELF in guard pages test
  selftests/pidfd: add tests for PIDFD_SELF_*
  selftests/pidfd: add new PIDFD_SELF* defines
  pidfd: add PIDFD_SELF* sentinels to refer to own thread/process
Link: https://lore.kernel.org/r/cover.1738268370.git.lorenzo.stoakes@oracle.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Diffstat (limited to 'tools/perf/scripts/python/stackcollapse.py')
0 files changed, 0 insertions, 0 deletions
