<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/proc/fd.c, branch v6.6.132</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-06-16T11:47:33+00:00</updated>
<entry>
<title>proc: Move fdinfo PTRACE_MODE_READ check into the inode .permission operation</title>
<updated>2024-06-16T11:47:33+00:00</updated>
<author>
<name>Tyler Hicks (Microsoft)</name>
<email>code@tyhicks.com</email>
</author>
<published>2024-05-01T00:56:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0c08b92f982731c2a1b808a6b55e0cfb3307305c'/>
<id>urn:sha1:0c08b92f982731c2a1b808a6b55e0cfb3307305c</id>
<content type='text'>
commit 0a960ba49869ebe8ff859d000351504dd6b93b68 upstream.

The following commits loosened the permissions of /proc/&lt;PID&gt;/fdinfo/
directory, as well as the files within it, from 0500 to 0555 while also
introducing a PTRACE_MODE_READ check between the current task and
&lt;PID&gt;'s task:

 - commit 7bc3fa0172a4 ("procfs: allow reading fdinfo with PTRACE_MODE_READ")
 - commit 1927e498aee1 ("procfs: prevent unprivileged processes accessing fdinfo dir")

Before those changes, inode based system calls like inotify_add_watch(2)
would fail when the current task didn't have sufficient read permissions:

 [...]
 lstat("/proc/1/task/1/fdinfo", {st_mode=S_IFDIR|0500, st_size=0, ...}) = 0
 inotify_add_watch(64, "/proc/1/task/1/fdinfo",
		   IN_MODIFY|IN_ATTRIB|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|
		   IN_ONLYDIR|IN_DONT_FOLLOW|IN_EXCL_UNLINK) = -1 EACCES (Permission denied)
 [...]

This matches the documented behavior in the inotify_add_watch(2) man
page:

 ERRORS
       EACCES Read access to the given file is not permitted.

After those changes, inotify_add_watch(2) started succeeding despite the
current task not having PTRACE_MODE_READ privileges on the target task:

 [...]
 lstat("/proc/1/task/1/fdinfo", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
 inotify_add_watch(64, "/proc/1/task/1/fdinfo",
		   IN_MODIFY|IN_ATTRIB|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|
		   IN_ONLYDIR|IN_DONT_FOLLOW|IN_EXCL_UNLINK) = 1757
 openat(AT_FDCWD, "/proc/1/task/1/fdinfo",
	O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = -1 EACCES (Permission denied)
 [...]

This change in behavior broke .NET prior to v7. See the github link
below for the v7 commit that inadvertently/quietly (?) fixed .NET after
the kernel changes mentioned above.

Return to the old behavior by moving the PTRACE_MODE_READ check out of
the file .open operation and into the inode .permission operation:

 [...]
 lstat("/proc/1/task/1/fdinfo", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
 inotify_add_watch(64, "/proc/1/task/1/fdinfo",
		   IN_MODIFY|IN_ATTRIB|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|
		   IN_ONLYDIR|IN_DONT_FOLLOW|IN_EXCL_UNLINK) = -1 EACCES (Permission denied)
 [...]

Reported-by: Kevin Parsons (Microsoft) &lt;parsonskev@gmail.com&gt;
Link: https://github.com/dotnet/runtime/commit/89e5469ac591b82d38510fe7de98346cce74ad4f
Link: https://stackoverflow.com/questions/75379065/start-self-contained-net6-build-exe-as-service-on-raspbian-system-unauthorizeda
Fixes: 7bc3fa0172a4 ("procfs: allow reading fdinfo with PTRACE_MODE_READ")
Cc: stable@vger.kernel.org
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Kalesh Singh &lt;kaleshsingh@google.com&gt;
Cc: Hardik Garg &lt;hargar@linux.microsoft.com&gt;
Cc: Allen Pais &lt;apais@linux.microsoft.com&gt;
Signed-off-by: Tyler Hicks (Microsoft) &lt;code@tyhicks.com&gt;
Link: https://lore.kernel.org/r/20240501005646.745089-1-code@tyhicks.com
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>fs: pass the request_mask to generic_fillattr</title>
<updated>2023-08-09T06:56:36+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@kernel.org</email>
</author>
<published>2023-08-07T19:38:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0d72b92883c651a11059d93335f33d65c6eb653b'/>
<id>urn:sha1:0d72b92883c651a11059d93335f33d65c6eb653b</id>
<content type='text'>
generic_fillattr just fills in the entire stat struct indiscriminately
today, copying data from the inode. There is at least one attribute
(STATX_CHANGE_COOKIE) that can have side effects when it is reported,
and we're looking at adding more with the addition of multigrain
timestamps.

Add a request_mask argument to generic_fillattr and have most callers
just pass in the value that is passed to getattr. Have other callers
(e.g. ksmbd) just pass in STATX_BASIC_STATS. Also move the setting of
STATX_CHANGE_COOKIE into generic_fillattr.

Acked-by: Joseph Qi &lt;joseph.qi@linux.alibaba.com&gt;
Reviewed-by: Xiubo Li &lt;xiubli@redhat.com&gt;
Reviewed-by: "Paulo Alcantara (SUSE)" &lt;pc@manguebit.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Message-Id: &lt;20230807-mgctime-v7-2-d1dec143a704@kernel.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping</title>
<updated>2023-02-20T19:53:11+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-02-20T19:53:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=05e6295f7b5e05f09e369a3eb2882ec5b40fff20'/>
<id>urn:sha1:05e6295f7b5e05f09e369a3eb2882ec5b40fff20</id>
<content type='text'>
Pull vfs idmapping updates from Christian Brauner:

 - Last cycle we introduced the dedicated struct mnt_idmap type for
   mount idmapping and the required infrastucture in 256c8aed2b42 ("fs:
   introduce dedicated idmap type for mounts"). As promised in last
   cycle's pull request message this converts everything to rely on
   struct mnt_idmap.

   Currently we still pass around the plain namespace that was attached
   to a mount. This is in general pretty convenient but it makes it easy
   to conflate namespaces that are relevant on the filesystem with
   namespaces that are relevant on the mount level. Especially for
   non-vfs developers without detailed knowledge in this area this was a
   potential source for bugs.

   This finishes the conversion. Instead of passing the plain namespace
   around this updates all places that currently take a pointer to a
   mnt_userns with a pointer to struct mnt_idmap.

   Now that the conversion is done all helpers down to the really
   low-level helpers only accept a struct mnt_idmap argument instead of
   two namespace arguments.

   Conflating mount and other idmappings will now cause the compiler to
   complain loudly thus eliminating the possibility of any bugs. This
   makes it impossible for filesystem developers to mix up mount and
   filesystem idmappings as they are two distinct types and require
   distinct helpers that cannot be used interchangeably.

   Everything associated with struct mnt_idmap is moved into a single
   separate file. With that change no code can poke around in struct
   mnt_idmap. It can only be interacted with through dedicated helpers.
   That means all filesystems are and all of the vfs is completely
   oblivious to the actual implementation of idmappings.

   We are now also able to extend struct mnt_idmap as we see fit. For
   example, we can decouple it completely from namespaces for users that
   don't require or don't want to use them at all. We can also extend
   the concept of idmappings so we can cover filesystem specific
   requirements.

   In combination with the vfs{g,u}id_t work we finished in v6.2 this
   makes this feature substantially more robust and thus difficult to
   implement wrong by a given filesystem and also protects the vfs.

 - Enable idmapped mounts for tmpfs and fulfill a longstanding request.

   A long-standing request from users had been to make it possible to
   create idmapped mounts for tmpfs. For example, to share the host's
   tmpfs mount between multiple sandboxes. This is a prerequisite for
   some advanced Kubernetes cases. Systemd also has a range of use-cases
   to increase service isolation. And there are more users of this.

   However, with all of the other work going on this was way down on the
   priority list but luckily someone other than ourselves picked this
   up.

   As usual the patch is tiny as all the infrastructure work had been
   done multiple kernel releases ago. In addition to all the tests that
   we already have I requested that Rodrigo add a dedicated tmpfs
   testsuite for idmapped mounts to xfstests. It is to be included into
   xfstests during the v6.3 development cycle. This should add a slew of
   additional tests.

* tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (26 commits)
  shmem: support idmapped mounts for tmpfs
  fs: move mnt_idmap
  fs: port vfs{g,u}id helpers to mnt_idmap
  fs: port fs{g,u}id helpers to mnt_idmap
  fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap
  fs: port i_{g,u}id_{needs_}update() to mnt_idmap
  quota: port to mnt_idmap
  fs: port privilege checking helpers to mnt_idmap
  fs: port inode_owner_or_capable() to mnt_idmap
  fs: port inode_init_owner() to mnt_idmap
  fs: port acl to mnt_idmap
  fs: port xattr to mnt_idmap
  fs: port -&gt;permission() to pass mnt_idmap
  fs: port -&gt;fileattr_set() to pass mnt_idmap
  fs: port -&gt;set_acl() to pass mnt_idmap
  fs: port -&gt;get_acl() to pass mnt_idmap
  fs: port -&gt;tmpfile() to pass mnt_idmap
  fs: port -&gt;rename() to pass mnt_idmap
  fs: port -&gt;mknod() to pass mnt_idmap
  fs: port -&gt;mkdir() to pass mnt_idmap
  ...
</content>
</entry>
<entry>
<title>fs: port -&gt;permission() to pass mnt_idmap</title>
<updated>2023-01-19T08:24:28+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2023-01-13T11:49:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4609e1f18e19c3b302e1eb4858334bca1532f780'/>
<id>urn:sha1:4609e1f18e19c3b302e1eb4858334bca1532f780</id>
<content type='text'>
Convert to struct mnt_idmap.

Last cycle we merged the necessary infrastructure in
256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.

Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.

Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.

Acked-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Christian Brauner (Microsoft) &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>fs: port -&gt;getattr() to pass mnt_idmap</title>
<updated>2023-01-19T08:24:25+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2023-01-13T11:49:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b74d24f7a74ffd2d42ca883d84b7422b8d545901'/>
<id>urn:sha1:b74d24f7a74ffd2d42ca883d84b7422b8d545901</id>
<content type='text'>
Convert to struct mnt_idmap.

Last cycle we merged the necessary infrastructure in
256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.

Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.

Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.

Acked-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Christian Brauner (Microsoft) &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>filelock: move file locking definitions to separate header file</title>
<updated>2023-01-11T11:52:32+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@kernel.org</email>
</author>
<published>2022-11-20T14:15:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5970e15dbcfeb0ed3a0bf1954f35bbe60a048754'/>
<id>urn:sha1:5970e15dbcfeb0ed3a0bf1954f35bbe60a048754</id>
<content type='text'>
The file locking definitions have lived in fs.h since the dawn of time,
but they are only used by a small subset of the source files that
include it.

Move the file locking definitions to a new header file, and add the
appropriate #include directives to the source files that need them. By
doing this we trim down fs.h a bit and limit the amount of rebuilding
that has to be done when we make changes to the file locking APIs.

Reviewed-by: Xiubo Li &lt;xiubli@redhat.com&gt;
Reviewed-by: Christian Brauner (Microsoft) &lt;brauner@kernel.org&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: David Howells &lt;dhowells@redhat.com&gt;
Reviewed-by: Russell King (Oracle) &lt;rmk+kernel@armlinux.org.uk&gt;
Acked-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Acked-by: Joseph Qi &lt;joseph.qi@linux.alibaba.com&gt;
Acked-by: Steve French &lt;stfrench@microsoft.com&gt;
Acked-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Acked-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Signed-off-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</content>
</entry>
<entry>
<title>proc: report open files as size in stat() for /proc/pid/fd</title>
<updated>2022-11-18T21:55:07+00:00</updated>
<author>
<name>Ivan Babrou</name>
<email>ivan@cloudflare.com</email>
</author>
<published>2022-09-22T22:40:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f1f1f2569901ec5b9d425f2e91c09a0e320768f3'/>
<id>urn:sha1:f1f1f2569901ec5b9d425f2e91c09a0e320768f3</id>
<content type='text'>
Many monitoring tools include open file count as a metric.  Currently the
only way to get this number is to enumerate the files in /proc/pid/fd.

The problem with the current approach is that it does many things people
generally don't care about when they need one number for a metric.  In our
tests for cadvisor, which reports open file counts per cgroup, we observed
that reading the number of open files is slow.  Out of 35.23% of CPU time
spent in `proc_readfd_common`, we see 29.43% spent in `proc_fill_cache`,
which is responsible for filling dentry info.  Some of this extra time is
spinlock contention, but it's a contention for the lock we don't want to
take to begin with.

We considered putting the number of open files in /proc/pid/status. 
Unfortunately, counting the number of fds involves iterating the
open_files bitmap, which has a linear complexity in proportion with the
number of open files (bitmap slots really, but it's close).  We don't want
to make /proc/pid/status any slower, so instead we put this info in
/proc/pid/fd as a size member of the stat syscall result.  Previously the
reported number was zero, so there's very little risk of breaking
anything, while still providing a somewhat logical way to count the open
files with a fallback if it's zero.

RFC for this patch included iterating open fds under RCU.  Thanks to Frank
Hofmann for the suggestion to use the bitmap instead.

Previously:

```
$ sudo stat /proc/1/fd | head -n2
  File: /proc/1/fd
  Size: 0         	Blocks: 0          IO Block: 1024   directory
```

With this patch:

```
$ sudo stat /proc/1/fd | head -n2
  File: /proc/1/fd
  Size: 65        	Blocks: 0          IO Block: 1024   directory
```

Correctness check:

```
$ sudo ls /proc/1/fd | wc -l
65
```

I added the docs for /proc/&lt;pid&gt;/fd while I'm at it.

[ivan@cloudflare.com: use bitmap_weight() to count the bits]
  Link: https://lkml.kernel.org/r/20221018045844.37697-1-ivan@cloudflare.com
[akpm@linux-foundation.org: include linux/bitmap.h for bitmap_weight()]
[ivan@cloudflare.com: return errno from proc_fd_getattr() instead of setting negative size]
  Link: https://lkml.kernel.org/r/20221024173140.30673-1-ivan@cloudflare.com
Link: https://lkml.kernel.org/r/20220922224027.59266-1-ivan@cloudflare.com
Signed-off-by: Ivan Babrou &lt;ivan@cloudflare.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Christoph Anton Mitterer &lt;mail@christoph.anton.mitterer.name&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Laight &lt;David.Laight@ACULAB.COM&gt;
Cc: Ivan Babrou &lt;ivan@cloudflare.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Kalesh Singh &lt;kaleshsingh@google.com&gt;
Cc: Mike Rapoport &lt;rppt@kernel.org&gt;
Cc: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Cc: Theodore Ts'o &lt;tytso@mit.edu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>procfs: prevent unprivileged processes accessing fdinfo dir</title>
<updated>2022-05-10T00:34:28+00:00</updated>
<author>
<name>Kalesh Singh</name>
<email>kaleshsingh@google.com</email>
</author>
<published>2022-05-10T00:34:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1927e498aee1757b3df755a194cbfc5cc0f2b663'/>
<id>urn:sha1:1927e498aee1757b3df755a194cbfc5cc0f2b663</id>
<content type='text'>
The file permissions on the fdinfo dir from were changed from
S_IRUSR|S_IXUSR to S_IRUGO|S_IXUGO, and a PTRACE_MODE_READ check was added
for opening the fdinfo files [1].  However, the ptrace permission check
was not added to the directory, allowing anyone to get the open FD numbers
by reading the fdinfo directory.

Add the missing ptrace permission check for opening the fdinfo directory.

[1] https://lkml.kernel.org/r/20210308170651.919148-1-kaleshsingh@google.com

Link: https://lkml.kernel.org/r/20210713162008.1056986-1-kaleshsingh@google.com
Fixes: 7bc3fa0172a4 ("procfs: allow reading fdinfo with PTRACE_MODE_READ")
Signed-off-by: Kalesh Singh &lt;kaleshsingh@google.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Hridya Valsaraju &lt;hridya@google.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>procfs/dmabuf: add inode number to /proc/*/fdinfo</title>
<updated>2021-07-01T18:06:04+00:00</updated>
<author>
<name>Kalesh Singh</name>
<email>kaleshsingh@google.com</email>
</author>
<published>2021-07-01T01:54:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3845f256a8b527127bfbd4ced21e93d9e89aa6d7'/>
<id>urn:sha1:3845f256a8b527127bfbd4ced21e93d9e89aa6d7</id>
<content type='text'>
And 'ino' field to /proc/&lt;pid&gt;/fdinfo/&lt;FD&gt; and
/proc/&lt;pid&gt;/task/&lt;tid&gt;/fdinfo/&lt;FD&gt;.

The inode numbers can be used to uniquely identify DMA buffers in user
space and avoids a dependency on /proc/&lt;pid&gt;/fd/* when accounting
per-process DMA buffer sizes.

Link: https://lkml.kernel.org/r/20210308170651.919148-2-kaleshsingh@google.com
Signed-off-by: Kalesh Singh &lt;kaleshsingh@google.com&gt;
Acked-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Jeff Vander Stoep &lt;jeffv@google.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Hridya Valsaraju &lt;hridya@google.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Kalesh Singh &lt;kaleshsingh@google.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab+huawei@kernel.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Alexey Gladkov &lt;gladkov.alexey@gmail.com&gt;
Cc: Szabolcs Nagy &lt;szabolcs.nagy@arm.com&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Cc: Michel Lespinasse &lt;walken@google.com&gt;
Cc: Bernd Edlinger &lt;bernd.edlinger@hotmail.de&gt;
Cc: Andrei Vagin &lt;avagin@gmail.com&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: James Morris &lt;jamorris@linux.microsoft.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>procfs: allow reading fdinfo with PTRACE_MODE_READ</title>
<updated>2021-07-01T18:06:04+00:00</updated>
<author>
<name>Kalesh Singh</name>
<email>kaleshsingh@google.com</email>
</author>
<published>2021-07-01T01:54:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7bc3fa0172a423afb34e6df7a3998e5f23b1a94a'/>
<id>urn:sha1:7bc3fa0172a423afb34e6df7a3998e5f23b1a94a</id>
<content type='text'>
Android captures per-process system memory state when certain low memory
events (e.g a foreground app kill) occur, to identify potential memory
hoggers.  In order to measure how much memory a process actually consumes,
it is necessary to include the DMA buffer sizes for that process in the
memory accounting.  Since the handle to DMA buffers are raw FDs, it is
important to be able to identify which processes have FD references to a
DMA buffer.

Currently, DMA buffer FDs can be accounted using /proc/&lt;pid&gt;/fd/* and
/proc/&lt;pid&gt;/fdinfo -- both are only readable by the process owner, as
follows:

  1. Do a readlink on each FD.
  2. If the target path begins with "/dmabuf", then the FD is a dmabuf FD.
  3. stat the file to get the dmabuf inode number.
  4. Read/ proc/&lt;pid&gt;/fdinfo/&lt;fd&gt;, to get the DMA buffer size.

Accessing other processes' fdinfo requires root privileges.  This limits
the use of the interface to debugging environments and is not suitable for
production builds.  Granting root privileges even to a system process
increases the attack surface and is highly undesirable.

Since fdinfo doesn't permit reading process memory and manipulating
process state, allow accessing fdinfo under PTRACE_MODE_READ_FSCRED.

Link: https://lkml.kernel.org/r/20210308170651.919148-1-kaleshsingh@google.com
Signed-off-by: Kalesh Singh &lt;kaleshsingh@google.com&gt;
Suggested-by: Jann Horn &lt;jannh@google.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Alexey Gladkov &lt;gladkov.alexey@gmail.com&gt;
Cc: Andrei Vagin &lt;avagin@gmail.com&gt;
Cc: Bernd Edlinger &lt;bernd.edlinger@hotmail.de&gt;
Cc: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: Hridya Valsaraju &lt;hridya@google.com&gt;
Cc: James Morris &lt;jamorris@linux.microsoft.com&gt;
Cc: Jeff Vander Stoep &lt;jeffv@google.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab+huawei@kernel.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Michel Lespinasse &lt;walken@google.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Szabolcs Nagy &lt;szabolcs.nagy@arm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
