<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/super.c, branch v7.2-rc1</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.2-rc1</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.2-rc1'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-06-14T22:45:31+00:00</updated>
<entry>
<title>Merge tag 'pull-dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2026-06-14T22:45:31+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-06-14T22:45:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e8a56d6fc828bb569fa2dd33c3e6eb16a165b097'/>
<id>urn:sha1:e8a56d6fc828bb569fa2dd33c3e6eb16a165b097</id>
<content type='text'>
Pull dcache updates from Al Viro:

 - d_alloc_parallel() API change (Neil's with my changes)

 - NORCU fixes

 - Reorganization and simplification of dentry eviction logic

 - Simplifying rcu_read_lock() scopes in fs/dcache.c

 - Secondary roots work - getting rid of NFS fake root dentries and
   dealing with remaining shrink_dcache_for_umount() and
   shrink_dentry_list() races

 - making cursors NORCU (surprisingly easy)

* tag 'pull-dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (22 commits)
  make cursors NORCU
  nfs: get rid of fake root dentries
  wind -&gt;s_roots via -&gt;d_sib instead of -&gt;d_hash
  shrink_dentry_tree(): unify the calls of shrink_dentry_list()
  shrinking rcu_read_lock() scope in d_alloc_parallel()
  d_walk(): shrink rcu_read_lock() scope
  document dentry_kill()
  adjust calling conventions of lock_for_kill(), fold __dentry_kill() into dentry_kill()
  Document rcu_read_lock() use in select_collect2()
  Shift rcu_read_{,un}lock() inside fast_dput()
  simplify safety for lock_for_kill() slowpath
  fold lock_for_kill() and __dentry_kill() into common helper
  fold lock_for_kill() into shrink_kill()
  shrink_dentry_list(): start with removing from shrink list
  d_prune_aliases(): make sure to skip NORCU aliases
  kill d_dispose_if_unused()
  make to_shrink_list() return whether it has moved dentry to list
  select_collect(): ignore dentries on shrink lists if they have positive refcounts
  find_acceptable_alias(): skip NORCU aliases with zero refcount
  fix a race between d_find_any_alias() and final dput() of NORCU dentries
  ...
</content>
</entry>
<entry>
<title>Merge tag 'vfs-7.2-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs</title>
<updated>2026-06-14T22:29:45+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-06-14T22:29:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7e0e7bd60d4a812b694c477716597fcb038b00cb'/>
<id>urn:sha1:7e0e7bd60d4a812b694c477716597fcb038b00cb</id>
<content type='text'>
Pull misc vfs updates from Christian Brauner:
 "Features:

   - Reduce pipe-&gt;mutex contention by pre-allocating pages outside the
     lock in anon_pipe_write().

     anon_pipe_write() called alloc_page() once per page while holding
     pipe-&gt;mutex. The allocation can sleep doing direct reclaim and runs
     memcg charging, which extends the critical section and stalls any
     concurrent reader on the same mutex. Now up to 8 pages are
     pre-allocated before the mutex is taken, leftovers are recycled
     into the per-pipe tmp_page[] cache before unlock, and any remainder
     is released after unlock, keeping the allocator out of the critical
     section on both sides. On a writers x readers sweep with 64KB
     writes against a 1 MB pipe throughput improves 6-28% and average
     write latency drops 5-22%; under memory pressure - when the cost of
     holding the mutex across reclaim is highest - throughput improves
     21-48% and latency drops 17-33%. The microbenchmark is added to
     selftests.

   - uaccess/sockptr: fix the ignored_trailing logic in
     copy_struct_to_user() to behave as documented and the usize check
     in copy_struct_from_sockptr() for user pointers, and add
     copy_struct_{from,to}_bounce_buffer() and copy_struct_to_sockptr()
     helpers for upcoming users (IPPROTO_SMBDIRECT, IPPROTO_QUIC).

   - bpf: add a sleepable bpf_real_inode() kfunc that resolves the real
     inode backing a dentry via d_real_inode(). On overlayfs the inode
     attached to the dentry doesn't carry the underlying device
     information; this is used by the filesystem restriction BPF program
     that was merged into systemd.

   - docs: add guidelines for submitting new filesystems, motivated by
     the maintenance burden abandoned and untestable filesystems impose
     on VFS developers, blocking infrastructure work like folio
     conversions and iomap migration.

  Fixes:

   - libfs: set SB_I_NOEXEC and SB_I_NODEV by default in init_pseudo()
     and drop the now-redundant assignments in callers. This began as a
     one-line dma-buf fix for a path_noexec() warning; a pseudo
     filesystem has no reason not to set SB_I_NOEXEC. All init_pseudo()
     callers were audited: the only visible effect is on dma-buf where
     SB_I_NOEXEC silences the warning.

   - Handle set_blocksize() failures in legacy filesystems (bfs, hpfs,
     qnx4, jfs, befs, affs, isofs, minix, ntfs3, omfs). Mounting a
     device with a sector size &gt; PAGE_SIZE crashed roughly half of them;
     the rest had the same missing error handling pattern. Plus a
     follow-up releasing the superblock buffer_head when setting the
     minix v3 block size fails.

   - mount: honour SB_NOUSER in the new mount API.

   - fs/fcntl: fix a SOFTIRQ-unsafe lock order in fasync signaling by
     switching the process-group paths of send_sigio() and send_sigurg()
     from read_lock(&amp;tasklist_lock) to RCU, matching the single-PID
     path.

   - vfs: add an FS_USERNS_DELEGATABLE flag and set it for NFS, fixing
     delegated NFS mounts (fsopen() in a container with the mount
     performed by a privileged daemon) that broke when non-init
     s_user_ns was tied to FS_USERNS_MOUNT.

   - selftests/namespaces: fix a hang in nsid_test where an unreaped
     grandchild kept the TAP pipe write-end open, a waitpid(-1) race in
     listns_efault_test, and a false FAIL on kernels without listns()
     where the tests should SKIP.

   - filelock: fix the break_lease() stub signature for
     CONFIG_FILE_LOCKING=n.

   - init/initramfs_test: wait for the async initramfs unpacking before
     running; the test and do_populate_rootfs() share the parser state.

   - fs/coredump: reduce redundant log noise in
     validate_coredump_safety().

   - iomap: pass the correct length to fserror_report_io() in
     __iomap_write_begin().

   - backing-file: fix the backing_file_open() kerneldoc.

  Cleanups:

   - initramfs: refactor the cpio hex header parsing to use hex2bin()
     instead of the hand-rolled simple_strntoul() which is reverted, and
     extend the initramfs KUnit tests to cover header fields with 0x
     prefixes.

   - Replace __get_free_pages() and friends with kmalloc()/kzalloc()
     across quota, proc, ocfs2/dlm, nilfs2, nfs, nfsd, libfs, jfs, jbd2,
     isofs, fuse, select, namespace, configfs, binfmt_misc, bfs, and the
     do_mounts init code - part of the larger work of replacing page
     allocator calls with kmalloc().

   - Use clear_and_wake_up_bit() in unlock_buffer() and
     journal_end_buffer_io_sync() instead of open-coding the sequence.

   - Drop unused VFS exports: unexport drop_super_exclusive(), remove
     start_removing_user_path_at(), and fold __start_removing_path()
     into start_removing_path().

   - fs/read_write: narrow the __kernel_write() export with
     EXPORT_SYMBOL_FOR_MODULES().

   - vfs: uapi: retire octal and hex constants in favor of (1 &lt;&lt; n) for
     the O_ flags. Finding a free bit for a new flag across the
     architectures was needlessly hard with the mixed bases.

   - dcache: add extra sanity checks of dead dentries in dentry_free()
     via a new DENTRY_WARN_ONCE() that also prints d_flags.

   - iov_iter: use kmemdup_array() in dup_iter() to harden the
     allocation against multiplication overflow.

   - fs/pipe: write to -&gt;poll_usage only once.

   - vfs: remove an always-taken if-branch in find_next_fd().

   - dcache: use kmalloc_flex() for struct external_name in __d_alloc().

   - namei: use QSTR() instead of QSTR_INIT() in path_pts().

   - sync_file_range: delete dead S_ISLNK code.

   - Comment fixes: retire a stale comment in fget_task_next() and fix
     assorted spelling mistakes"

* tag 'vfs-7.2-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (73 commits)
  backing-file: fix backing_file_open() kerneldoc parameter
  iomap: pass the correct len to fserror_report_io in __iomap_write_begin
  vfs: add FS_USERNS_DELEGATABLE flag and set it for NFS
  filelock: fix break_lease() stub signature for CONFIG_FILE_LOCKING=n
  vfs: uapi: retire octal and hex numbers in favor of (1 &lt;&lt; n) for O_ flags
  bpf: add bpf_real_inode() kfunc
  fs/read_write: Do not export __kernel_write() to the entire world
  libfs: drop redundant SB_I_NOEXEC/SB_I_NODEV in init_pseudo() callers
  libfs: set SB_I_NOEXEC and SB_I_NODEV by default in init_pseudo()
  mount: honour SB_NOUSER in the new mount API
  fs/fcntl: fix SOFTIRQ-unsafe lock order in fasync signaling
  selftests/pipe: add pipe_bench microbenchmark
  fs/pipe: pre-allocate pages outside pipe-&gt;mutex in anon_pipe_write
  fs: retire stale comment in fget_task_next()
  fs: fix spelling mistakes in comment
  bfs: replace get_zeroed_page() with kzalloc()
  binfmt_misc: replace __get_free_page() with kmalloc()
  configfs: replace __get_free_pages() with kzalloc()
  fs/namespace: use __getname() to allocate mntpath buffer
  fs/select: replace __get_free_page() with kmalloc()
  ...
</content>
</entry>
<entry>
<title>vfs: add FS_USERNS_DELEGATABLE flag and set it for NFS</title>
<updated>2026-06-09T15:17:29+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@kernel.org</email>
</author>
<published>2026-01-29T21:47:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c5d6cac28646b0d5d81ef632be748ae93c1f36c7'/>
<id>urn:sha1:c5d6cac28646b0d5d81ef632be748ae93c1f36c7</id>
<content type='text'>
Commit e1c5ae59c0f2 ("fs: don't allow non-init s_user_ns for filesystems
without FS_USERNS_MOUNT") prevents the mount of any filesystem inside a
container that doesn't have FS_USERNS_MOUNT set.

This broke NFS mounts in our containerized environment. We have a daemon
somewhat like systemd-mountfsd running in the init_ns. A process does a
fsopen() inside the container and passes it to the daemon via unix
socket.

The daemon then vets that the request is for an allowed NFS server and
performs the mount. This now fails because the fc-&gt;user_ns is set to the
value in the container and NFS doesn't set FS_USERNS_MOUNT.  We don't
want to add FS_USERNS_MOUNT to NFS since that would allow the container
to mount any NFS server (even malicious ones).

Add a new FS_USERNS_DELEGATABLE flag, and enable it on NFS.

Fixes: e1c5ae59c0f2 ("fs: don't allow non-init s_user_ns for filesystems without FS_USERNS_MOUNT")
Signed-off-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Link: https://patch.msgid.link/20260129-twmount-v1-1-4874ed2a15c4@kernel.org
Acked-by: Anna Schumaker &lt;anna.schumaker@oracle.com&gt;
Reviewed-by: Alexander Mikhalitsyn &lt;aleksandr.mikhalitsyn@futurfusion.io&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Christian Brauner (Amutable) &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>wind -&gt;s_roots via -&gt;d_sib instead of -&gt;d_hash</title>
<updated>2026-06-05T04:34:56+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2026-04-18T22:39:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e9895609cb7f9d9712c5b20044edff5f196eec1b'/>
<id>urn:sha1:e9895609cb7f9d9712c5b20044edff5f196eec1b</id>
<content type='text'>
shrink_dcache_for_umount() is supposed to handle the possibility of
some of the dentries to be evicted being in other threads shrink
lists; it either kills them, leaving an empty husk to be freed by
the owner of shrink list whenever it gets around to that, or it
waits for the eviction in progress to get completed.

That relies upon dentry remaining attached to the tree until the
eviction reaches dentry_unlist() and its -&gt;d_sib gets removed
from the list.  Unfortunately, the secondary roots are linked
via -&gt;d_hash, rather than -&gt;d_sib and they become removed from
that list before their inode references are dropped.

If shrink_dentry_list() from another thread ends up evicting
one of the secondary roots and gets to that point in dentry_kill()
when shrink_dcache_for_umount() is looking for secondary roots,
the latter will *not* notice anything, possibly leading to
warnings about busy inodes at umount time and all kinds of breakage
after that.

Moreover, shrink_dcache_for_umount() walks the list of secondary
roots with no protection whatsoever, so it might end up calling
dget() on a dentry that already passed through
	lockref_mark_dead(&amp;dentry-&gt;d_lockref);
ending up with corrupted refcount and possible UAF.

AFAICS, the most straightforward way to deal with that would be
to have secondary roots linked via -&gt;d_sib rather than -&gt;d_hash;
then they would remain on the list until killed, and we could
use d_add_waiter() machinery to wait for eviction in progress.

Changes:
	* secondary roots look the same as -&gt;s_root from d_unhashed()
and d_unlinked() POV now.
	* secondary roots are represented as "no parent, but on -&gt;d_sib"
instead of "no parent, but on -&gt;d_hash".
	* since -&gt;d_sib is a plain hlist, we protect it with per-superblock
spinlock (sb-&gt;s_roots_lock) instead of the LSB of the head pointer (for
non-root dentries it would be protected by -&gt;d_lock of parent).
	* __d_obtain_alias() uses -&gt;d_sib for linkage when allocating
a secondary root.
	* d_splice_alias_ops() detects splicing of a secondary root and
removes it from the list before calling __d_move().
	* dentry_unlist() detects eviction of a secondary root and
removes it from the list; no need to play the games for d_walk() sake,
since the latter is not going to look for the next sibling of those
anyway.
	* ___d_drop() doesn't care about -&gt;s_roots anymore.
	* shrink_dcache_for_umount() uses proper locking for access to
the list of secondary roots and if it runs into one that is in the middle
of eviction waits for that to finish.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>fs: retire sget()</title>
<updated>2026-06-03T07:09:51+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2026-05-29T08:43:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=12863b36bfe4ac0aa93e984c72b2bcd8f08402aa'/>
<id>urn:sha1:12863b36bfe4ac0aa93e984c72b2bcd8f08402aa</id>
<content type='text'>
sget() and sget_fc() have lived side by side as near-duplicate
find-or-create-and-publish helpers for the legacy and fs_context mount
APIs. The three remaining in-tree callers (CIFS plus the ext4 extents
and mballoc KUnit tests) have all been moved to sget_fc(). Nothing
calls sget() anymore.

Delete sget() from fs/super.c and the prototype in &lt;linux/fs.h&gt;.
Update the two comments that referred to "sget()" or "sget{_fc}()" to
just say "sget_fc()".

This removes ~60 lines of code that only existed to be kept in
lockstep with sget_fc() on every superblock publish-path change.

Link: https://patch.msgid.link/20260529-work-sget-v2-4-57bbe08604e4@kernel.org
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Christian Brauner (Amutable) &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>fs: unexport drop_super_exclusive</title>
<updated>2026-05-21T11:39:34+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2026-05-11T07:22:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4d636e5aabb887dc72867110710737398effaa70'/>
<id>urn:sha1:4d636e5aabb887dc72867110710737398effaa70</id>
<content type='text'>
drop_super_exclusive is only used by the built-in quota code.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://patch.msgid.link/20260511072239.2456725-2-hch@lst.de
Signed-off-by: Christian Brauner (Amutable) &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>Convert 'alloc_obj' family to use the new default GFP_KERNEL argument</title>
<updated>2026-02-22T01:09:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T00:37:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43'/>
<id>urn:sha1:bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43</id>
<content type='text'>
This was done entirely with mindless brute force, using

    git grep -l '\&lt;k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>treewide: Replace kmalloc with kmalloc_obj for non-scalar types</title>
<updated>2026-02-21T09:02:28+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-21T07:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=69050f8d6d075dc01af7a5f2f550a8067510366f'/>
<id>urn:sha1:69050f8d6d075dc01af7a5f2f550a8067510366f</id>
<content type='text'>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'fsnotify_for_v6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs</title>
<updated>2026-02-12T18:52:46+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-12T18:52:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a8ed22870f5304a6ac64f694572cafc12801a9cf'/>
<id>urn:sha1:a8ed22870f5304a6ac64f694572cafc12801a9cf</id>
<content type='text'>
Pull fsnotify updates from Jan Kara:
 "A set of fixes to shutdown fsnotify subsystem before invalidating
  dcache thus addressing some nasty possible races"

* tag 'fsnotify_for_v6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fsnotify: Shutdown fsnotify before destroying sb's dcache
  fsnotify: Use connector list for destroying inode marks
  fsnotify: Track inode connectors for a superblock
</content>
</entry>
<entry>
<title>fsnotify: Shutdown fsnotify before destroying sb's dcache</title>
<updated>2026-01-23T12:26:45+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2025-10-15T14:06:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=74bd284537b3447c651588101c32a203e4fe1a32'/>
<id>urn:sha1:74bd284537b3447c651588101c32a203e4fe1a32</id>
<content type='text'>
Currently fsnotify_sb_delete() was called after we have evicted
superblock's dcache and inode cache. This was done mainly so that we
iterate as few inodes as possible when removing inode marks. However, as
Jakub reported, this is problematic because for some filesystems
encoding of file handles uses sb-&gt;s_root which gets cleared as part of
dcache eviction. And either delayed fsnotify events or reading fdinfo
for fsnotify group with marks on fs being unmounted may trigger encoding
of file handles during unmount. So move shutdown of fsnotify subsystem
before shrinking of dcache.

Link: https://lore.kernel.org/linux-fsdevel/CAOQ4uxgXvwumYvJm3cLDFfx-TsU3g5-yVsTiG=6i8KS48dn0mQ@mail.gmail.com/
Reported-by: Jakub Acs &lt;acsjakub@amazon.de&gt;
Reviewed-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Reviewed-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
</feed>
