<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/notify/dnotify, branch v6.12.80</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-07-24T06:56:30+00:00</updated>
<entry>
<title>fix a leak in fcntl_dirnotify()</title>
<updated>2025-07-24T06:56:30+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2025-07-12T17:18:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9a7de97b915ad264f1c040807aa6e414bd983cbb'/>
<id>urn:sha1:9a7de97b915ad264f1c040807aa6e414bd983cbb</id>
<content type='text'>
[ Upstream commit fdfe0133473a528e3f5da69c35419ce6711d6b89 ]

[into #fixes, unless somebody objects]

Lifetime of new_dn_mark is controlled by that of its -&gt;fsn_mark,
pointed to by new_fsn_mark.  Unfortunately, a failure exit had
been inserted between the allocation of new_dn_mark and the
call of fsnotify_init_mark(), ending up with a leak.

Fixes: 1934b212615d "file: reclaim 24 bytes from f_owner"
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Link: https://lore.kernel.org/20250712171843.GB1880847@ZenIV
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>inotify: Fix possible deadlock in fsnotify_destroy_mark</title>
<updated>2024-10-02T13:14:29+00:00</updated>
<author>
<name>Lizhi Xu</name>
<email>lizhi.xu@windriver.com</email>
</author>
<published>2024-09-27T14:36:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cad3f4a22cfa4081cc2d465d1118cf31708fd82b'/>
<id>urn:sha1:cad3f4a22cfa4081cc2d465d1118cf31708fd82b</id>
<content type='text'>
[Syzbot reported]
WARNING: possible circular locking dependency detected
6.11.0-rc4-syzkaller-00019-gb311c1b497e5 #0 Not tainted
------------------------------------------------------
kswapd0/78 is trying to acquire lock:
ffff88801b8d8930 (&amp;group-&gt;mark_mutex){+.+.}-{3:3}, at: fsnotify_group_lock include/linux/fsnotify_backend.h:270 [inline]
ffff88801b8d8930 (&amp;group-&gt;mark_mutex){+.+.}-{3:3}, at: fsnotify_destroy_mark+0x38/0x3c0 fs/notify/mark.c:578

but task is already holding lock:
ffffffff8ea2fd60 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat mm/vmscan.c:6841 [inline]
ffffffff8ea2fd60 (fs_reclaim){+.+.}-{0:0}, at: kswapd+0xbb4/0x35a0 mm/vmscan.c:7223

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-&gt; #1 (fs_reclaim){+.+.}-{0:0}:
       ...
       kmem_cache_alloc_noprof+0x3d/0x2a0 mm/slub.c:4044
       inotify_new_watch fs/notify/inotify/inotify_user.c:599 [inline]
       inotify_update_watch fs/notify/inotify/inotify_user.c:647 [inline]
       __do_sys_inotify_add_watch fs/notify/inotify/inotify_user.c:786 [inline]
       __se_sys_inotify_add_watch+0x72e/0x1070 fs/notify/inotify/inotify_user.c:729
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-&gt; #0 (&amp;group-&gt;mark_mutex){+.+.}-{3:3}:
       ...
       __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
       fsnotify_group_lock include/linux/fsnotify_backend.h:270 [inline]
       fsnotify_destroy_mark+0x38/0x3c0 fs/notify/mark.c:578
       fsnotify_destroy_marks+0x14a/0x660 fs/notify/mark.c:934
       fsnotify_inoderemove include/linux/fsnotify.h:264 [inline]
       dentry_unlink_inode+0x2e0/0x430 fs/dcache.c:403
       __dentry_kill+0x20d/0x630 fs/dcache.c:610
       shrink_kill+0xa9/0x2c0 fs/dcache.c:1055
       shrink_dentry_list+0x2c0/0x5b0 fs/dcache.c:1082
       prune_dcache_sb+0x10f/0x180 fs/dcache.c:1163
       super_cache_scan+0x34f/0x4b0 fs/super.c:221
       do_shrink_slab+0x701/0x1160 mm/shrinker.c:435
       shrink_slab+0x1093/0x14d0 mm/shrinker.c:662
       shrink_one+0x43b/0x850 mm/vmscan.c:4815
       shrink_many mm/vmscan.c:4876 [inline]
       lru_gen_shrink_node mm/vmscan.c:4954 [inline]
       shrink_node+0x3799/0x3de0 mm/vmscan.c:5934
       kswapd_shrink_node mm/vmscan.c:6762 [inline]
       balance_pgdat mm/vmscan.c:6954 [inline]
       kswapd+0x1bcd/0x35a0 mm/vmscan.c:7223

[Analysis]
The problem is that inotify_new_watch() is using GFP_KERNEL to allocate
new watches under group-&gt;mark_mutex, however if dentry reclaim races
with unlinking of an inode, it can end up dropping the last dentry reference
for an unlinked inode resulting in removal of fsnotify mark from reclaim
context which wants to acquire group-&gt;mark_mutex as well.

This scenario shows that all notification groups are in principle prone
to this kind of a deadlock (previously, we considered only fanotify and
dnotify to be problematic for other reasons) so make sure all
allocations under group-&gt;mark_mutex happen with GFP_NOFS.

Reported-and-tested-by: syzbot+c679f13773f295d2da53@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=c679f13773f295d2da53
Signed-off-by: Lizhi Xu &lt;lizhi.xu@windriver.com&gt;
Reviewed-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://patch.msgid.link/20240927143642.2369508-1-lizhi.xu@windriver.com
</content>
</entry>
<entry>
<title>file: reclaim 24 bytes from f_owner</title>
<updated>2024-08-28T11:05:39+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2024-08-09T16:00:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1934b212615dc617ac84fc306333ab2b9fc3b04f'/>
<id>urn:sha1:1934b212615dc617ac84fc306333ab2b9fc3b04f</id>
<content type='text'>
We do embedd struct fown_struct into struct file letting it take up 32
bytes in total. We could tweak struct fown_struct to be more compact but
really it shouldn't even be embedded in struct file in the first place.

Instead, actual users of struct fown_struct should allocate the struct
on demand. This frees up 24 bytes in struct file.

That will have some potentially user-visible changes for the ownership
fcntl()s. Some of them can now fail due to allocation failures.
Practically, that probably will almost never happen as the allocations
are small and they only happen once per file.

The fown_struct is used during kill_fasync() which is used by e.g.,
pipes to generate a SIGIO signal. Sending of such signals is conditional
on userspace having set an owner for the file using one of the F_OWNER
fcntl()s. Such users will be unaffected if struct fown_struct is
allocated during the fcntl() call.

There are a few subsystems that call __f_setown() expecting
file-&gt;f_owner to be allocated:

(1) tun devices
    file-&gt;f_op-&gt;fasync::tun_chr_fasync()
    -&gt; __f_setown()

    There are no callers of tun_chr_fasync().

(2) tty devices

    file-&gt;f_op-&gt;fasync::tty_fasync()
    -&gt; __tty_fasync()
       -&gt; __f_setown()

    tty_fasync() has no additional callers but __tty_fasync() has. Note
    that __tty_fasync() only calls __f_setown() if the @on argument is
    true. It's called from:

    file-&gt;f_op-&gt;release::tty_release()
    -&gt; tty_release()
       -&gt; __tty_fasync()
          -&gt; __f_setown()

    tty_release() calls __tty_fasync() with @on false
    =&gt; __f_setown() is never called from tty_release().
       =&gt; All callers of tty_release() are safe as well.

    file-&gt;f_op-&gt;release::tty_open()
    -&gt; tty_release()
       -&gt; __tty_fasync()
          -&gt; __f_setown()

    __tty_hangup() calls __tty_fasync() with @on false
    =&gt; __f_setown() is never called from tty_release().
       =&gt; All callers of __tty_hangup() are safe as well.

From the callchains it's obvious that (1) and (2) end up getting called
via file-&gt;f_op-&gt;fasync(). That can happen either through the F_SETFL
fcntl() with the FASYNC flag raised or via the FIOASYNC ioctl(). If
FASYNC is requested and the file isn't already FASYNC then
file-&gt;f_op-&gt;fasync() is called with @on true which ends up causing both
(1) and (2) to call __f_setown().

(1) and (2) are the only subsystems that call __f_setown() from the
file-&gt;f_op-&gt;fasync() handler. So both (1) and (2) have been updated to
allocate a struct fown_struct prior to calling fasync_helper() to
register with the fasync infrastructure. That's safe as they both call
fasync_helper() which also does allocations if @on is true.

The other interesting case are file leases:

(3) file leases
    lease_manager_ops-&gt;lm_setup::lease_setup()
    -&gt; __f_setown()

    Which in turn is called from:

    generic_add_lease()
    -&gt; lease_manager_ops-&gt;lm_setup::lease_setup()
       -&gt; __f_setown()

So here again we can simply make generic_add_lease() allocate struct
fown_struct prior to the lease_manager_ops-&gt;lm_setup::lease_setup()
which happens under a spinlock.

With that the two remaining subsystems that call __f_setown() are:

(4) dnotify
(5) sockets

Both have their own custom ioctls to set struct fown_struct and both
have been converted to allocate a struct fown_struct on demand from
their respective ioctls.

Interactions with O_PATH are fine as well e.g., when opening a /dev/tty
as O_PATH then no file-&gt;f_op-&gt;open() happens thus no file-&gt;f_owner is
allocated. That's fine as no file operation will be set for those and
the device has never been opened. fcntl()s called on such things will
just allocate a -&gt;f_owner on demand. Although I have zero idea why'd you
care about f_owner on an O_PATH fd.

Link: https://lore.kernel.org/r/20240813-work-f_owner-v2-1-4e9343a79f9f@kernel.org
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>fsnotify: create a wrapper fsnotify_find_inode_mark()</title>
<updated>2024-04-04T14:24:16+00:00</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2024-03-17T18:41:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=230d97d39ee2eb9030309f04f98615aaeb420dac'/>
<id>urn:sha1:230d97d39ee2eb9030309f04f98615aaeb420dac</id>
<content type='text'>
In preparation to passing an object pointer to fsnotify_find_mark(), add
a wrapper fsnotify_find_inode_mark() and use it where possible.

Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Message-Id: &lt;20240317184154.1200192-4-amir73il@gmail.com&gt;
</content>
</entry>
<entry>
<title>fs: Remove the now superfluous sentinel elements from ctl_table array</title>
<updated>2023-12-28T12:57:57+00:00</updated>
<author>
<name>Joel Granados</name>
<email>j.granados@samsung.com</email>
</author>
<published>2023-11-21T11:35:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9d5b9475356635d018b4d22f7e58fce32e2e89a7'/>
<id>urn:sha1:9d5b9475356635d018b4d22f7e58fce32e2e89a7</id>
<content type='text'>
This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information Link :
https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/)

Remove sentinel elements ctl_table struct. Special attention was placed in
making sure that an empty directory for fs/verity was created when
CONFIG_FS_VERITY_BUILTIN_SIGNATURES is not defined. In this case we use the
register sysctl call that expects a size.

Signed-off-by: Joel Granados &lt;j.granados@samsung.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Reviewed-by: "Darrick J. Wong" &lt;djwong@kernel.org&gt;
Acked-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2023-11-03T06:53:31+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-11-03T06:53:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8f6f76a6a29f36d2f3e4510d0bde5046672f6924'/>
<id>urn:sha1:8f6f76a6a29f36d2f3e4510d0bde5046672f6924</id>
<content type='text'>
Pull non-MM updates from Andrew Morton:
 "As usual, lots of singleton and doubleton patches all over the tree
  and there's little I can say which isn't in the individual changelogs.

  The lengthier patch series are

   - 'kdump: use generic functions to simplify crashkernel reservation
     in arch', from Baoquan He. This is mainly cleanups and
     consolidation of the 'crashkernel=' kernel parameter handling

   - After much discussion, David Laight's 'minmax: Relax type checks in
     min() and max()' is here. Hopefully reduces some typecasting and
     the use of min_t() and max_t()

   - A group of patches from Oleg Nesterov which clean up and slightly
     fix our handling of reads from /proc/PID/task/... and which remove
     task_struct.thread_group"

* tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (64 commits)
  scripts/gdb/vmalloc: disable on no-MMU
  scripts/gdb: fix usage of MOD_TEXT not defined when CONFIG_MODULES=n
  .mailmap: add address mapping for Tomeu Vizoso
  mailmap: update email address for Claudiu Beznea
  tools/testing/selftests/mm/run_vmtests.sh: lower the ptrace permissions
  .mailmap: map Benjamin Poirier's address
  scripts/gdb: add lx_current support for riscv
  ocfs2: fix a spelling typo in comment
  proc: test ProtectionKey in proc-empty-vm test
  proc: fix proc-empty-vm test with vsyscall
  fs/proc/base.c: remove unneeded semicolon
  do_io_accounting: use sig-&gt;stats_lock
  do_io_accounting: use __for_each_thread()
  ocfs2: replace BUG_ON() at ocfs2_num_free_extents() with ocfs2_error()
  ocfs2: fix a typo in a comment
  scripts/show_delta: add __main__ judgement before main code
  treewide: mark stuff as __ro_after_init
  fs: ocfs2: check status values
  proc: test /proc/${pid}/statm
  compiler.h: move __is_constexpr() to compiler.h
  ...
</content>
</entry>
<entry>
<title>file: convert to SLAB_TYPESAFE_BY_RCU</title>
<updated>2023-10-19T09:02:48+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2023-09-29T06:45:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0ede61d8589cc2d93aa78230d74ac58b5b8d0244'/>
<id>urn:sha1:0ede61d8589cc2d93aa78230d74ac58b5b8d0244</id>
<content type='text'>
In recent discussions around some performance improvements in the file
handling area we discussed switching the file cache to rely on
SLAB_TYPESAFE_BY_RCU which allows us to get rid of call_rcu() based
freeing for files completely. This is a pretty sensitive change overall
but it might actually be worth doing.

The main downside is the subtlety. The other one is that we should
really wait for Jann's patch to land that enables KASAN to handle
SLAB_TYPESAFE_BY_RCU UAFs. Currently it doesn't but a patch for this
exists.

With SLAB_TYPESAFE_BY_RCU objects may be freed and reused multiple times
which requires a few changes. So it isn't sufficient anymore to just
acquire a reference to the file in question under rcu using
atomic_long_inc_not_zero() since the file might have already been
recycled and someone else might have bumped the reference.

In other words, callers might see reference count bumps from newer
users. For this reason it is necessary to verify that the pointer is the
same before and after the reference count increment. This pattern can be
seen in get_file_rcu() and __files_get_rcu().

In addition, it isn't possible to access or check fields in struct file
without first aqcuiring a reference on it. Not doing that was always
very dodgy and it was only usable for non-pointer data in struct file.
With SLAB_TYPESAFE_BY_RCU it is necessary that callers first acquire a
reference under rcu or they must hold the files_lock of the fdtable.
Failing to do either one of this is a bug.

Thanks to Jann for pointing out that we need to ensure memory ordering
between reallocations and pointer check by ensuring that all subsequent
loads have a dependency on the second load in get_file_rcu() and
providing a fixup that was folded into this patch.

Cc: Jann Horn &lt;jannh@google.com&gt;
Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>treewide: mark stuff as __ro_after_init</title>
<updated>2023-10-18T21:43:23+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2023-10-11T16:55:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=68279f9c9f592e75d30a9ba5154a15e0a0b42ae8'/>
<id>urn:sha1:68279f9c9f592e75d30a9ba5154a15e0a0b42ae8</id>
<content type='text'>
__read_mostly predates __ro_after_init. Many variables which are marked
__read_mostly should have been __ro_after_init from day 1.

Also, mark some stuff as "const" and "__init" while I'm at it.

[akpm@linux-foundation.org: revert sysctl_nr_open_min, sysctl_nr_open_max changes due to arm warning]
[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/4f6bb9c0-abba-4ee4-a7aa-89265e886817@p183
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>dnotify: Pass argument of fcntl_dirnotify as int</title>
<updated>2023-07-10T12:36:12+00:00</updated>
<author>
<name>Luca Vizzarro</name>
<email>Luca.Vizzarro@arm.com</email>
</author>
<published>2023-02-15T15:50:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f4ae4081e5a871b4c4e2f802ccf73f8d69566073'/>
<id>urn:sha1:f4ae4081e5a871b4c4e2f802ccf73f8d69566073</id>
<content type='text'>
The interface for fcntl expects the argument passed for the command
F_DIRNOTIFY to be of type int. The current code wrongly treats it as
a long. In order to avoid access to undefined bits, we should explicitly
cast the argument to int.

Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Amir Goldstein &lt;amir73il@gmail.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Cc: Kevin Brodsky &lt;Kevin.Brodsky@arm.com&gt;
Cc: Vincenzo Frascino &lt;Vincenzo.Frascino@arm.com&gt;
Cc: Szabolcs Nagy &lt;Szabolcs.Nagy@arm.com&gt;
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: David Laight &lt;David.Laight@ACULAB.com&gt;
Cc: Mark Rutland &lt;Mark.Rutland@arm.com&gt;
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-morello@op-lists.linaro.org
Acked-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Luca Vizzarro &lt;Luca.Vizzarro@arm.com&gt;
Message-Id: &lt;20230414152459.816046-6-Luca.Vizzarro@arm.com&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>dnotify: use fsnotify group lock helpers</title>
<updated>2022-04-25T12:37:34+00:00</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2022-04-22T12:03:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=aabb45fdcb31f00f1e7cae2bce83e83474a87c03'/>
<id>urn:sha1:aabb45fdcb31f00f1e7cae2bce83e83474a87c03</id>
<content type='text'>
Before commit 9542e6a643fc6 ("nfsd: Containerise filecache laundrette")
nfsd would close open files in direct reclaim context.  There is no
guarantee that others memory shrinkers don't do the same and no
guarantee that future shrinkers won't do that.

For example, if overlayfs implements inode cache of fscache would
keep open files to cached objects, inode shrinkers could end up closing
open files to underlying fs.

Direct reclaim from dnotify mark allocation context may try to close
open files that have dnotify marks of the same group and hit a deadlock
on mark_mutex.

Set the FSNOTIFY_GROUP_NOFS flag to prevent going into direct reclaim
from allocations under dnotify group lock and use the safe group lock
helpers.

Link: https://lore.kernel.org/r/20220422120327.3459282-11-amir73il@gmail.com
Suggested-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20220321112310.vpr7oxro2xkz5llh@quack3.lan/
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
</feed>
