<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/nfs_fs.h, branch v6.6.131</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.131</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.131'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-25T10:05:59+00:00</updated>
<entry>
<title>NFS: Fix a deadlock involving nfs_release_folio()</title>
<updated>2026-03-25T10:05:59+00:00</updated>
<author>
<name>Trond Myklebust</name>
<email>trond.myklebust@hammerspace.com</email>
</author>
<published>2026-02-24T07:02:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a4810f8beb0122f032f10735f98d257aa6064f4c'/>
<id>urn:sha1:a4810f8beb0122f032f10735f98d257aa6064f4c</id>
<content type='text'>
[ Upstream commit cce0be6eb4971456b703aaeafd571650d314bcca ]

Wang Zhaolong reports a deadlock involving NFSv4.1 state recovery
waiting on kthreadd, which is attempting to reclaim memory by calling
nfs_release_folio(). The latter cannot make progress due to state
recovery being needed.

It seems that the only safe thing to do here is to kick off a writeback
of the folio, without waiting for completion, or else kicking off an
asynchronous commit.

Reported-by: Wang Zhaolong &lt;wangzhaolong@huaweicloud.com&gt;
Fixes: 96780ca55e3c ("NFS: fix up nfs_release_folio() to try to release the page")
Signed-off-by: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
[ Minor conflict resolved. ]
Signed-off-by: Li hongliang &lt;1468888505@139.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>nfs: fix UAF in direct writes</title>
<updated>2024-04-03T13:28:29+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2024-03-01T16:49:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e25447c35f8745337ea8bc0c9697fcac14df8605'/>
<id>urn:sha1:e25447c35f8745337ea8bc0c9697fcac14df8605</id>
<content type='text'>
[ Upstream commit 17f46b803d4f23c66cacce81db35fef3adb8f2af ]

In production we have been hitting the following warning consistently

------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 17 PID: 1800359 at lib/refcount.c:28 refcount_warn_saturate+0x9c/0xe0
Workqueue: nfsiod nfs_direct_write_schedule_work [nfs]
RIP: 0010:refcount_warn_saturate+0x9c/0xe0
PKRU: 55555554
Call Trace:
 &lt;TASK&gt;
 ? __warn+0x9f/0x130
 ? refcount_warn_saturate+0x9c/0xe0
 ? report_bug+0xcc/0x150
 ? handle_bug+0x3d/0x70
 ? exc_invalid_op+0x16/0x40
 ? asm_exc_invalid_op+0x16/0x20
 ? refcount_warn_saturate+0x9c/0xe0
 nfs_direct_write_schedule_work+0x237/0x250 [nfs]
 process_one_work+0x12f/0x4a0
 worker_thread+0x14e/0x3b0
 ? ZSTD_getCParams_internal+0x220/0x220
 kthread+0xdc/0x120
 ? __btf_name_valid+0xa0/0xa0
 ret_from_fork+0x1f/0x30

This is because we're completing the nfs_direct_request twice in a row.

The source of this is when we have our commit requests to submit, we
process them and send them off, and then in the completion path for the
commit requests we have

if (nfs_commit_end(cinfo.mds))
	nfs_direct_write_complete(dreq);

However since we're submitting asynchronous requests we sometimes have
one that completes before we submit the next one, so we end up calling
complete on the nfs_direct_request twice.

The only other place we use nfs_generic_commit_list() is in
__nfs_commit_inode, which wraps this call in a

nfs_commit_begin();
nfs_commit_end();

Which is a common pattern for this style of completion handling, one
that is also repeated in the direct code with get_dreq()/put_dreq()
calls around where we process events as well as in the completion paths.

Fix this by using the same pattern for the commit requests.

Before with my 200 node rocksdb stress running this warning would pop
every 10ish minutes.  With my patch the stress test has been running for
several hours without popping.

Signed-off-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>NFSv3: handle out-of-order write replies.</title>
<updated>2023-04-11T20:13:21+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2023-03-21T22:27:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3db63daabe210af32a09533fe7d8d47c711a103c'/>
<id>urn:sha1:3db63daabe210af32a09533fe7d8d47c711a103c</id>
<content type='text'>
NFSv3 includes pre/post wcc attributes which allow the client to
determine if all changes to the file have been made by the client
itself, or if any might have been made by some other client.

If there are gaps in the pre/post ctime sequence it must be assumed that
some other client changed the file in that gap and the local cache must
be suspect.  The next time the file is opened the cache should be
invalidated.

Since Commit 1c341b777501 ("NFS: Add deferred cache invalidation for
close-to-open consistency violations") in linux 5.3 the Linux client has
been triggering this invalidation.  The chunk in nfs_update_inode() in
particularly triggers.

Unfortunately Linux NFS assumes that all replies will be processed in
the order sent, and will arrive in the order processed.  This is not
true in general.  Consequently Linux NFS might ignore the wcc info in a
WRITE reply because the reply is in response to a WRITE that was sent
before some other request for which a reply has already been seen.  This
is detected by Linux using the gencount tests in nfs_inode_attr_cmp().

Also, when the gencount tests pass it is still possible that the request
were processed on the server in a different order, and a gap seen in
the ctime sequence might be filled in by a subsequent reply, so gaps
should not immediately trigger delayed invalidation.

The net result is that writing to a server and then reading the file
back can result in going to the server for the read rather than serving
it from cache - all because a couple of replies arrived out-of-order.
This is a performance regression over kernels before 5.3, though the
change in 5.3 is a correctness improvement.

This has been seen with Linux writing to a Netapp server which
occasionally re-orders requests.  In testing the majority of requests
were in-order, but a few (maybe 2 or three at a time) could be
re-ordered.

This patch addresses the problem by recording any gaps seen in the
pre/post ctime sequence and not triggering invalidation until either
there are too many gaps to fit in the table, or until there are no more
active writes and the remaining gaps cannot be resolved.

We allocate a table of 16 gaps on demand.  If the allocation fails we
revert to current behaviour which is of little cost as we are unlikely
to be able to cache the writes anyway.

In the table we store "start-&gt;end" pair when iversion is updated and
"end&lt;-start" pairs pre/post pairs reported by the server.  Usually these
exactly cancel out and so nothing is stored.  When there are
out-of-order replies we do store gaps and these will eventually be
cancelled against later replies when this client is the only writer.

If the final write is out-of-order there may be one gap remaining when
the file is closed.  This will be noticed and if there is precisely on
gap and if the iversion can be advanced to match it, then we do so.

This patch makes no attempt to handle directories correctly.  The same
problem potentially exists in the out-of-order replies to create/unlink
requests can cause future lookup requires to be sent to the server
unnecessarily.  A similar scheme using the same primitives could be used
to notice and handle out-of-order replies.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Anna Schumaker &lt;Anna.Schumaker@Netapp.com&gt;
</content>
</entry>
<entry>
<title>NFS: Remove fscache specific trace points and NFS_INO_FSCACHE bit</title>
<updated>2023-04-11T17:08:27+00:00</updated>
<author>
<name>Dave Wysochanski</name>
<email>dwysocha@redhat.com</email>
</author>
<published>2023-02-20T13:43:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=03f5bd75a4c19714a929a0f4e7bbf36b49e874c1'/>
<id>urn:sha1:03f5bd75a4c19714a929a0f4e7bbf36b49e874c1</id>
<content type='text'>
The NFS specific trace points are no longer needed as tracing is well
covered by netfs and fscache.

Signed-off-by: Dave Wysochanski &lt;dwysocha@redhat.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Tested-by: Daire Byrne &lt;daire@dneg.com&gt;
Signed-off-by: Anna Schumaker &lt;Anna.Schumaker@Netapp.com&gt;
</content>
</entry>
<entry>
<title>NFS: Configure support for netfs when NFS fscache is configured</title>
<updated>2023-04-11T17:00:02+00:00</updated>
<author>
<name>Dave Wysochanski</name>
<email>dwysocha@redhat.com</email>
</author>
<published>2023-02-20T13:43:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=88a4d7bdeec97890cd543b58dd3588f1f879f51b'/>
<id>urn:sha1:88a4d7bdeec97890cd543b58dd3588f1f879f51b</id>
<content type='text'>
As first steps for support of the netfs library when NFS_FSCACHE is
configured, add NETFS_SUPPORT to Kconfig and add the required netfs_inode
into struct nfs_inode.

Using netfs requires we move the VFS inode structure to be stored
inside struct netfs_inode, along with the fscache_cookie.
Thus, if NFS_FSCACHE is configured, place netfs_inode inside an
anonymous union so the vfs_inode memory is the same and we do
not need to modify other non-fscache areas of NFS.
In addition, inside the NFS fscache code, use the new helpers,
netfs_inode() and netfs_i_cookie() helpers, and remove our own
helper, nfs_i_fscache().

Later patches will convert NFS fscache to fully use netfs.

Signed-off-by: Dave Wysochanski &lt;dwysocha@redhat.com&gt;
Tested-by: Daire Byrne &lt;daire@dneg.com&gt;
Signed-off-by: Anna Schumaker &lt;Anna.Schumaker@Netapp.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'nfs-for-6.3-1' of git://git.linux-nfs.org/projects/anna/linux-nfs</title>
<updated>2023-02-22T22:47:20+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-02-22T22:47:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d8ca6dbb8de7923fcfb18e0b0b123f37c3225519'/>
<id>urn:sha1:d8ca6dbb8de7923fcfb18e0b0b123f37c3225519</id>
<content type='text'>
Pull NFS client updates from Anna Schumaker:
 "New Features:

   - Convert the read and write paths to use folios

  Bugfixes and Cleanups:

   - Fix tracepoint state manager flag printing

   - Fix disabling swap files

   - Fix NFSv4 client identifier sysfs path in the documentation

   - Don't clear NFS_CAP_COPY if server returns NFS4ERR_OFFLOAD_DENIED

   - Treat GETDEVICEINFO errors as a layout failure

   - Replace kmap_atomic() calls with kmap_local_page()

   - Constify sunrpc sysfs kobj_type structures"

* tag 'nfs-for-6.3-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (25 commits)
  fs/nfs: Replace kmap_atomic() with kmap_local_page() in dir.c
  pNFS/filelayout: treat GETDEVICEINFO errors as layout failure
  Documentation: Fix sysfs path for the NFSv4 client identifier
  nfs42: do not fail with EIO if ssc returns NFS4ERR_OFFLOAD_DENIED
  NFS: fix disabling of swap
  SUNRPC: make kobj_type structures constant
  nfs4trace: fix state manager flag printing
  NFS: Remove unnecessary check in nfs_read_folio()
  NFS: Improve tracing of nfs_wb_folio()
  NFS: Enable tracing of nfs_invalidate_folio() and nfs_launder_folio()
  NFS: fix up nfs_release_folio() to try to release the page
  NFS: Clean up O_DIRECT request allocation
  NFS: Fix up nfs_vm_page_mkwrite() for folios
  NFS: Convert nfs_write_begin/end to use folios
  NFS: Remove unused function nfs_wb_page()
  NFS: Convert buffered writes to use folios
  NFS: Convert the function nfs_wb_page() to use folios
  NFS: Convert buffered reads to use folios
  NFS: Add a helper nfs_wb_folio()
  NFS: Convert the remaining pagelist helper functions to support folios
  ...
</content>
</entry>
<entry>
<title>NFS: Remove unused function nfs_wb_page()</title>
<updated>2023-02-14T19:22:32+00:00</updated>
<author>
<name>Trond Myklebust</name>
<email>trond.myklebust@hammerspace.com</email>
</author>
<published>2023-01-19T21:33:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4cbf76948c457f0beb9f184ebb21341c8235846a'/>
<id>urn:sha1:4cbf76948c457f0beb9f184ebb21341c8235846a</id>
<content type='text'>
Signed-off-by: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Signed-off-by: Anna Schumaker &lt;Anna.Schumaker@Netapp.com&gt;
</content>
</entry>
<entry>
<title>NFS: Convert buffered writes to use folios</title>
<updated>2023-02-14T19:22:32+00:00</updated>
<author>
<name>Trond Myklebust</name>
<email>trond.myklebust@hammerspace.com</email>
</author>
<published>2023-01-19T21:33:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0c493b5cf16e28d761b6e77c7c32aa0e7af70813'/>
<id>urn:sha1:0c493b5cf16e28d761b6e77c7c32aa0e7af70813</id>
<content type='text'>
Mostly mechanical conversion of struct page and functions into struct
folio equivalents.
The lack of support for folios in write_cache_pages(), means we still
only support order 0 folio allocations. However the rest of the
writeback code should now be ready for order n &gt; 0.

Signed-off-by: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Signed-off-by: Anna Schumaker &lt;Anna.Schumaker@Netapp.com&gt;
</content>
</entry>
<entry>
<title>NFS: Add a helper nfs_wb_folio()</title>
<updated>2023-02-14T19:22:32+00:00</updated>
<author>
<name>Trond Myklebust</name>
<email>trond.myklebust@hammerspace.com</email>
</author>
<published>2023-01-19T21:33:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4b27232a6e064f3d779cfa76cd251d6023949d22'/>
<id>urn:sha1:4b27232a6e064f3d779cfa76cd251d6023949d22</id>
<content type='text'>
...and use it in nfs_launder_folio().

Signed-off-by: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Signed-off-by: Anna Schumaker &lt;Anna.Schumaker@Netapp.com&gt;
</content>
</entry>
<entry>
<title>fs: port -&gt;permission() to pass mnt_idmap</title>
<updated>2023-01-19T08:24:28+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2023-01-13T11:49:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4609e1f18e19c3b302e1eb4858334bca1532f780'/>
<id>urn:sha1:4609e1f18e19c3b302e1eb4858334bca1532f780</id>
<content type='text'>
Convert to struct mnt_idmap.

Last cycle we merged the necessary infrastructure in
256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.

Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.

Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.

Acked-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Christian Brauner (Microsoft) &lt;brauner@kernel.org&gt;
</content>
</entry>
</feed>
