<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/nfs/flexfilelayout, branch v6.6.131</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.131</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.131'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-01-30T09:27:22+00:00</updated>
<entry>
<title>pnfs/flexfiles: Fix memory leak in nfs4_ff_alloc_deviceid_node()</title>
<updated>2026-01-30T09:27:22+00:00</updated>
<author>
<name>Zilin Guan</name>
<email>zilin@seu.edu.cn</email>
</author>
<published>2025-12-25T07:41:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=869862056e100973e76ce9f5f1b01837771b7722'/>
<id>urn:sha1:869862056e100973e76ce9f5f1b01837771b7722</id>
<content type='text'>
[ Upstream commit 0c728083654f0066f5e10a1d2b0bd0907af19a58 ]

In nfs4_ff_alloc_deviceid_node(), if the allocation for ds_versions fails,
the function jumps to the out_scratch label without freeing the already
allocated dsaddrs list, leading to a memory leak.

Fix this by jumping to the out_err_drain_dsaddrs label, which properly
frees the dsaddrs list before cleaning up other resources.

Fixes: d67ae825a59d6 ("pnfs/flexfiles: Add the FlexFile Layout Driver")
Signed-off-by: Zilin Guan &lt;zilin@seu.edu.cn&gt;
Signed-off-by: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>NFSv4/flexfiles: Fix layout merge mirror check.</title>
<updated>2025-09-19T14:32:02+00:00</updated>
<author>
<name>Jonathan Curley</name>
<email>jcurley@purestorage.com</email>
</author>
<published>2025-09-08T17:35:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=69d99a936f00fe598affef676b086b17d527fb7d'/>
<id>urn:sha1:69d99a936f00fe598affef676b086b17d527fb7d</id>
<content type='text'>
[ Upstream commit dd2fa82473453661d12723c46c9f43d9876a7efd ]

Typo in ff_lseg_match_mirrors makes the diff ineffective. This results
in merge happening all the time. Merge happening all the time is
problematic because it marks lsegs invalid. Marking lsegs invalid
causes all outstanding IO to get restarted with EAGAIN and connections
to get closed.

Closing connections constantly triggers race conditions in the RDMA
implementation...

Fixes: 660d1eb22301c ("pNFS/flexfile: Don't merge layout segments if the mirrors don't match")
Signed-off-by: Jonathan Curley &lt;jcurley@purestorage.com&gt;
Signed-off-by: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>flexfiles/pNFS: fix NULL checks on result of ff_layout_choose_ds_for_read</title>
<updated>2025-09-19T14:32:01+00:00</updated>
<author>
<name>Tigran Mkrtchyan</name>
<email>tigran.mkrtchyan@desy.de</email>
</author>
<published>2025-08-28T14:51:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fd00ae04c5aa059e801346ace88489d104ef3675'/>
<id>urn:sha1:fd00ae04c5aa059e801346ace88489d104ef3675</id>
<content type='text'>
[ Upstream commit 5a46d2339a5ae268ede53a221f20433d8ea4f2f9 ]

Recent commit f06bedfa62d5 ("pNFS/flexfiles: don't attempt pnfs on fatal DS
errors") has changed the error return type of ff_layout_choose_ds_for_read() from
NULL to an error pointer. However, not all code paths have been updated
to match the change. Thus, some non-NULL checks will accept error pointers
as a valid return value.

Reported-by: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Suggested-by: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Fixes: f06bedfa62d5 ("pNFS/flexfiles: don't attempt pnfs on fatal DS errors")
Signed-off-by: Tigran Mkrtchyan &lt;tigran.mkrtchyan@desy.de&gt;
Signed-off-by: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>pNFS/flexfiles: don't attempt pnfs on fatal DS errors</title>
<updated>2025-08-15T10:09:01+00:00</updated>
<author>
<name>Tigran Mkrtchyan</name>
<email>tigran.mkrtchyan@desy.de</email>
</author>
<published>2025-06-27T07:17:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3bdb29df2f0d3f66805240b5ed749b7871928d94'/>
<id>urn:sha1:3bdb29df2f0d3f66805240b5ed749b7871928d94</id>
<content type='text'>
[ Upstream commit f06bedfa62d57f7b67d44aacd6badad2e13a803f ]

When an applications get killed (SIGTERM/SIGINT) while pNFS client performs a connection
to DS, client ends in an infinite loop of connect-disconnect. This
source of the issue, it that flexfilelayoutdev#nfs4_ff_layout_prepare_ds gets an error
on nfs4_pnfs_ds_connect with status ERESTARTSYS, which is set by rpc_signal_task, but
the error is treated as transient, thus retried.

The issue is reproducible with Ctrl+C the following script(there should be ~1000 files in
a directory, client should must not have any connections to DSes):

```
echo 3 &gt; /proc/sys/vm/drop_caches

for i in *
do
   head -1 $i
done
```

The change aims to propagate the nfs4_ff_layout_prepare_ds error state
to the caller that can decide whatever this is a retryable error or not.

Signed-off-by: Tigran Mkrtchyan &lt;tigran.mkrtchyan@desy.de&gt;
Link: https://lore.kernel.org/r/20250627071751.189663-1-tigran.mkrtchyan@desy.de
Fixes: 260f32adb88d ("pNFS/flexfiles: Check the result of nfs4_pnfs_ds_connect")
Signed-off-by: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>NFSv4/flexfiles: Fix handling of NFS level errors in I/O</title>
<updated>2025-07-10T14:03:18+00:00</updated>
<author>
<name>Trond Myklebust</name>
<email>trond.myklebust@hammerspace.com</email>
</author>
<published>2025-06-19T19:16:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=204bdc7a8b7b7b7fcfc500f1402762a1f99e00bc'/>
<id>urn:sha1:204bdc7a8b7b7b7fcfc500f1402762a1f99e00bc</id>
<content type='text'>
[ Upstream commit 38074de35b015df5623f524d6f2b49a0cd395c40 ]

Allow the flexfiles error handling to recognise NFS level errors (as
opposed to RPC level errors) and handle them separately. The main
motivator is the NFSERR_PERM errors that get returned if the NFS client
connects to the data server through a port number that is lower than
1024. In that case, the client should disconnect and retry a READ on a
different data server, or it should retry a WRITE after reconnecting.

Reviewed-by: Tigran Mkrtchyan &lt;tigran.mkrtchyan@desy.de&gt;
Fixes: d67ae825a59d ("pnfs/flexfiles: Add the FlexFile Layout Driver")
Signed-off-by: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Signed-off-by: Anna Schumaker &lt;anna.schumaker@oracle.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>nfs: don't share pNFS DS connections between net namespaces</title>
<updated>2025-06-04T12:42:25+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@kernel.org</email>
</author>
<published>2025-04-10T20:42:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=764f8cd8aadcd98e7fd1112f81e8c7507a153cd5'/>
<id>urn:sha1:764f8cd8aadcd98e7fd1112f81e8c7507a153cd5</id>
<content type='text'>
[ Upstream commit 6b9785dc8b13d9fb75ceec8cf4ea7ec3f3b1edbc ]

Currently, different NFS clients can share the same DS connections, even
when they are in different net namespaces. If a containerized client
creates a DS connection, another container can find and use it. When the
first client exits, the connection will close which can lead to stalls
in other clients.

Add a net namespace pointer to struct nfs4_pnfs_ds, and compare those
value to the caller's netns in _data_server_lookup_locked() when
searching for a nfs4_pnfs_ds to match.

Reported-by: Omar Sandoval &lt;osandov@osandov.com&gt;
Reported-by: Sargun Dillon &lt;sargun@sargun.me&gt;
Closes: https://lore.kernel.org/linux-nfs/Z_ArpQC_vREh_hEA@telecaster/
Tested-by: Sargun Dillon &lt;sargun@sargun.me&gt;
Signed-off-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Reviewed-by: Benjamin Coddington &lt;bcodding@redhat.com&gt;
Link: https://lore.kernel.org/r/20250410-nfs-ds-netns-v2-1-f80b7979ba80@kernel.org
Signed-off-by: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>pNFS/flexfiles: Report ENETDOWN as a connection error</title>
<updated>2025-06-04T12:41:56+00:00</updated>
<author>
<name>Trond Myklebust</name>
<email>trond.myklebust@hammerspace.com</email>
</author>
<published>2025-03-20T16:45:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=aa59ccacf8f339312c6395b2257239cfcc45985f'/>
<id>urn:sha1:aa59ccacf8f339312c6395b2257239cfcc45985f</id>
<content type='text'>
[ Upstream commit aa42add73ce9b9e3714723d385c254b75814e335 ]

If the client should see an ENETDOWN when trying to connect to the data
server, it might still be able to talk to the metadata server through
another NIC. If so, report the error.

Signed-off-by: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Tested-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Acked-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>pnfs/flexfiles: retry getting layout segment for reads</title>
<updated>2025-02-17T08:40:37+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@kernel.org</email>
</author>
<published>2025-01-17T01:05:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e5d289c972ebb336a303204f5b400aa48420fc9b'/>
<id>urn:sha1:e5d289c972ebb336a303204f5b400aa48420fc9b</id>
<content type='text'>
commit eb3fabde15bccdf34f1c9b35a83aa4c0dacbb4ca upstream.

If ff_layout_pg_get_read()'s attempt to get a layout segment results
in -EAGAIN have ff_layout_pg_init_read() retry it after sleeping.

If "softerr" mount is used, use 'io_maxretrans' to limit the number of
attempts to get a layout segment.

This fixes a long-standing issue of O_DIRECT reads failing with
-EAGAIN (11) when using flexfiles Client Side Mirroring (CSM).

Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Anna Schumaker &lt;anna.schumaker@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>nfs: fix panic when nfs4_ff_layout_prepare_ds() fails</title>
<updated>2024-03-26T22:20:04+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2024-03-11T15:11:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7ca651b4ec4a049f5a46a0e5ff921b86b91c47c5'/>
<id>urn:sha1:7ca651b4ec4a049f5a46a0e5ff921b86b91c47c5</id>
<content type='text'>
[ Upstream commit 719fcafe07c12646691bd62d7f8d94d657fa0766 ]

We've been seeing the following panic in production

BUG: kernel NULL pointer dereference, address: 0000000000000065
PGD 2f485f067 P4D 2f485f067 PUD 2cc5d8067 PMD 0
RIP: 0010:ff_layout_cancel_io+0x3a/0x90 [nfs_layout_flexfiles]
Call Trace:
 &lt;TASK&gt;
 ? __die+0x78/0xc0
 ? page_fault_oops+0x286/0x380
 ? __rpc_execute+0x2c3/0x470 [sunrpc]
 ? rpc_new_task+0x42/0x1c0 [sunrpc]
 ? exc_page_fault+0x5d/0x110
 ? asm_exc_page_fault+0x22/0x30
 ? ff_layout_free_layoutreturn+0x110/0x110 [nfs_layout_flexfiles]
 ? ff_layout_cancel_io+0x3a/0x90 [nfs_layout_flexfiles]
 ? ff_layout_cancel_io+0x6f/0x90 [nfs_layout_flexfiles]
 pnfs_mark_matching_lsegs_return+0x1b0/0x360 [nfsv4]
 pnfs_error_mark_layout_for_return+0x9e/0x110 [nfsv4]
 ? ff_layout_send_layouterror+0x50/0x160 [nfs_layout_flexfiles]
 nfs4_ff_layout_prepare_ds+0x11f/0x290 [nfs_layout_flexfiles]
 ff_layout_pg_init_write+0xf0/0x1f0 [nfs_layout_flexfiles]
 __nfs_pageio_add_request+0x154/0x6c0 [nfs]
 nfs_pageio_add_request+0x26b/0x380 [nfs]
 nfs_do_writepage+0x111/0x1e0 [nfs]
 nfs_writepages_callback+0xf/0x30 [nfs]
 write_cache_pages+0x17f/0x380
 ? nfs_pageio_init_write+0x50/0x50 [nfs]
 ? nfs_writepages+0x6d/0x210 [nfs]
 ? nfs_writepages+0x6d/0x210 [nfs]
 nfs_writepages+0x125/0x210 [nfs]
 do_writepages+0x67/0x220
 ? generic_perform_write+0x14b/0x210
 filemap_fdatawrite_wbc+0x5b/0x80
 file_write_and_wait_range+0x6d/0xc0
 nfs_file_fsync+0x81/0x170 [nfs]
 ? nfs_file_mmap+0x60/0x60 [nfs]
 __x64_sys_fsync+0x53/0x90
 do_syscall_64+0x3d/0x90
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

Inspecting the core with drgn I was able to pull this

  &gt;&gt;&gt; prog.crashed_thread().stack_trace()[0]
  #0 at 0xffffffffa079657a (ff_layout_cancel_io+0x3a/0x84) in ff_layout_cancel_io at fs/nfs/flexfilelayout/flexfilelayout.c:2021:27
  &gt;&gt;&gt; prog.crashed_thread().stack_trace()[0]['idx']
  (u32)1
  &gt;&gt;&gt; prog.crashed_thread().stack_trace()[0]['flseg'].mirror_array[1].mirror_ds
  (struct nfs4_ff_layout_ds *)0xffffffffffffffed

This is clear from the stack trace, we call nfs4_ff_layout_prepare_ds()
which could error out initializing the mirror_ds, and then we go to
clean it all up and our check is only for if (!mirror-&gt;mirror_ds).  This
is inconsistent with the rest of the users of mirror_ds, which have

  if (IS_ERR_OR_NULL(mirror_ds))

to keep from tripping over this exact scenario.  Fix this up in
ff_layout_cancel_io() to make sure we don't panic when we get an error.
I also spot checked all the other instances of checking mirror_ds and we
appear to be doing the correct checks everywhere, only unconditionally
dereferencing mirror_ds when we know it would be valid.

Signed-off-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Fixes: b739a5bd9d9f ("NFSv4/flexfiles: Cancel I/O if the layout is recalled or revoked")
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>pNFS/flexfiles: Check the layout validity in ff_layout_mirror_prepare_stats</title>
<updated>2023-10-18T18:42:09+00:00</updated>
<author>
<name>Trond Myklebust</name>
<email>trond.myklebust@hammerspace.com</email>
</author>
<published>2023-10-08T18:28:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e1c6cfbb3bd1377e2ddcbe06cf8fb1ec323ea7d3'/>
<id>urn:sha1:e1c6cfbb3bd1377e2ddcbe06cf8fb1ec323ea7d3</id>
<content type='text'>
Ensure that we check the layout pointer and validity after dereferencing
it in ff_layout_mirror_prepare_stats.

Fixes: 08e2e5bc6c9a ("pNFS/flexfiles: Clean up layoutstats")
Signed-off-by: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Signed-off-by: Anna Schumaker &lt;Anna.Schumaker@Netapp.com&gt;
</content>
</entry>
</feed>
