<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/netfs/write_issue.c, branch v6.18.21</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-04-02T11:23:33+00:00</updated>
<entry>
<title>netfs: Fix the handling of stream-&gt;front by removing it</title>
<updated>2026-04-02T11:23:33+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2026-03-25T08:20:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f0035858dfb2335171b897fe7653fa8e8928cc04'/>
<id>urn:sha1:f0035858dfb2335171b897fe7653fa8e8928cc04</id>
<content type='text'>
[ Upstream commit 0e764b9d46071668969410ec5429be0e2f38c6d3 ]

The netfs_io_stream::front member is meant to point to the subrequest
currently being collected on a stream, but it isn't actually used this way
by direct write (which mostly ignores it).  However, there's a tracepoint
which looks at it.  Further, stream-&gt;front is actually redundant with
stream-&gt;subrequests.next.

Fix the potential problem in the direct code by just removing the member
and using stream-&gt;subrequests.next instead, thereby also simplifying the
code.

Fixes: a0b4c7a49137 ("netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequence")
Reported-by: Paulo Alcantara &lt;pc@manguebit.org&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://patch.msgid.link/4158599.1774426817@warthog.procyon.org.uk
Reviewed-by: Paulo Alcantara (Red Hat) &lt;pc@manguebit.org&gt;
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequence</title>
<updated>2026-03-12T11:09:49+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2026-02-26T13:32:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=72d08d2839649d1c5efbe375751f4473fa4486af'/>
<id>urn:sha1:72d08d2839649d1c5efbe375751f4473fa4486af</id>
<content type='text'>
[ Upstream commit a0b4c7a49137ed21279f354eb59f49ddae8dffc2 ]

Fix netfslib such that when it's making an unbuffered or DIO write, to make
sure that it sends each subrequest strictly sequentially, waiting till the
previous one is 'committed' before sending the next so that we don't have
pieces landing out of order and potentially leaving a hole if an error
occurs (ENOSPC for example).

This is done by copying in just those bits of issuing, collecting and
retrying subrequests that are necessary to do one subrequest at a time.
Retrying, in particular, is simpler because if the current subrequest needs
retrying, the source iterator can just be copied again and the subrequest
prepped and issued again without needing to be concerned about whether it
needs merging with the previous or next in the sequence.

Note that the issuing loop waits for a subrequest to complete right after
issuing it, but this wait could be moved elsewhere allowing preparatory
steps to be performed whilst the subrequest is in progress.  In particular,
once content encryption is available in netfslib, that could be done whilst
waiting, as could cleanup of buffers that have been completed.

Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support")
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://patch.msgid.link/58526.1772112753@warthog.procyon.org.uk
Tested-by: Steve French &lt;sfrench@samba.org&gt;
Reviewed-by: Paulo Alcantara (Red Hat) &lt;pc@manguebit.org&gt;
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>netfs: when subreq is marked for retry, do not check if it faced an error</title>
<updated>2026-03-04T12:19:32+00:00</updated>
<author>
<name>Shyam Prasad N</name>
<email>sprasad@microsoft.com</email>
</author>
<published>2026-01-31T08:33:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=be0d85bc660afe93ebe12ecea7f3d7af375d8a6c'/>
<id>urn:sha1:be0d85bc660afe93ebe12ecea7f3d7af375d8a6c</id>
<content type='text'>
[ Upstream commit 82e8885bd7633a36ee9050e6d7f348a4155eed5f ]

The *_subreq_terminated functions today only process the NEED_RETRY
flag when the subreq was successful or failed with EAGAIN error.
However, there could be other retriable errors for network filesystems.

Avoid this by processing the NEED_RETRY irrespective of the error
code faced by the subreq. If it was specifically marked for retry,
the error code must not matter.

Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Shyam Prasad N &lt;sprasad@microsoft.com&gt;
Signed-off-by: Steve French &lt;stfrench@microsoft.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>netfs: fix reference leak</title>
<updated>2025-09-26T08:14:19+00:00</updated>
<author>
<name>Max Kellermann</name>
<email>max.kellermann@ionos.com</email>
</author>
<published>2025-09-25T13:08:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4d428dca252c858bfac691c31fa95d26cd008706'/>
<id>urn:sha1:4d428dca252c858bfac691c31fa95d26cd008706</id>
<content type='text'>
Commit 20d72b00ca81 ("netfs: Fix the request's work item to not
require a ref") modified netfs_alloc_request() to initialize the
reference counter to 2 instead of 1.  The rationale was that the
requet's "work" would release the second reference after completion
(via netfs_{read,write}_collection_worker()).  That works most of the
time if all goes well.

However, it leaks this additional reference if the request is released
before the I/O operation has been submitted: the error code path only
decrements the reference counter once and the work item will never be
queued because there will never be a completion.

This has caused outages of our whole server cluster today because
tasks were blocked in netfs_wait_for_outstanding_io(), leading to
deadlocks in Ceph (another bug that I will address soon in another
patch).  This was caused by a netfs_pgpriv2_begin_copy_to_cache() call
which failed in fscache_begin_write_operation().  The leaked
netfs_io_request was never completed, leaving `netfs_inode.io_count`
with a positive value forever.

All of this is super-fragile code.  Finding out which code paths will
lead to an eventual completion and which do not is hard to see:

- Some functions like netfs_create_write_req() allocate a request, but
  will never submit any I/O.

- netfs_unbuffered_read_iter_locked() calls netfs_unbuffered_read()
  and then netfs_put_request(); however, netfs_unbuffered_read() can
  also fail early before submitting the I/O request, therefore another
  netfs_put_request() call must be added there.

A rule of thumb is that functions that return a `netfs_io_request` do
not submit I/O, and all of their callers must be checked.

For my taste, the whole netfs code needs an overhaul to make reference
counting easier to understand and less fragile &amp; obscure.  But to fix
this bug here and now and produce a patch that is adequate for a
stable backport, I tried a minimal approach that quickly frees the
request object upon early failure.

I decided against adding a second netfs_put_request() each time
because that would cause code duplication which obscures the code
further.  Instead, I added the function netfs_put_failed_request()
which frees such a failed request synchronously under the assumption
that the reference count is exactly 2 (as initially set by
netfs_alloc_request() and never touched), verified by a
WARN_ON_ONCE().  It then deinitializes the request object (without
going through the "cleanup_work" indirection) and frees the allocation
(with RCU protection to protect against concurrent access by
netfs_requests_seq_start()).

All code paths that fail early have been changed to call
netfs_put_failed_request() instead of netfs_put_request().
Additionally, I have added a netfs_put_request() call to
netfs_unbuffered_read() as explained above because the
netfs_put_failed_request() approach does not work there.

Fixes: 20d72b00ca81 ("netfs: Fix the request's work item to not require a ref")
Signed-off-by: Max Kellermann &lt;max.kellermann@ionos.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
cc: Paulo Alcantara &lt;pc@manguebit.org&gt;
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>netfs: Fix unbuffered write error handling</title>
<updated>2025-08-15T13:56:49+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2025-08-14T21:45:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a3de58b12ce074ec05b8741fa28d62ccb1070468'/>
<id>urn:sha1:a3de58b12ce074ec05b8741fa28d62ccb1070468</id>
<content type='text'>
If all the subrequests in an unbuffered write stream fail, the subrequest
collector doesn't update the stream-&gt;transferred value and it retains its
initial LONG_MAX value.  Unfortunately, if all active streams fail, then we
take the smallest value of { LONG_MAX, LONG_MAX, ... } as the value to set
in wreq-&gt;transferred - which is then returned from -&gt;write_iter().

LONG_MAX was chosen as the initial value so that all the streams can be
quickly assessed by taking the smallest value of all stream-&gt;transferred -
but this only works if we've set any of them.

Fix this by adding a flag to indicate whether the value in
stream-&gt;transferred is valid and checking that when we integrate the
values.  stream-&gt;transferred can then be initialised to zero.

This was found by running the generic/750 xfstest against cifs with
cache=none.  It splices data to the target file.  Once (if) it has used up
all the available scratch space, the writes start failing with ENOSPC.
This causes -&gt;write_iter() to fail.  However, it was returning
wreq-&gt;transferred, i.e. LONG_MAX, rather than an error (because it thought
the amount transferred was non-zero) and iter_file_splice_write() would
then try to clean up that amount of pipe bufferage - leading to an oops
when it overran.  The kernel log showed:

    CIFS: VFS: Send error in write = -28

followed by:

    BUG: kernel NULL pointer dereference, address: 0000000000000008

with:

    RIP: 0010:iter_file_splice_write+0x3a4/0x520
    do_splice+0x197/0x4e0

or:

    RIP: 0010:pipe_buf_release (include/linux/pipe_fs_i.h:282)
    iter_file_splice_write (fs/splice.c:755)

Also put a warning check into splice to announce if -&gt;write_iter() returned
that it had written more than it was asked to.

Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
Reported-by: Xiaoli Feng &lt;fengxiaoli0714@gmail.com&gt;
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220445
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://lore.kernel.org/915443.1755207950@warthog.procyon.org.uk
cc: Paulo Alcantara &lt;pc@manguebit.org&gt;
cc: Steve French &lt;sfrench@samba.org&gt;
cc: Shyam Prasad N &lt;sprasad@microsoft.com&gt;
cc: netfs@lists.linux.dev
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>netfs: Fix wait/wake to be consistent about the waitqueue used</title>
<updated>2025-05-21T12:35:21+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2025-05-19T09:07:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2b1424cd131cfaba4cf7040473133d26cddac088'/>
<id>urn:sha1:2b1424cd131cfaba4cf7040473133d26cddac088</id>
<content type='text'>
Fix further inconsistencies in the use of waitqueues
(clear_and_wake_up_bit() vs private waitqueue).

Move some of this stuff from the read and write sides into common code so
that it can be done in fewer places.

To make this work, async I/O needs to set NETFS_RREQ_OFFLOAD_COLLECTION to
indicate that a workqueue will do the collecting and places that call the
wait function need to deal with it returning the amount transferred.

Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item")
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://lore.kernel.org/20250519090707.2848510-5-dhowells@redhat.com
cc: Marc Dionne &lt;marc.dionne@auristor.com&gt;
cc: Steve French &lt;stfrench@microsoft.com&gt;
cc: Ihor Solodrai &lt;ihor.solodrai@pm.me&gt;
cc: Eric Van Hensbergen &lt;ericvh@kernel.org&gt;
cc: Latchesar Ionkov &lt;lucho@ionkov.net&gt;
cc: Dominique Martinet &lt;asmadeus@codewreck.org&gt;
cc: Christian Schoenebeck &lt;linux_oss@crudebyte.com&gt;
cc: Paulo Alcantara &lt;pc@manguebit.com&gt;
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: v9fs@lists.linux.dev
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>netfs: Fix the request's work item to not require a ref</title>
<updated>2025-05-21T12:35:20+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2025-05-19T09:07:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=20d72b00ca814d748f5663484e5c53bb2bf37a3a'/>
<id>urn:sha1:20d72b00ca814d748f5663484e5c53bb2bf37a3a</id>
<content type='text'>
When the netfs_io_request struct's work item is queued, it must be supplied
with a ref to the work item struct to prevent it being deallocated whilst
on the queue or whilst it is being processed.  This is tricky to manage as
we have to get a ref before we try and queue it and then we may find it's
already queued and is thus already holding a ref - in which case we have to
try and get rid of the ref again.

The problem comes if we're in BH or IRQ context and need to drop the ref:
if netfs_put_request() reduces the count to 0, we have to do the cleanup -
but the cleanup may need to wait.

Fix this by adding a new work item to the request, -&gt;cleanup_work, and
dispatching that when the refcount hits zero.  That can then synchronously
cancel any outstanding work on the main work item before doing the cleanup.

Adding a new work item also deals with another problem upstream where it's
sometimes changing the work func in the put function and requeuing it -
which has occasionally in the past caused the cleanup to happen
incorrectly.

As a bonus, this allows us to get rid of the 'was_async' parameter from a
bunch of functions.  This indicated whether the put function might not be
permitted to sleep.

Fixes: 3d3c95046742 ("netfs: Provide readahead and readpage netfs helpers")
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://lore.kernel.org/20250519090707.2848510-4-dhowells@redhat.com
cc: Paulo Alcantara &lt;pc@manguebit.com&gt;
cc: Marc Dionne &lt;marc.dionne@auristor.com&gt;
cc: Steve French &lt;stfrench@microsoft.com&gt;
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>netfs: Add retry stat counters</title>
<updated>2025-02-13T15:00:48+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2025-02-12T22:24:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d01c495f432ce34df8bfd092e71720a2cf169a90'/>
<id>urn:sha1:d01c495f432ce34df8bfd092e71720a2cf169a90</id>
<content type='text'>
Add stat counters to count the number of request and subrequest retries and
display them in /proc/fs/netfs/stats.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://lore.kernel.org/r/20250212222402.3618494-3-dhowells@redhat.com
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>netfs: Change the read result collector to only use one work item</title>
<updated>2024-12-20T21:34:08+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2024-12-16T20:41:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e2d46f2ec332533816417b60933954173f602121'/>
<id>urn:sha1:e2d46f2ec332533816417b60933954173f602121</id>
<content type='text'>
Change the way netfslib collects read results to do all the collection for
a particular read request using a single work item that walks along the
subrequest queue as subrequests make progress or complete, unlocking folios
progressively rather than doing the unlock in parallel as parallel requests
come in.

The code is remodelled to be more like the write-side code, though only
using a single stream.  This makes it more directly comparable and thus
easier to duplicate fixes between the two sides.

This has a number of advantages:

 (1) It's simpler.  There doesn't need to be a complex donation mechanism
     to handle mismatches between the size and alignment of subrequests and
     folios.  The collector unlocks folios as the subrequests covering each
     complete.

 (2) It should cause less scheduler overhead as there's a single work item
     in play unlocking pages in parallel when a read gets split up into a
     lot of subrequests instead of one per subrequest.

     Whilst the parallellism is nice in theory, in practice, the vast
     majority of loads are sequential reads of the whole file, so
     committing a bunch of threads to unlocking folios out of order doesn't
     help in those cases.

 (3) It should make it easier to implement content decryption.  A folio
     cannot be decrypted until all the requests that contribute to it have
     completed - and, again, most loads are sequential and so, most of the
     time, we want to begin decryption sequentially (though it's great if
     the decryption can happen in parallel).

There is a disadvantage in that we're losing the ability to decrypt and
unlock things on an as-things-arrive basis which may affect some
applications.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://lore.kernel.org/r/20241216204124.3752367-28-dhowells@redhat.com
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>netfs: Add support for caching single monolithic objects such as AFS dirs</title>
<updated>2024-12-20T21:34:06+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2024-12-16T20:41:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=49866ce7ea8d41a3dc198f519cc9caa2d6be1891'/>
<id>urn:sha1:49866ce7ea8d41a3dc198f519cc9caa2d6be1891</id>
<content type='text'>
Add support for caching the content of a file that contains a single
monolithic object that must be read/written with a single I/O operation,
such as an AFS directory.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://lore.kernel.org/r/20241216204124.3752367-20-dhowells@redhat.com
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: Marc Dionne &lt;marc.dionne@auristor.com&gt;
cc: netfs@lists.linux.dev
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
</feed>
