<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/afs/cell.c, branch v6.6.131</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.131</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.131'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-03-07T15:45:38+00:00</updated>
<entry>
<title>afs: Make it possible to find the volumes that are using a server</title>
<updated>2025-03-07T15:45:38+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2023-11-02T16:08:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0b2bf6fb53530fc5772fc2bcee87eac5ba8c84ea'/>
<id>urn:sha1:0b2bf6fb53530fc5772fc2bcee87eac5ba8c84ea</id>
<content type='text'>
[ Upstream commit ca0e79a46097d54e4af46c67c852479d97af35bb ]

Make it possible to find the afs_volume structs that are using an
afs_server struct to aid in breaking volume callbacks.

The way this is done is that each afs_volume already has an array of
afs_server_entry records that point to the servers where that volume might
be found.  An afs_volume backpointer and a list node is added to each entry
and each entry is then added to an RCU-traversable list on the afs_server
to which it points.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
cc: Marc Dionne &lt;marc.dionne@auristor.com&gt;
cc: linux-afs@lists.infradead.org
Stable-dep-of: add117e48df4 ("afs: Fix the server_list to unuse a displaced server rather than putting it")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>afs: Fix overwriting of result of DNS query</title>
<updated>2024-01-01T12:42:34+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2023-12-21T15:09:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=81fc8dceb78b76dfe9d167e0845144f39b65cb1d'/>
<id>urn:sha1:81fc8dceb78b76dfe9d167e0845144f39b65cb1d</id>
<content type='text'>
[ Upstream commit a9e01ac8c5ff32669119c40dfdc9e80eb0b7d7aa ]

In afs_update_cell(), ret is the result of the DNS lookup and the errors
are to be handled by a switch - however, the value gets clobbered in
between by setting it to -ENOMEM in case afs_alloc_vlserver_list()
fails.

Fix this by moving the setting of -ENOMEM into the error handling for
OOM failure.  Further, only do it if we don't have an alternative error
to return.

Found by Linux Verification Center (linuxtesting.org) with SVACE.  Based
on a patch from Anastasia Belova [1].

Fixes: d5c32c89b208 ("afs: Fix cell DNS lookup")
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Reviewed-by: Jeffrey Altman &lt;jaltman@auristor.com&gt;
cc: Anastasia Belova &lt;abelova@astralinux.ru&gt;
cc: Marc Dionne &lt;marc.dionne@auristor.com&gt;
cc: linux-afs@lists.infradead.org
cc: lvc-project@linuxtesting.org
Link: https://lore.kernel.org/r/20231221085849.1463-1-abelova@astralinux.ru/ [1]
Link: https://lore.kernel.org/r/1700862.1703168632@warthog.procyon.org.uk/ # v1
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>afs: Use refcount_t rather than atomic_t</title>
<updated>2022-08-02T17:10:11+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2022-07-06T09:52:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c56f9ec8b20f931014574b943590c4d830109380'/>
<id>urn:sha1:c56f9ec8b20f931014574b943590c4d830109380</id>
<content type='text'>
Use refcount_t rather than atomic_t in afs to make use of the count
checking facilities provided.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Reviewed-by: Marc Dionne &lt;marc.dionne@auristor.com&gt;
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/165911277768.3745403.423349776836296452.stgit@warthog.procyon.org.uk/ # v1
</content>
</entry>
<entry>
<title>afs: Convert afs to use the new fscache API</title>
<updated>2022-01-07T13:44:47+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2020-02-06T14:22:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=523d27cda149cfc8c99159c0f68f690e4d3b9d14'/>
<id>urn:sha1:523d27cda149cfc8c99159c0f68f690e4d3b9d14</id>
<content type='text'>
Change the afs filesystem to support the new afs driver.

The following changes have been made:

 (1) The fscache_netfs struct is no more, and there's no need to register
     the filesystem as a whole.  There's also no longer a cell cookie.

 (2) The volume cookie is now an fscache_volume cookie, allocated with
     fscache_acquire_volume().  This function takes three parameters: a
     string representing the "volume" in the index, a string naming the
     cache to use (or NULL) and a u64 that conveys coherency metadata for
     the volume.

     For afs, I've made it render the volume name string as:

        "afs,&lt;cell&gt;,&lt;volume_id&gt;"

     and the coherency data is currently 0.

 (3) The fscache_cookie_def is no more and needed information is passed
     directly to fscache_acquire_cookie().  The cache no longer calls back
     into the filesystem, but rather metadata changes are indicated at
     other times.

     fscache_acquire_cookie() is passed the same keying and coherency
     information as before, except that these are now stored in big endian
     form instead of cpu endian.  This makes the cache more copyable.

 (4) fscache_use_cookie() and fscache_unuse_cookie() are called when a file
     is opened or closed to prevent a cache file from being culled and to
     keep resources to hand that are needed to do I/O.

     fscache_use_cookie() is given an indication if the cache is likely to
     be modified locally (e.g. the file is open for writing).

     fscache_unuse_cookie() is given a coherency update if we had the file
     open for writing and will update that.

 (5) fscache_invalidate() is now given uptodate auxiliary data and a file
     size.  It can also take a flag to indicate if this was due to a DIO
     write.  This is wrapped into afs_fscache_invalidate() now for
     convenience.

 (6) fscache_resize() now gets called from the finalisation of
     afs_setattr(), and afs_setattr() does use/unuse of the cookie around
     the call to support this.

 (7) fscache_note_page_release() is called from afs_release_page().

 (8) Use a killable wait in nfs_vm_page_mkwrite() when waiting for
     PG_fscache to be cleared.

Render the parts of the cookie key for an afs inode cookie as big endian.

Changes
=======
ver #2:
 - Use gfpflags_allow_blocking() rather than using flag directly.
 - fscache_acquire_volume() now returns errors.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Acked-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Tested-by: kafs-testing@auristor.com
cc: Marc Dionne &lt;marc.dionne@auristor.com&gt;
cc: linux-afs@lists.infradead.org
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/163819661382.215744.1485608824741611837.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163906970002.143852.17678518584089878259.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967174665.1823006.1301789965454084220.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021568841.640689.6684240152253400380.stgit@warthog.procyon.org.uk/ # v4
</content>
</entry>
<entry>
<title>afs: Fix mmap coherency vs 3rd-party changes</title>
<updated>2021-09-13T08:10:39+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2021-09-02T15:43:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6e0e99d58a6530cf65f10e4bb16630c5be6c254d'/>
<id>urn:sha1:6e0e99d58a6530cf65f10e4bb16630c5be6c254d</id>
<content type='text'>
Fix the coherency management of mmap'd data such that 3rd-party changes
become visible as soon as possible after the callback notification is
delivered by the fileserver.  This is done by the following means:

 (1) When we break a callback on a vnode specified by the CB.CallBack call
     from the server, we queue a work item (vnode-&gt;cb_work) to go and
     clobber all the PTEs mapping to that inode.

     This causes the CPU to trip through the -&gt;map_pages() and
     -&gt;page_mkwrite() handlers if userspace attempts to access the page(s)
     again.

     (Ideally, this would be done in the service handler for CB.CallBack,
     but the server is waiting for our reply before considering, and we
     have a list of vnodes, all of which need breaking - and the process of
     getting the mmap_lock and stripping the PTEs on all CPUs could be
     quite slow.)

 (2) Call afs_validate() from the -&gt;map_pages() handler to check to see if
     the file has changed and to get a new callback promise from the
     server.

Also handle the fileserver telling us that it's dropping all callbacks,
possibly after it's been restarted by sending us a CB.InitCallBackState*
call by the following means:

 (3) Maintain a per-cell list of afs files that are currently mmap'd
     (cell-&gt;fs_open_mmaps).

 (4) Add a work item to each server that is invoked if there are any open
     mmaps when CB.InitCallBackState happens.  This work item goes through
     the aforementioned list and invokes the vnode-&gt;cb_work work item for
     each one that is currently using this server.

     This causes the PTEs to be cleared, causing -&gt;map_pages() or
     -&gt;page_mkwrite() to be called again, thereby calling afs_validate()
     again.

I've chosen to simply strip the PTEs at the point of notification reception
rather than invalidate all the pages as well because (a) it's faster, (b)
we may get a notification for other reasons than the data being altered (in
which case we don't want to clobber the pagecache) and (c) we need to ask
the server to find out - and I don't want to wait for the reply before
holding up userspace.

This was tested using the attached test program:

	#include &lt;stdbool.h&gt;
	#include &lt;stdio.h&gt;
	#include &lt;stdlib.h&gt;
	#include &lt;unistd.h&gt;
	#include &lt;fcntl.h&gt;
	#include &lt;sys/mman.h&gt;
	int main(int argc, char *argv[])
	{
		size_t size = getpagesize();
		unsigned char *p;
		bool mod = (argc == 3);
		int fd;
		if (argc != 2 &amp;&amp; argc != 3) {
			fprintf(stderr, "Format: %s &lt;file&gt; [mod]\n", argv[0]);
			exit(2);
		}
		fd = open(argv[1], mod ? O_RDWR : O_RDONLY);
		if (fd &lt; 0) {
			perror(argv[1]);
			exit(1);
		}

		p = mmap(NULL, size, mod ? PROT_READ|PROT_WRITE : PROT_READ,
			 MAP_SHARED, fd, 0);
		if (p == MAP_FAILED) {
			perror("mmap");
			exit(1);
		}
		for (;;) {
			if (mod) {
				p[0]++;
				msync(p, size, MS_ASYNC);
				fsync(fd);
			}
			printf("%02x", p[0]);
			fflush(stdout);
			sleep(1);
		}
	}

It runs in two modes: in one mode, it mmaps a file, then sits in a loop
reading the first byte, printing it and sleeping for a second; in the
second mode it mmaps a file, then sits in a loop incrementing the first
byte and flushing, then printing and sleeping.

Two instances of this program can be run on different machines, one doing
the reading and one doing the writing.  The reader should see the changes
made by the writer, but without this patch, they aren't because validity
checking is being done lazily - only on entry to the filesystem.

Testing the InitCallBackState change is more complicated.  The server has
to be taken offline, the saved callback state file removed and then the
server restarted whilst the reading-mode program continues to run.  The
client machine then has to poke the server to trigger the InitCallBackState
call.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Tested-by: Markus Suvanto &lt;markus.suvanto@gmail.com&gt;
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/163111668833.283156.382633263709075739.stgit@warthog.procyon.org.uk/
</content>
</entry>
<entry>
<title>afs: Fix tracing deref-before-check</title>
<updated>2020-10-27T22:05:56+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2020-10-27T10:42:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=acc080d15dde820bd39eb55a04f9a09c7ef52e67'/>
<id>urn:sha1:acc080d15dde820bd39eb55a04f9a09c7ef52e67</id>
<content type='text'>
The patch dca54a7bbb8c: "afs: Add tracing for cell refcount and active user
count" from Oct 13, 2020, leads to the following Smatch complaint:

    fs/afs/cell.c:596 afs_unuse_cell()
    warn: variable dereferenced before check 'cell' (see line 592)

Fix this by moving the retrieval of the cell debug ID to after the check of
the validity of the cell pointer.

Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Fixes: dca54a7bbb8c ("afs: Add tracing for cell refcount and active user count")
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
cc: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
</content>
</entry>
<entry>
<title>afs: Add tracing for cell refcount and active user count</title>
<updated>2020-10-16T13:39:21+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2020-10-13T19:51:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dca54a7bbb8ca9148ae10d60c66c926e222a9c4b'/>
<id>urn:sha1:dca54a7bbb8ca9148ae10d60c66c926e222a9c4b</id>
<content type='text'>
Add a tracepoint to log the cell refcount and active user count and pass in
a reason code through various functions that manipulate these counters.

Additionally, a helper function, afs_see_cell(), is provided to log
interesting places that deal with a cell without actually doing any
accounting directly.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>afs: Fix cell removal</title>
<updated>2020-10-16T13:38:26+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2020-10-16T12:21:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1d0e850a49a5b56f8f3cb51e74a11e2fedb96be6'/>
<id>urn:sha1:1d0e850a49a5b56f8f3cb51e74a11e2fedb96be6</id>
<content type='text'>
Fix cell removal by inserting a more final state than AFS_CELL_FAILED that
indicates that the cell has been unpublished in case the manager is already
requeued and will go through again.  The new AFS_CELL_REMOVED state will
just immediately leave the manager function.

Going through a second time in the AFS_CELL_FAILED state will cause it to
try to remove the cell again, potentially leading to the proc list being
removed.

Fixes: 989782dcdc91 ("afs: Overhaul cell database management")
Reported-by: syzbot+b994ecf2b023f14832c1@syzkaller.appspotmail.com
Reported-by: syzbot+0e0db88e1eb44a91ae8d@syzkaller.appspotmail.com
Reported-by: syzbot+2d0585e5efcd43d113c2@syzkaller.appspotmail.com
Reported-by: syzbot+1ecc2f9d3387f1d79d42@syzkaller.appspotmail.com
Reported-by: syzbot+18d51774588492bf3f69@syzkaller.appspotmail.com
Reported-by: syzbot+a5e4946b04d6ca8fa5f3@syzkaller.appspotmail.com
Suggested-by: Hillf Danton &lt;hdanton@sina.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
cc: Hillf Danton &lt;hdanton@sina.com&gt;
</content>
</entry>
<entry>
<title>afs: Fix cell purging with aliases</title>
<updated>2020-10-16T13:38:26+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2020-10-15T10:05:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=286377f6bdf71568a4cf07104fe44006ae0dba6d'/>
<id>urn:sha1:286377f6bdf71568a4cf07104fe44006ae0dba6d</id>
<content type='text'>
When the afs module is removed, one of the things that has to be done is to
purge the cell database.  afs_cell_purge() cancels the management timer and
then starts the cell manager work item to do the purging.  This does a
single run through and then assumes that all cells are now purged - but
this is no longer the case.

With the introduction of alias detection, a later cell in the database can
now be holding an active count on an earlier cell (cell-&gt;alias_of).  The
purge scan passes by the earlier cell first, but this can't be got rid of
until it has discarded the alias.  Ordinarily, afs_unuse_cell() would
handle this by setting the management timer to trigger another pass - but
afs_set_cell_timer() doesn't do anything if the namespace is being removed
(net-&gt;live == false).  rmmod then hangs in the wait on cells_outstanding in
afs_cell_purge().

Fix this by making afs_set_cell_timer() directly queue the cell manager if
net-&gt;live is false.  This causes additional management passes.

Queueing the cell manager increments cells_outstanding to make sure the
wait won't complete until all cells are destroyed.

Fixes: 8a070a964877 ("afs: Detect cell aliases 1 - Cells with root volumes")
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>afs: Fix cell refcounting by splitting the usage counter</title>
<updated>2020-10-16T13:38:22+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2019-07-23T10:24:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=88c853c3f5c0a07c5db61b494ee25152535cfeee'/>
<id>urn:sha1:88c853c3f5c0a07c5db61b494ee25152535cfeee</id>
<content type='text'>
Management of the lifetime of afs_cell struct has some problems due to the
usage counter being used to determine whether objects of that type are in
use in addition to whether anyone might be interested in the structure.

This is made trickier by cell objects being cached for a period of time in
case they're quickly reused as they hold the result of a setup process that
may be slow (DNS lookups, AFS RPC ops).

Problems include the cached root volume from alias resolution pinning its
parent cell record, rmmod occasionally hanging and occasionally producing
assertion failures.

Fix this by splitting the count of active users from the struct reference
count.  Things then work as follows:

 (1) The cell cache keeps +1 on the cell's activity count and this has to
     be dropped before the cell can be removed.  afs_manage_cell() tries to
     exchange the 1 to a 0 with the cells_lock write-locked, and if
     successful, the record is removed from the net-&gt;cells.

 (2) One struct ref is 'owned' by the activity count.  That is put when the
     active count is reduced to 0 (final_destruction label).

 (3) A ref can be held on a cell whilst it is queued for management on a
     work queue without confusing the active count.  afs_queue_cell() is
     added to wrap this.

 (4) The queue's ref is dropped at the end of the management.  This is
     split out into a separate function, afs_manage_cell_work().

 (5) The root volume record is put after a cell is removed (at the
     final_destruction label) rather then in the RCU destruction routine.

 (6) Volumes hold struct refs, but aren't active users.

 (7) Both counts are displayed in /proc/net/afs/cells.

There are some management function changes:

 (*) afs_put_cell() now just decrements the refcount and triggers the RCU
     destruction if it becomes 0.  It no longer sets a timer to have the
     manager do this.

 (*) afs_use_cell() and afs_unuse_cell() are added to increase and decrease
     the active count.  afs_unuse_cell() sets the management timer.

 (*) afs_queue_cell() is added to queue a cell with approprate refs.

There are also some other fixes:

 (*) Don't let /proc/net/afs/cells access a cell's vllist if it's NULL.

 (*) Make sure that candidate cells in lookups are properly destroyed
     rather than being simply kfree'd.  This ensures the bits it points to
     are destroyed also.

 (*) afs_dec_cells_outstanding() is now called in cell destruction rather
     than at "final_destruction".  This ensures that cell-&gt;net is still
     valid to the end of the destructor.

 (*) As a consequence of the previous two changes, move the increment of
     net-&gt;cells_outstanding that was at the point of insertion into the
     tree to the allocation routine to correctly balance things.

Fixes: 989782dcdc91 ("afs: Overhaul cell database management")
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
</feed>
