<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/iversion.h, branch v6.6.132</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2023-01-26T11:59:40+00:00</updated>
<entry>
<title>fs: clarify when the i_version counter must be updated</title>
<updated>2023-01-26T11:59:40+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@kernel.org</email>
</author>
<published>2022-08-22T13:04:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a3bb710383cbca2a0f1f4e5a1c7ef8dde14eff95'/>
<id>urn:sha1:a3bb710383cbca2a0f1f4e5a1c7ef8dde14eff95</id>
<content type='text'>
The i_version field in the kernel has had different semantics over
the decades, but NFSv4 has certain expectations. Update the comments
in iversion.h to describe when the i_version must change.

Cc: Colin Walters &lt;walters@verbum.org&gt;
Cc: NeilBrown &lt;neilb@suse.de&gt;
Cc: Trond Myklebust &lt;trondmy@hammerspace.com&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Reviewed-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</content>
</entry>
<entry>
<title>fs: uninline inode_query_iversion</title>
<updated>2023-01-26T10:41:39+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@kernel.org</email>
</author>
<published>2022-09-16T13:37:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c5bc1b3ff35ae321d018d0c4ba66b062e4fb1e05'/>
<id>urn:sha1:c5bc1b3ff35ae321d018d0c4ba66b062e4fb1e05</id>
<content type='text'>
Reviewed-by: NeilBrown &lt;neilb@suse.de&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Reviewed-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</content>
</entry>
<entry>
<title>fs: uninline inode_maybe_inc_iversion()</title>
<updated>2022-10-03T21:21:43+00:00</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@linux-foundation.org</email>
</author>
<published>2022-09-09T20:57:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5ca14835dc429c09fefb290f60343fe266382760'/>
<id>urn:sha1:5ca14835dc429c09fefb290f60343fe266382760</id>
<content type='text'>
It has many callsites and is large.

   text	   data	    bss	    dec	    hex	filename
  91796	  15984	    512	 108292	  1a704	mm/shmem.o-before
  91180	  15984	    512	 107676	  1a49c	mm/shmem.o-after

Acked-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>iversion: use atomic64_try_cmpxchg)</title>
<updated>2022-09-12T04:55:08+00:00</updated>
<author>
<name>Uros Bizjak</name>
<email>ubizjak@gmail.com</email>
</author>
<published>2022-08-21T19:30:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=da3f52ba359590d8eb465ae0d8b6e11d6fd9432f'/>
<id>urn:sha1:da3f52ba359590d8eb465ae0d8b6e11d6fd9432f</id>
<content type='text'>
Use atomic64_try_cmpxchg instead of
atomic64_cmpxchg (*ptr, old, new) == old in inode_set_max_iversion_raw,
inode_maybe_inc_version and inode_query_iversion. x86 CMPXCHG instruction
returns success in ZF flag, so this change saves a compare after cmpxchg
(and related move instruction in front of cmpxchg).

Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg
fails, enabling further code simplifications.

The loop in inode_maybe_inc_iversion improves from:

    5563:	48 89 ca             	mov    %rcx,%rdx
    5566:	48 89 c8             	mov    %rcx,%rax
    5569:	48 83 e2 fe          	and    $0xfffffffffffffffe,%rdx
    556d:	48 83 c2 02          	add    $0x2,%rdx
    5571:	f0 48 0f b1 16       	lock cmpxchg %rdx,(%rsi)
    5576:	48 39 c1             	cmp    %rax,%rcx
    5579:	0f 84 85 fc ff ff    	je     5204 &lt;...&gt;
    557f:	48 89 c1             	mov    %rax,%rcx
    5582:	eb df                	jmp    5563 &lt;...&gt;

to:

    5563:	48 89 c2             	mov    %rax,%rdx
    5566:	48 83 e2 fe          	and    $0xfffffffffffffffe,%rdx
    556a:	48 83 c2 02          	add    $0x2,%rdx
    556e:	f0 48 0f b1 11       	lock cmpxchg %rdx,(%rcx)
    5573:	0f 84 8b fc ff ff    	je     5204 &lt;...&gt;
    5579:	eb e8                	jmp    5563 &lt;...&gt;

Link: https://lkml.kernel.org/r/20220821193011.88208-1-ubizjak@gmail.com
Signed-off-by: Uros Bizjak &lt;ubizjak@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>nfsd: minor nfsd4_change_attribute cleanup</title>
<updated>2020-12-09T14:39:37+00:00</updated>
<author>
<name>J. Bruce Fields</name>
<email>bfields@redhat.com</email>
</author>
<published>2020-11-30T22:46:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4b03d99794eeed27650597a886247c6427ce1055'/>
<id>urn:sha1:4b03d99794eeed27650597a886247c6427ce1055</id>
<content type='text'>
Minor cleanup, no change in behavior.

Also pull out a common helper that'll be useful elsewhere.

Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
</content>
</entry>
<entry>
<title>iversion: add a routine to update a raw value with a larger one</title>
<updated>2019-07-08T12:01:43+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@kernel.org</email>
</author>
<published>2019-06-05T21:24:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=441d367644e2f60b37f36bfc656deee551acba5b'/>
<id>urn:sha1:441d367644e2f60b37f36bfc656deee551acba5b</id>
<content type='text'>
Under ceph, clients can be independently updating iversion themselves,
while working under comprehensive sets of caps on an inode. In that
situation we always want to prefer the largest value of a change
attribute. Add a new function that will update a raw value with a larger
one, but otherwise leave it alone.

Signed-off-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Reviewed-by: "Yan, Zheng" &lt;zyan@redhat.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>iversion: Rename make inode_cmp_iversion{+raw} to inode_eq_iversion{+raw}</title>
<updated>2018-02-01T13:15:25+00:00</updated>
<author>
<name>Goffredo Baroncelli</name>
<email>kreijack@inwind.it</email>
</author>
<published>2018-02-01T13:15:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c472c07bfed9c87d7e0b2c052d7e77fedd7109a9'/>
<id>urn:sha1:c472c07bfed9c87d7e0b2c052d7e77fedd7109a9</id>
<content type='text'>
The function inode_cmp_iversion{+raw} is counter-intuitive, because it
returns true when the counters are different and false when these are equal.

Rename it to inode_eq_iversion{+raw}, which will returns true when
the counters are equal and false otherwise.

Signed-off-by: Goffredo Baroncelli &lt;kreijack@inwind.it&gt;
Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
</content>
</entry>
<entry>
<title>iversion: make inode_cmp_iversion{+raw} return bool instead of s64</title>
<updated>2018-01-31T16:43:35+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@redhat.com</email>
</author>
<published>2018-01-30T20:32:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c0cef30e4ff0dc025f4a1660b8f0ba43ed58426e'/>
<id>urn:sha1:c0cef30e4ff0dc025f4a1660b8f0ba43ed58426e</id>
<content type='text'>
As Linus points out:

    The inode_cmp_iversion{+raw}() functions are pure and utter crap.

    Why?

    You say that they return 0/negative/positive, but they do so in a
    completely broken manner. They return that ternary value as the
    sequence number difference in a 's64', which means that if you
    actually care about that ternary value, and do the *sane* thing that
    the kernel-doc of the function implies is the right thing, you would
    do

        int cmp = inode_cmp_iversion(inode, old);
        if (cmp &lt; 0 ...

    and as a result you get code that looks sane, but that doesn't
    actually *WORK* right.

Since none of the callers actually care about the ternary value here,
convert the inode_cmp_iversion{+raw} functions to just return a boolean
value (false for matching, true for non-matching).

This matches the existing use of these functions just fine, and makes it
simple to convert them to return a ternary value in the future if we
grow callers that need it.

With this change we can also reimplement inode_cmp_iversion in a simpler
way using inode_peek_iversion.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs: handle inode-&gt;i_version more efficiently</title>
<updated>2018-01-29T11:42:21+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@redhat.com</email>
</author>
<published>2017-12-21T12:45:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f02a9ad1f15daf4378afeda025a53455f72645dd'/>
<id>urn:sha1:f02a9ad1f15daf4378afeda025a53455f72645dd</id>
<content type='text'>
Since i_version is mostly treated as an opaque value, we can exploit that
fact to avoid incrementing it when no one is watching. With that change,
we can avoid incrementing the counter on writes, unless someone has
queried for it since it was last incremented. If the a/c/mtime don't
change, and the i_version hasn't changed, then there's no need to dirty
the inode metadata on a write.

Convert the i_version counter to an atomic64_t, and use the lowest order
bit to hold a flag that will tell whether anyone has queried the value
since it was last incremented.

When we go to maybe increment it, we fetch the value and check the flag
bit.  If it's clear then we don't need to do anything if the update
isn't being forced.

If we do need to update, then we increment the counter by 2, and clear
the flag bit, and then use a CAS op to swap it into place. If that
works, we return true. If it doesn't then do it again with the value
that we fetch from the CAS operation.

On the query side, if the flag is already set, then we just shift the
value down by 1 bit and return it. Otherwise, we set the flag in our
on-stack value and again use cmpxchg to swap it into place if it hasn't
changed. If it has, then we use the value from the cmpxchg as the new
"old" value and try again.

This method allows us to avoid incrementing the counter on writes (and
dirtying the metadata) under typical workloads. We only need to increment
if it has been queried since it was last changed.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Acked-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Tested-by: Krzysztof Kozlowski &lt;krzk@kernel.org&gt;
</content>
</entry>
<entry>
<title>fs: don't take the i_lock in inode_inc_iversion</title>
<updated>2018-01-29T11:42:20+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@redhat.com</email>
</author>
<published>2017-12-18T11:25:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7594c461161745f0d38a4346d4f895e0837b8094'/>
<id>urn:sha1:7594c461161745f0d38a4346d4f895e0837b8094</id>
<content type='text'>
The rationale for taking the i_lock when incrementing this value is
lost in antiquity. The readers of the field don't take it (at least
not universally), so my assumption is that it was only done here to
serialize incrementors.

If that is indeed the case, then we can drop the i_lock from this
codepath and treat it as a atomic64_t for the purposes of
incrementing it. This allows us to use inode_inc_iversion without
any danger of lock inversion.

Note that the read side is not fetched atomically with this change.
The assumption here is that that is not a critical issue since the
i_version is not fully synchronized with anything else anyway.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
</feed>
