<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/bitops.h, branch v6.6.132</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-11-02T13:14:42+00:00</updated>
<entry>
<title>bits: introduce fixed-type GENMASK_U*()</title>
<updated>2025-11-02T13:14:42+00:00</updated>
<author>
<name>Vincent Mailhol</name>
<email>mailhol.vincent@wanadoo.fr</email>
</author>
<published>2025-10-31T13:11:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ddd31f5a5ff3984d163bd4dfdb08ab0fc4072f88'/>
<id>urn:sha1:ddd31f5a5ff3984d163bd4dfdb08ab0fc4072f88</id>
<content type='text'>
[ Upstream commit 19408200c094858d952a90bf4977733dc89a4df5 ]

Add GENMASK_TYPE() which generalizes __GENMASK() to support different
types, and implement fixed-types versions of GENMASK() based on it.
The fixed-type version allows more strict checks to the min/max values
accepted, which is useful for defining registers like implemented by
i915 and xe drivers with their REG_GENMASK*() macros.

The strict checks rely on shift-count-overflow compiler check to fail
the build if a number outside of the range allowed is passed.
Example:

  #define FOO_MASK GENMASK_U32(33, 4)

will generate a warning like:

  include/linux/bits.h:51:27: error: right shift count &gt;= width of type [-Werror=shift-count-overflow]
     51 |               type_max(t) &gt;&gt; (BITS_PER_TYPE(t) - 1 - (h)))))
        |                           ^~

The result is casted to the corresponding fixed width type. For
example, GENMASK_U8() returns an u8. Note that because of the C
promotion rules, GENMASK_U8() and GENMASK_U16() will immediately be
promoted to int if used in an expression. Regardless, the main goal is
not to get the correct type, but rather to enforce more checks at
compile time.

While GENMASK_TYPE() is crafted to cover all variants, including the
already existing GENMASK(), GENMASK_ULL() and GENMASK_U128(), for the
moment, only use it for the newly introduced GENMASK_U*(). The
consolidation will be done in a separate change.

Co-developed-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Signed-off-by: Lucas De Marchi &lt;lucas.demarchi@intel.com&gt;
Acked-by: Jani Nikula &lt;jani.nikula@intel.com&gt;
Signed-off-by: Vincent Mailhol &lt;mailhol.vincent@wanadoo.fr&gt;
Reviewed-by: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Stable-dep-of: 2ba5772e530f ("gpio: idio-16: Define fixed direction of the GPIO lines")
Signed-off-by: William Breathitt Gray &lt;wbg@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>bitops: add missing prototype check</title>
<updated>2024-06-12T09:11:38+00:00</updated>
<author>
<name>Alexander Lobakin</name>
<email>aleksander.lobakin@intel.com</email>
</author>
<published>2024-03-27T15:23:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0fdbbe7ee7f4f68173b525b653511de548a99a0b'/>
<id>urn:sha1:0fdbbe7ee7f4f68173b525b653511de548a99a0b</id>
<content type='text'>
[ Upstream commit 72cc1980a0ef3ccad0d539e7dace63d0d7d432a4 ]

Commit 8238b4579866 ("wait_on_bit: add an acquire memory barrier") added
a new bitop, test_bit_acquire(), with proper wrapping in order to try to
optimize it at compile-time, but missed the list of bitops used for
checking their prototypes a bit below.
The functions added have consistent prototypes, so that no more changes
are required and no functional changes take place.

Fixes: 8238b4579866 ("wait_on_bit: add an acquire memory barrier")
Reviewed-by: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Signed-off-by: Alexander Lobakin &lt;aleksander.lobakin@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2022-10-12T18:00:22+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-10-12T18:00:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=676cb4957396411fdb7aba906d5f950fc3de7cc9'/>
<id>urn:sha1:676cb4957396411fdb7aba906d5f950fc3de7cc9</id>
<content type='text'>
Pull non-MM updates from Andrew Morton:

 - hfs and hfsplus kmap API modernization (Fabio Francesco)

 - make crash-kexec work properly when invoked from an NMI-time panic
   (Valentin Schneider)

 - ntfs bugfixes (Hawkins Jiawei)

 - improve IPC msg scalability by replacing atomic_t's with percpu
   counters (Jiebin Sun)

 - nilfs2 cleanups (Minghao Chi)

 - lots of other single patches all over the tree!

* tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits)
  include/linux/entry-common.h: remove has_signal comment of arch_do_signal_or_restart() prototype
  proc: test how it holds up with mapping'less process
  mailmap: update Frank Rowand email address
  ia64: mca: use strscpy() is more robust and safer
  init/Kconfig: fix unmet direct dependencies
  ia64: update config files
  nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure
  fork: remove duplicate included header files
  init/main.c: remove unnecessary (void*) conversions
  proc: mark more files as permanent
  nilfs2: remove the unneeded result variable
  nilfs2: delete unnecessary checks before brelse()
  checkpatch: warn for non-standard fixes tag style
  usr/gen_init_cpio.c: remove unnecessary -1 values from int file
  ipc/msg: mitigate the lock contention with percpu counter
  percpu: add percpu_counter_add_local and percpu_counter_sub_local
  fs/ocfs2: fix repeated words in comments
  relay: use kvcalloc to alloc page array in relay_alloc_page_array
  proc: make config PROC_CHILDREN depend on PROC_FS
  fs: uninline inode_maybe_inc_iversion()
  ...
</content>
</entry>
<entry>
<title>lib: add find_nth{,_and,_andnot}_bit()</title>
<updated>2022-09-26T19:19:12+00:00</updated>
<author>
<name>Yury Norov</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2022-09-18T03:07:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3cea8d475327756066e2a54f0b651bb7284dd448'/>
<id>urn:sha1:3cea8d475327756066e2a54f0b651bb7284dd448</id>
<content type='text'>
Kernel lacks for a function that searches for Nth bit in a bitmap.
Usually people do it like this:
	for_each_set_bit(bit, mask, size)
		if (n-- == 0)
			return bit;

We can do it more efficiently, if we:
1. find a word containing Nth bit, using hweight(); and
2. find the bit, using a helper fns(), that works similarly to
   __ffs() and ffz().

fns() is implemented as a simple loop. For x86_64, there's PDEP instruction
to do that: ret = clz(pdep(1 &lt;&lt; idx, num)). However, for large bitmaps the
most of improvement comes from using hweight(), so I kept fns() simple.

New find_nth_bit() is ~70 times faster on x86_64/kvm in find_bit benchmark:
find_nth_bit:                  7154190 ns,  16411 iterations
for_each_bit:                505493126 ns,  16315 iterations

With all that, a family of 3 new functions is added, and used where
appropriate in the following patches.

Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
</content>
</entry>
<entry>
<title>bitops: use try_cmpxchg in set_mask_bits and bit_clear_unless</title>
<updated>2022-09-12T04:55:09+00:00</updated>
<author>
<name>Uros Bizjak</name>
<email>ubizjak@gmail.com</email>
</author>
<published>2022-08-22T14:38:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e1d7c7609ae0933b59840390c1b207ac0a925c8b'/>
<id>urn:sha1:e1d7c7609ae0933b59840390c1b207ac0a925c8b</id>
<content type='text'>
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
set_mask_bits and bit_clear_unless.  x86 CMPXCHG instruction returns
success in ZF flag, so this change saves a compare after cmpxchg (and
related move instruction in front of cmpxchg).

Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg
fails, enabling further code simplifications.

Link: https://lkml.kernel.org/r/20220822143851.3290-1-ubizjak@gmail.com
Signed-off-by: Uros Bizjak &lt;ubizjak@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>wait_on_bit: add an acquire memory barrier</title>
<updated>2022-08-26T16:30:25+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2022-08-26T13:17:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8238b4579866b7c1bb99883cfe102a43db5506ff'/>
<id>urn:sha1:8238b4579866b7c1bb99883cfe102a43db5506ff</id>
<content type='text'>
There are several places in the kernel where wait_on_bit is not followed
by a memory barrier (for example, in drivers/md/dm-bufio.c:new_read).

On architectures with weak memory ordering, it may happen that memory
accesses that follow wait_on_bit are reordered before wait_on_bit and
they may return invalid data.

Fix this class of bugs by introducing a new function "test_bit_acquire"
that works like test_bit, but has acquire memory ordering semantics.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Acked-by: Will Deacon &lt;will@kernel.org&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>bitops: let optimize out non-atomic bitops on compile-time constants</title>
<updated>2022-07-01T02:52:42+00:00</updated>
<author>
<name>Alexander Lobakin</name>
<email>alexandr.lobakin@intel.com</email>
</author>
<published>2022-06-24T12:13:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b03fc1173c0c2bb8fad61902a862985cecdc4b1b'/>
<id>urn:sha1:b03fc1173c0c2bb8fad61902a862985cecdc4b1b</id>
<content type='text'>
Currently, many architecture-specific non-atomic bitop
implementations use inline asm or other hacks which are faster or
more robust when working with "real" variables (i.e. fields from
the structures etc.), but the compilers have no clue how to optimize
them out when called on compile-time constants. That said, the
following code:

	DECLARE_BITMAP(foo, BITS_PER_LONG) = { }; // -&gt; unsigned long foo[1];
	unsigned long bar = BIT(BAR_BIT);
	unsigned long baz = 0;

	__set_bit(FOO_BIT, foo);
	baz |= BIT(BAZ_BIT);

	BUILD_BUG_ON(!__builtin_constant_p(test_bit(FOO_BIT, foo));
	BUILD_BUG_ON(!__builtin_constant_p(bar &amp; BAR_BIT));
	BUILD_BUG_ON(!__builtin_constant_p(baz &amp; BAZ_BIT));

triggers the first assertion on x86_64, which means that the
compiler is unable to evaluate it to a compile-time initializer
when the architecture-specific bitop is used even if it's obvious.
In order to let the compiler optimize out such cases, expand the
bitop() macro to use the "constant" C non-atomic bitop
implementations when all of the arguments passed are compile-time
constants, which means that the result will be a compile-time
constant as well, so that it produces more efficient and simple
code in 100% cases, comparing to the architecture-specific
counterparts.

The savings are architecture, compiler and compiler flags dependent,
for example, on x86_64 -O2:

GCC 12: add/remove: 78/29 grow/shrink: 332/525 up/down: 31325/-61560 (-30235)
LLVM 13: add/remove: 79/76 grow/shrink: 184/537 up/down: 55076/-141892 (-86816)
LLVM 14: add/remove: 10/3 grow/shrink: 93/138 up/down: 3705/-6992 (-3287)

and ARM64 (courtesy of Mark):

GCC 11: add/remove: 92/29 grow/shrink: 933/2766 up/down: 39340/-82580 (-43240)
LLVM 14: add/remove: 21/11 grow/shrink: 620/651 up/down: 12060/-15824 (-3764)

Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Alexander Lobakin &lt;alexandr.lobakin@intel.com&gt;
Reviewed-by: Marco Elver &lt;elver@google.com&gt;
Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
</content>
</entry>
<entry>
<title>bitops: wrap non-atomic bitops with a transparent macro</title>
<updated>2022-07-01T02:52:41+00:00</updated>
<author>
<name>Alexander Lobakin</name>
<email>alexandr.lobakin@intel.com</email>
</author>
<published>2022-06-24T12:13:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e69eb9c460f128b71c6b995d75a05244e4b6cc3e'/>
<id>urn:sha1:e69eb9c460f128b71c6b995d75a05244e4b6cc3e</id>
<content type='text'>
In preparation for altering the non-atomic bitops with a macro, wrap
them in a transparent definition. This requires prepending one more
'_' to their names in order to be able to do that seamlessly. It is
a simple change, given that all the non-prefixed definitions are now
in asm-generic.
sparc32 already has several triple-underscored functions, so I had
to rename them ('___' -&gt; 'sp32_').

Signed-off-by: Alexander Lobakin &lt;alexandr.lobakin@intel.com&gt;
Reviewed-by: Marco Elver &lt;elver@google.com&gt;
Reviewed-by: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
</content>
</entry>
<entry>
<title>bitops: define const_*() versions of the non-atomics</title>
<updated>2022-07-01T02:52:41+00:00</updated>
<author>
<name>Alexander Lobakin</name>
<email>alexandr.lobakin@intel.com</email>
</author>
<published>2022-06-24T12:13:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bb7379bfa680bd48b468e856475778db2ad866c1'/>
<id>urn:sha1:bb7379bfa680bd48b468e856475778db2ad866c1</id>
<content type='text'>
Define const_*() variants of the non-atomic bitops to be used when
the input arguments are compile-time constants, so that the compiler
will be always able to resolve those to compile-time constants as
well. Those are mostly direct aliases for generic_*() with one
exception for const_test_bit(): the original one is declared
atomic-safe and thus doesn't discard the `volatile` qualifier, so
in order to let optimize code, define it separately disregarding
the qualifier.
Add them to the compile-time type checks as well just in case.

Suggested-by: Marco Elver &lt;elver@google.com&gt;
Signed-off-by: Alexander Lobakin &lt;alexandr.lobakin@intel.com&gt;
Reviewed-by: Marco Elver &lt;elver@google.com&gt;
Reviewed-by: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
</content>
</entry>
<entry>
<title>bitops: unify non-atomic bitops prototypes across architectures</title>
<updated>2022-07-01T02:52:41+00:00</updated>
<author>
<name>Alexander Lobakin</name>
<email>alexandr.lobakin@intel.com</email>
</author>
<published>2022-06-24T12:13:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0e862838f290147ea9c16db852d8d494b552d38d'/>
<id>urn:sha1:0e862838f290147ea9c16db852d8d494b552d38d</id>
<content type='text'>
Currently, there is a mess with the prototypes of the non-atomic
bitops across the different architectures:

ret	bool, int, unsigned long
nr	int, long, unsigned int, unsigned long
addr	volatile unsigned long *, volatile void *

Thankfully, it doesn't provoke any bugs, but can sometimes make
the compiler angry when it's not handy at all.
Adjust all the prototypes to the following standard:

ret	bool				retval can be only 0 or 1
nr	unsigned long			native; signed makes no sense
addr	volatile unsigned long *	bitmaps are arrays of ulongs

Next, some architectures don't define 'arch_' versions as they don't
support instrumentation, others do. To make sure there is always the
same set of callables present and to ease any potential future
changes, make them all follow the rule:
 * architecture-specific files define only 'arch_' versions;
 * non-prefixed versions can be defined only in asm-generic files;
and place the non-prefixed definitions into a new file in
asm-generic to be included by non-instrumented architectures.

Finally, add some static assertions in order to prevent people from
making a mess in this room again.
I also used the %__always_inline attribute consistently, so that
they always get resolved to the actual operations.

Suggested-by: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Signed-off-by: Alexander Lobakin &lt;alexandr.lobakin@intel.com&gt;
Acked-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Reviewed-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Reviewed-by: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
</content>
</entry>
</feed>
