<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/lib/find_bit.c, branch v6.6.131</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.131</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.131'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2023-03-19T17:02:04+00:00</updated>
<entry>
<title>cpumask: introduce for_each_cpu_or</title>
<updated>2023-03-19T17:02:04+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2023-03-16T00:31:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1470afefc3c42df5d1662f87d079b46651bdc95b'/>
<id>urn:sha1:1470afefc3c42df5d1662f87d079b46651bdc95b</id>
<content type='text'>
Equivalent of for_each_cpu_and, except it ORs the two masks together
so it iterates all the CPUs present in either mask.

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Signed-off-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
</content>
</entry>
<entry>
<title>lib/find: introduce find_nth_and_andnot_bit</title>
<updated>2023-02-08T02:20:00+00:00</updated>
<author>
<name>Yury Norov</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2023-01-21T04:24:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=43245117806ff8914e37327b610fc08b5ddedc91'/>
<id>urn:sha1:43245117806ff8914e37327b610fc08b5ddedc91</id>
<content type='text'>
In the following patches the function is used to implement in-place bitmaps
traversing without storing intermediate result in temporary bitmaps.

Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Acked-by: Tariq Toukan &lt;tariqt@nvidia.com&gt;
Reviewed-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Reviewed-by: Peter Lafreniere &lt;peter@n8pjl.ca&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>lib/find_bit: Introduce find_next_andnot_bit()</title>
<updated>2022-10-06T12:57:36+00:00</updated>
<author>
<name>Valentin Schneider</name>
<email>vschneid@redhat.com</email>
</author>
<published>2022-10-03T15:34:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=90d482908eedd56f01a707325aa541cf9c40f936'/>
<id>urn:sha1:90d482908eedd56f01a707325aa541cf9c40f936</id>
<content type='text'>
In preparation of introducing for_each_cpu_andnot(), add a variant of
find_next_bit() that negate the bits in @addr2 when ANDing them with the
bits in @addr1.

Signed-off-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
</content>
</entry>
<entry>
<title>lib: add find_nth{,_and,_andnot}_bit()</title>
<updated>2022-09-26T19:19:12+00:00</updated>
<author>
<name>Yury Norov</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2022-09-18T03:07:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3cea8d475327756066e2a54f0b651bb7284dd448'/>
<id>urn:sha1:3cea8d475327756066e2a54f0b651bb7284dd448</id>
<content type='text'>
Kernel lacks for a function that searches for Nth bit in a bitmap.
Usually people do it like this:
	for_each_set_bit(bit, mask, size)
		if (n-- == 0)
			return bit;

We can do it more efficiently, if we:
1. find a word containing Nth bit, using hweight(); and
2. find the bit, using a helper fns(), that works similarly to
   __ffs() and ffz().

fns() is implemented as a simple loop. For x86_64, there's PDEP instruction
to do that: ret = clz(pdep(1 &lt;&lt; idx, num)). However, for large bitmaps the
most of improvement comes from using hweight(), so I kept fns() simple.

New find_nth_bit() is ~70 times faster on x86_64/kvm in find_bit benchmark:
find_nth_bit:                  7154190 ns,  16411 iterations
for_each_bit:                505493126 ns,  16315 iterations

With all that, a family of 3 new functions is added, and used where
appropriate in the following patches.

Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
</content>
</entry>
<entry>
<title>lib/find_bit: optimize find_next_bit() functions</title>
<updated>2022-09-21T19:21:32+00:00</updated>
<author>
<name>Yury Norov</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2022-09-15T02:07:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e79864f3164c573afce09ec4b72b75ebe061c14d'/>
<id>urn:sha1:e79864f3164c573afce09ec4b72b75ebe061c14d</id>
<content type='text'>
Over the past couple years, the function _find_next_bit() was extended
with parameters that modify its behavior to implement and- zero- and le-
flavors. The parameters are passed at compile time, but current design
prevents a compiler from optimizing out the conditionals.

As find_next_bit() API grows, I expect that more parameters will be added.
Current design would require more conditional code in _find_next_bit(),
which would bloat the helper even more and make it barely readable.

This patch replaces _find_next_bit() with a macro FIND_NEXT_BIT, and adds
a set of wrappers, so that the compile-time optimizations become possible.

The common logic is moved to the new macro, and all flavors may be
generated by providing a FETCH macro parameter, like in this example:

  #define FIND_NEXT_BIT(FETCH, MUNGE, size, start) ...

  find_next_xornot_and_bit(addr1, addr2, addr3, size, start)
  {
	return FIND_NEXT_BIT(addr1[idx] ^ ~addr2[idx] &amp; addr3[idx],
				/* nop */, size, start);
  }

The FETCH may be of any complexity, as soon as it only refers the bitmap(s)
and an iterator idx.

MUNGE is here to support _le code generation for BE builds. May be
empty.

I ran find_bit_benchmark 16 times on top of 6.0-rc2 and 16 times on top
of 6.0-rc2 + this series. The results for kvm/x86_64 are:

                      v6.0-rc2  Optimized       Difference  Z-score
Random dense bitmap         ns         ns        ns      %
find_next_bit:          787735     670546    117189   14.9     3.97
find_next_zero_bit:     777492     664208    113284   14.6    10.51
find_last_bit:          830925     687573    143352   17.3     2.35
find_first_bit:        3874366    3306635    567731   14.7     1.84
find_first_and_bit:   40677125   37739887   2937238    7.2     1.36
find_next_and_bit:      347865     304456     43409   12.5     1.35

Random sparse bitmap
find_next_bit:           19816      14021      5795   29.2     6.10
find_next_zero_bit:    1318901    1223794     95107    7.2     1.41
find_last_bit:           14573      13514      1059    7.3     6.92
find_first_bit:        1313321    1249024     64297    4.9     1.53
find_first_and_bit:       8921       8098       823    9.2     4.56
find_next_and_bit:        9796       7176      2620   26.7     5.39

Where the statistics is significant (z-score &gt; 3), the improvement
is ~15%.

According to the bloat-o-meter, the Image size is 10-11K less:

x86_64/defconfig:
add/remove: 32/14 grow/shrink: 61/782 up/down: 6344/-16521 (-10177)

arm64/defconfig:
add/remove: 3/2 grow/shrink: 50/714 up/down: 608/-11556 (-10948)

Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
</content>
</entry>
<entry>
<title>lib/find_bit: create find_first_zero_bit_le()</title>
<updated>2022-09-21T19:17:18+00:00</updated>
<author>
<name>Yury Norov</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2022-09-15T02:07:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=14a99e130f2758bc826a7e7a8bdf6f7400b54f0f'/>
<id>urn:sha1:14a99e130f2758bc826a7e7a8bdf6f7400b54f0f</id>
<content type='text'>
find_first_zero_bit_le() is an alias to find_next_zero_bit_le(),
despite that 'next' is known to be slower than 'first' version.

Now that we have common FIND_FIRST_BIT() macro helper, it's trivial
to implement find_first_zero_bit_le() as a real function.

Reviewed-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
</content>
</entry>
<entry>
<title>lib/find_bit: introduce FIND_FIRST_BIT() macro</title>
<updated>2022-09-21T19:15:53+00:00</updated>
<author>
<name>Yury Norov</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2022-09-15T02:07:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=58414bbb58a8b49af20e3accae56f6f8344b2424'/>
<id>urn:sha1:58414bbb58a8b49af20e3accae56f6f8344b2424</id>
<content type='text'>
Now that we have many flavors of find_first_bit(), and expect even more,
it's better to have one macro that generates optimal code for all and makes
maintaining of slightly different functions simpler.

The logic common to all versions is moved to the new macro, and all the
flavors are generated by providing an FETCH macro-parameter, like
in this example:

  #define FIND_FIRST_BIT(FETCH, MUNGE, size) ...

  find_first_ornot_and_bit(addr1, addr2, addr3, size)
  {
        return FIND_FIRST_BIT(addr1[idx] | ~addr2[idx] &amp; addr3[idx], /* nop */, size);
  }

The FETCH may be of any complexity, as soon as it only refers
the bitmap(s) and an iterator idx.

MUNGE is here to support _le code generation for BE builds. May be
empty.

Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Reviewed-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
</content>
</entry>
<entry>
<title>lib: add find_first_and_bit()</title>
<updated>2022-01-15T16:47:31+00:00</updated>
<author>
<name>Yury Norov</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2021-08-14T21:17:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f68edc9297bf3f7c94abb54b9b0b053607f7587b'/>
<id>urn:sha1:f68edc9297bf3f7c94abb54b9b0b053607f7587b</id>
<content type='text'>
Currently find_first_and_bit() is an alias to find_next_and_bit(). However,
it is widely used in cpumask, so it worth to optimize it. This patch adds
its own implementation for find_first_and_bit().

On x86_64 find_bit_benchmark says:

Before (#define find_first_and_bit(...) find_next_and_bit(..., 0):
Start testing find_bit() with random-filled bitmap
[  140.291468] find_first_and_bit:           46890919 ns,  32671 iterations
Start testing find_bit() with sparse bitmap
[  140.295028] find_first_and_bit:               7103 ns,      1 iterations

After:
Start testing find_bit() with random-filled bitmap
[  162.574907] find_first_and_bit:           25045813 ns,  32846 iterations
Start testing find_bit() with sparse bitmap
[  162.578458] find_first_and_bit:               4900 ns,      1 iterations

(Thanks to Alexey Klimov for thorough testing.)

Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Tested-by: Wolfram Sang &lt;wsa+renesas@sang-engineering.com&gt;
Tested-by: Alexey Klimov &lt;aklimov@redhat.com&gt;
</content>
</entry>
<entry>
<title>lib: add fast path for find_first_*_bit() and find_last_bit()</title>
<updated>2021-05-07T02:24:12+00:00</updated>
<author>
<name>Yury Norov</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2021-05-07T01:03:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2cc7b6a44ac21d31b398b03f4845c53152070416'/>
<id>urn:sha1:2cc7b6a44ac21d31b398b03f4845c53152070416</id>
<content type='text'>
Similarly to bitmap functions, users would benefit if we'll handle a case
of small-size bitmaps that fit into a single word.

While here, move the find_last_bit() declaration to bitops/find.h where
other find_*_bit() functions sit.

Link: https://lkml.kernel.org/r/20210401003153.97325-11-yury.norov@gmail.com
Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Acked-by: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Acked-by: Andy Shevchenko &lt;andy.shevchenko@gmail.com&gt;
Cc: Alexey Klimov &lt;aklimov@redhat.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Jianpeng Ma &lt;jianpeng.ma@intel.com&gt;
Cc: Joe Perches &lt;joe@perches.com&gt;
Cc: John Paul Adrian Glaubitz &lt;glaubitz@physik.fu-berlin.de&gt;
Cc: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Cc: Rich Felker &lt;dalias@libc.org&gt;
Cc: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Cc: Wei Yang &lt;richard.weiyang@linux.alibaba.com&gt;
Cc: Wolfram Sang &lt;wsa+renesas@sang-engineering.com&gt;
Cc: Yoshinori Sato &lt;ysato@users.osdn.me&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>lib: inline _find_next_bit() wrappers</title>
<updated>2021-05-07T02:24:12+00:00</updated>
<author>
<name>Yury Norov</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2021-05-07T01:03:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5c88af59f9abc202648a431428ad9d32e5d2a201'/>
<id>urn:sha1:5c88af59f9abc202648a431428ad9d32e5d2a201</id>
<content type='text'>
lib/find_bit.c declares five single-line wrappers for _find_next_bit().
We may turn those wrappers to inline functions.  It eliminates unneeded
function calls and opens room for compile-time optimizations.

Link: https://lkml.kernel.org/r/20210401003153.97325-8-yury.norov@gmail.com
Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
Acked-by: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Acked-by: Andy Shevchenko &lt;andy.shevchenko@gmail.com&gt;
Cc: Alexey Klimov &lt;aklimov@redhat.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Jianpeng Ma &lt;jianpeng.ma@intel.com&gt;
Cc: Joe Perches &lt;joe@perches.com&gt;
Cc: John Paul Adrian Glaubitz &lt;glaubitz@physik.fu-berlin.de&gt;
Cc: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Cc: Rich Felker &lt;dalias@libc.org&gt;
Cc: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Cc: Wei Yang &lt;richard.weiyang@linux.alibaba.com&gt;
Cc: Wolfram Sang &lt;wsa+renesas@sang-engineering.com&gt;
Cc: Yoshinori Sato &lt;ysato@users.osdn.me&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
