<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/lib/cpumask.c, branch v5.15.85</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.85</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.85'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2021-02-05T22:28:29+00:00</updated>
<entry>
<title>Revert "lib: Restrict cpumask_local_spread to houskeeping CPUs"</title>
<updated>2021-02-05T22:28:29+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2021-02-05T22:28:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2452483d9546de1c540f330469dc4042ff089731'/>
<id>urn:sha1:2452483d9546de1c540f330469dc4042ff089731</id>
<content type='text'>
This reverts commit 1abdfe706a579a702799fce465bceb9fb01d407c.

This change is broken and not solving any problem it claims to solve.

Robin reported that cpumask_local_spread() now returns any cpu out of
cpu_possible_mask in case that NOHZ_FULL is disabled (runtime or compile
time). It can also return any offline or not-present CPU in the
housekeeping mask. Before that it was returning a CPU out of
online_cpu_mask.

While the function is racy against CPU hotplug if the caller does not
protect against it, the actual use cases are not caring much about it as
they use it mostly as hint for:

 - the user space affinity hint which is unused by the kernel
 - memory node selection which is just suboptimal
 - network queue affinity which might fail but is handled gracefully

But the occasional fail vs. hotplug is very different from returning
anything from possible_cpu_mask which can have a large amount of offline
CPUs obviously.

The changelog of the commit claims:

 "The current implementation of cpumask_local_spread() does not respect
  the isolated CPUs, i.e., even if a CPU has been isolated for Real-Time
  task, it will return it to the caller for pinning of its IRQ
  threads. Having these unwanted IRQ threads on an isolated CPU adds up
  to a latency overhead."

The only correct part of this changelog is:

 "The current implementation of cpumask_local_spread() does not respect
  the isolated CPUs."

Everything else is just disjunct from reality.

Reported-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Nitesh Narayan Lal &lt;nitesh@redhat.com&gt;
Cc: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Cc: abelits@marvell.com
Cc: davem@davemloft.net
Link: https://lore.kernel.org/r/87y2g26tnt.fsf@nanos.tec.linutronix.de
</content>
</entry>
<entry>
<title>sched,rt: Use cpumask_any*_distribute()</title>
<updated>2020-11-10T17:39:00+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2020-10-01T13:54:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=14e292f8d45380c519a83d9b0f37089a17eedcdf'/>
<id>urn:sha1:14e292f8d45380c519a83d9b0f37089a17eedcdf</id>
<content type='text'>
Replace a bunch of cpumask_any*() instances with
cpumask_any*_distribute(), by injecting this little bit of random in
cpu selection, we reduce the chance two competing balance operations
working off the same lowest_mask pick the same CPU.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Reviewed-by: Daniel Bristot de Oliveira &lt;bristot@redhat.com&gt;
Link: https://lkml.kernel.org/r/20201023102347.190759694@infradead.org
</content>
</entry>
<entry>
<title>lib: Restrict cpumask_local_spread to houskeeping CPUs</title>
<updated>2020-07-08T09:39:01+00:00</updated>
<author>
<name>Alex Belits</name>
<email>abelits@marvell.com</email>
</author>
<published>2020-06-25T22:34:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1abdfe706a579a702799fce465bceb9fb01d407c'/>
<id>urn:sha1:1abdfe706a579a702799fce465bceb9fb01d407c</id>
<content type='text'>
The current implementation of cpumask_local_spread() does not respect the
isolated CPUs, i.e., even if a CPU has been isolated for Real-Time task,
it will return it to the caller for pinning of its IRQ threads. Having
these unwanted IRQ threads on an isolated CPU adds up to a latency
overhead.

Restrict the CPUs that are returned for spreading IRQs only to the
available housekeeping CPUs.

Signed-off-by: Alex Belits &lt;abelits@marvell.com&gt;
Signed-off-by: Nitesh Narayan Lal &lt;nitesh@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20200625223443.2684-2-nitesh@redhat.com
</content>
</entry>
<entry>
<title>sched/core: Distribute tasks within affinity masks</title>
<updated>2020-03-20T12:06:18+00:00</updated>
<author>
<name>Paul Turner</name>
<email>pjt@google.com</email>
</author>
<published>2020-03-11T01:01:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=46a87b3851f0d6eb05e6d83d5c5a30df0eca8f76'/>
<id>urn:sha1:46a87b3851f0d6eb05e6d83d5c5a30df0eca8f76</id>
<content type='text'>
Currently, when updating the affinity of tasks via either cpusets.cpus,
or, sched_setaffinity(); tasks not currently running within the newly
specified mask will be arbitrarily assigned to the first CPU within the
mask.

This (particularly in the case that we are restricting masks) can
result in many tasks being assigned to the first CPUs of their new
masks.

This:
 1) Can induce scheduling delays while the load-balancer has a chance to
    spread them between their new CPUs.
 2) Can antogonize a poor load-balancer behavior where it has a
    difficult time recognizing that a cross-socket imbalance has been
    forced by an affinity mask.

This change adds a new cpumask interface to allow iterated calls to
distribute within the intersection of the provided masks.

The cases that this mainly affects are:
 - modifying cpuset.cpus
 - when tasks join a cpuset
 - when modifying a task's affinity via sched_setaffinity(2)

Signed-off-by: Paul Turner &lt;pjt@google.com&gt;
Signed-off-by: Josh Don &lt;joshdon@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Qais Yousef &lt;qais.yousef@arm.com&gt;
Tested-by: Qais Yousef &lt;qais.yousef@arm.com&gt;
Link: https://lkml.kernel.org/r/20200311010113.136465-1-joshdon@google.com
</content>
</entry>
<entry>
<title>treewide: add checks for the return value of memblock_alloc*()</title>
<updated>2019-03-12T17:04:02+00:00</updated>
<author>
<name>Mike Rapoport</name>
<email>rppt@linux.ibm.com</email>
</author>
<published>2019-03-12T06:30:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8a7f97b902f4fb0d94b355b6b3f1fbd7154cafb9'/>
<id>urn:sha1:8a7f97b902f4fb0d94b355b6b3f1fbd7154cafb9</id>
<content type='text'>
Add check for the return value of memblock_alloc*() functions and call
panic() in case of error.  The panic message repeats the one used by
panicing memblock allocators with adjustment of parameters to include
only relevant ones.

The replacement was mostly automated with semantic patches like the one
below with manual massaging of format strings.

  @@
  expression ptr, size, align;
  @@
  ptr = memblock_alloc(size, align);
  + if (!ptr)
  + 	panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__, size, align);

[anders.roxell@linaro.org: use '%pa' with 'phys_addr_t' type]
  Link: http://lkml.kernel.org/r/20190131161046.21886-1-anders.roxell@linaro.org
[rppt@linux.ibm.com: fix format strings for panics after memblock_alloc]
  Link: http://lkml.kernel.org/r/1548950940-15145-1-git-send-email-rppt@linux.ibm.com
[rppt@linux.ibm.com: don't panic if the allocation in sparse_buffer_init fails]
  Link: http://lkml.kernel.org/r/20190131074018.GD28876@rapoport-lnx
[akpm@linux-foundation.org: fix xtensa printk warning]
Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Signed-off-by: Anders Roxell &lt;anders.roxell@linaro.org&gt;
Reviewed-by: Guo Ren &lt;ren_guo@c-sky.com&gt;		[c-sky]
Acked-by: Paul Burton &lt;paul.burton@mips.com&gt;		[MIPS]
Acked-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;	[s390]
Reviewed-by: Juergen Gross &lt;jgross@suse.com&gt;		[Xen]
Reviewed-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;	[m68k]
Acked-by: Max Filippov &lt;jcmvbkbc@gmail.com&gt;		[xtensa]
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Greentime Hu &lt;green.hu@gmail.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Guan Xuetao &lt;gxt@pku.edu.cn&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Mark Salter &lt;msalter@redhat.com&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Michal Simek &lt;monstr@monstr.eu&gt;
Cc: Petr Mladek &lt;pmladek@suse.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rich Felker &lt;dalias@libc.org&gt;
Cc: Rob Herring &lt;robh+dt@kernel.org&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Stafford Horne &lt;shorne@gmail.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Cc: Yoshinori Sato &lt;ysato@users.sourceforge.jp&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: replace all open encodings for NUMA_NO_NODE</title>
<updated>2019-03-06T05:07:14+00:00</updated>
<author>
<name>Anshuman Khandual</name>
<email>anshuman.khandual@arm.com</email>
</author>
<published>2019-03-05T23:42:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=98fa15f34cb379864757670b8e8743b21456a20e'/>
<id>urn:sha1:98fa15f34cb379864757670b8e8743b21456a20e</id>
<content type='text'>
Patch series "Replace all open encodings for NUMA_NO_NODE", v3.

All these places for replacement were found by running the following
grep patterns on the entire kernel code.  Please let me know if this
might have missed some instances.  This might also have replaced some
false positives.  I will appreciate suggestions, inputs and review.

1. git grep "nid == -1"
2. git grep "node == -1"
3. git grep "nid = -1"
4. git grep "node = -1"

This patch (of 2):

At present there are multiple places where invalid node number is
encoded as -1.  Even though implicitly understood it is always better to
have macros in there.  Replace these open encodings for an invalid node
number with the global macro NUMA_NO_NODE.  This helps remove NUMA
related assumptions like 'invalid node' from various places redirecting
them to a common definition.

Link: http://lkml.kernel.org/r/1545127933-10711-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Jeff Kirsher &lt;jeffrey.t.kirsher@intel.com&gt;	[ixgbe]
Acked-by: Jens Axboe &lt;axboe@kernel.dk&gt;			[mtip32xx]
Acked-by: Vinod Koul &lt;vkoul@kernel.org&gt;			[dmaengine.c]
Acked-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;		[powerpc]
Acked-by: Doug Ledford &lt;dledford@redhat.com&gt;		[drivers/infiniband]
Cc: Joseph Qi &lt;jiangqi903@gmail.com&gt;
Cc: Hans Verkuil &lt;hverkuil@xs4all.nl&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memblock: stop using implicit alignment to SMP_CACHE_BYTES</title>
<updated>2018-10-31T15:54:16+00:00</updated>
<author>
<name>Mike Rapoport</name>
<email>rppt@linux.vnet.ibm.com</email>
</author>
<published>2018-10-30T22:09:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7e1c4e27928e5f87b9b1eaf06dc31773b2f1e7f1'/>
<id>urn:sha1:7e1c4e27928e5f87b9b1eaf06dc31773b2f1e7f1</id>
<content type='text'>
When a memblock allocation APIs are called with align = 0, the alignment
is implicitly set to SMP_CACHE_BYTES.

Implicit alignment is done deep in the memblock allocator and it can
come as a surprise.  Not that such an alignment would be wrong even
when used incorrectly but it is better to be explicit for the sake of
clarity and the prinicple of the least surprise.

Replace all such uses of memblock APIs with the 'align' parameter
explicitly set to SMP_CACHE_BYTES and stop implicit alignment assignment
in the memblock internal allocation functions.

For the case when memblock APIs are used via helper functions, e.g.  like
iommu_arena_new_node() in Alpha, the helper functions were detected with
Coccinelle's help and then manually examined and updated where
appropriate.

The direct memblock APIs users were updated using the semantic patch below:

@@
expression size, min_addr, max_addr, nid;
@@
(
|
- memblock_alloc_try_nid_raw(size, 0, min_addr, max_addr, nid)
+ memblock_alloc_try_nid_raw(size, SMP_CACHE_BYTES, min_addr, max_addr,
nid)
|
- memblock_alloc_try_nid_nopanic(size, 0, min_addr, max_addr, nid)
+ memblock_alloc_try_nid_nopanic(size, SMP_CACHE_BYTES, min_addr, max_addr,
nid)
|
- memblock_alloc_try_nid(size, 0, min_addr, max_addr, nid)
+ memblock_alloc_try_nid(size, SMP_CACHE_BYTES, min_addr, max_addr, nid)
|
- memblock_alloc(size, 0)
+ memblock_alloc(size, SMP_CACHE_BYTES)
|
- memblock_alloc_raw(size, 0)
+ memblock_alloc_raw(size, SMP_CACHE_BYTES)
|
- memblock_alloc_from(size, 0, min_addr)
+ memblock_alloc_from(size, SMP_CACHE_BYTES, min_addr)
|
- memblock_alloc_nopanic(size, 0)
+ memblock_alloc_nopanic(size, SMP_CACHE_BYTES)
|
- memblock_alloc_low(size, 0)
+ memblock_alloc_low(size, SMP_CACHE_BYTES)
|
- memblock_alloc_low_nopanic(size, 0)
+ memblock_alloc_low_nopanic(size, SMP_CACHE_BYTES)
|
- memblock_alloc_from_nopanic(size, 0, min_addr)
+ memblock_alloc_from_nopanic(size, SMP_CACHE_BYTES, min_addr)
|
- memblock_alloc_node(size, 0, nid)
+ memblock_alloc_node(size, SMP_CACHE_BYTES, nid)
)

[mhocko@suse.com: changelog update]
[akpm@linux-foundation.org: coding-style fixes]
[rppt@linux.ibm.com: fix missed uses of implicit alignment]
  Link: http://lkml.kernel.org/r/20181016133656.GA10925@rapoport-lnx
Link: http://lkml.kernel.org/r/1538687224-17535-1-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport &lt;rppt@linux.vnet.ibm.com&gt;
Suggested-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Paul Burton &lt;paul.burton@mips.com&gt;	[MIPS]
Acked-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;	[powerpc]
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Guan Xuetao &lt;gxt@pku.edu.cn&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Michal Simek &lt;monstr@monstr.eu&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: remove include/linux/bootmem.h</title>
<updated>2018-10-31T15:54:16+00:00</updated>
<author>
<name>Mike Rapoport</name>
<email>rppt@linux.vnet.ibm.com</email>
</author>
<published>2018-10-30T22:09:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=57c8a661d95dff48dd9c2f2496139082bbaf241a'/>
<id>urn:sha1:57c8a661d95dff48dd9c2f2496139082bbaf241a</id>
<content type='text'>
Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.

The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include &lt;linux/memblock.h&gt;

@@
@@
- #include &lt;linux/bootmem.h&gt;
+ #include &lt;linux/memblock.h&gt;

[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
  Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport &lt;rppt@linux.vnet.ibm.com&gt;
Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Greentime Hu &lt;green.hu@gmail.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Guan Xuetao &lt;gxt@pku.edu.cn&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: "James E.J. Bottomley" &lt;jejb@parisc-linux.org&gt;
Cc: Jonas Bonn &lt;jonas@southpole.se&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Ley Foon Tan &lt;lftan@altera.com&gt;
Cc: Mark Salter &lt;msalter@redhat.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Michal Simek &lt;monstr@monstr.eu&gt;
Cc: Palmer Dabbelt &lt;palmer@sifive.com&gt;
Cc: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: Richard Kuo &lt;rkuo@codeaurora.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rich Felker &lt;dalias@libc.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Serge Semin &lt;fancer.lancer@gmail.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Cc: Yoshinori Sato &lt;ysato@users.sourceforge.jp&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memblock: remove _virt from APIs returning virtual address</title>
<updated>2018-10-31T15:54:15+00:00</updated>
<author>
<name>Mike Rapoport</name>
<email>rppt@linux.vnet.ibm.com</email>
</author>
<published>2018-10-30T22:08:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=eb31d559f1e8390195372cd51cfb198da8bc84b9'/>
<id>urn:sha1:eb31d559f1e8390195372cd51cfb198da8bc84b9</id>
<content type='text'>
The conversion is done using

sed -i 's@memblock_virt_alloc@memblock_alloc@g' \
	$(git grep -l memblock_virt_alloc)

Link: http://lkml.kernel.org/r/1536927045-23536-8-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport &lt;rppt@linux.vnet.ibm.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Greentime Hu &lt;green.hu@gmail.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Guan Xuetao &lt;gxt@pku.edu.cn&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: "James E.J. Bottomley" &lt;jejb@parisc-linux.org&gt;
Cc: Jonas Bonn &lt;jonas@southpole.se&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Ley Foon Tan &lt;lftan@altera.com&gt;
Cc: Mark Salter &lt;msalter@redhat.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Michal Simek &lt;monstr@monstr.eu&gt;
Cc: Palmer Dabbelt &lt;palmer@sifive.com&gt;
Cc: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: Richard Kuo &lt;rkuo@codeaurora.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rich Felker &lt;dalias@libc.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Serge Semin &lt;fancer.lancer@gmail.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Cc: Yoshinori Sato &lt;ysato@users.sourceforge.jp&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>lib: optimize cpumask_next_and()</title>
<updated>2018-02-07T02:32:44+00:00</updated>
<author>
<name>Clement Courbet</name>
<email>courbet@google.com</email>
</author>
<published>2018-02-06T23:38:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0ade34c37012ea5c516d9aa4d19a56e9f40a55ed'/>
<id>urn:sha1:0ade34c37012ea5c516d9aa4d19a56e9f40a55ed</id>
<content type='text'>
We've measured that we spend ~0.6% of sys cpu time in cpumask_next_and().
It's essentially a joined iteration in search for a non-zero bit, which is
currently implemented as a lookup join (find a nonzero bit on the lhs,
lookup the rhs to see if it's set there).

Implement a direct join (find a nonzero bit on the incrementally built
join).  Also add generic bitmap benchmarks in the new `test_find_bit`
module for new function (see `find_next_and_bit` in [2] and [3] below).

For cpumask_next_and, direct benchmarking shows that it's 1.17x to 14x
faster with a geometric mean of 2.1 on 32 CPUs [1].  No impact on memory
usage.  Note that on Arm, the new pure-C implementation still outperforms
the old one that uses a mix of C and asm (`find_next_bit`) [3].

[1] Approximate benchmark code:

```
  unsigned long src1p[nr_cpumask_longs] = {pattern1};
  unsigned long src2p[nr_cpumask_longs] = {pattern2};
  for (/*a bunch of repetitions*/) {
    for (int n = -1; n &lt;= nr_cpu_ids; ++n) {
      asm volatile("" : "+rm"(src1p)); // prevent any optimization
      asm volatile("" : "+rm"(src2p));
      unsigned long result = cpumask_next_and(n, src1p, src2p);
      asm volatile("" : "+rm"(result));
    }
  }
```

Results:
pattern1    pattern2     time_before/time_after
0x0000ffff  0x0000ffff   1.65
0x0000ffff  0x00005555   2.24
0x0000ffff  0x00001111   2.94
0x0000ffff  0x00000000   14.0
0x00005555  0x0000ffff   1.67
0x00005555  0x00005555   1.71
0x00005555  0x00001111   1.90
0x00005555  0x00000000   6.58
0x00001111  0x0000ffff   1.46
0x00001111  0x00005555   1.49
0x00001111  0x00001111   1.45
0x00001111  0x00000000   3.10
0x00000000  0x0000ffff   1.18
0x00000000  0x00005555   1.18
0x00000000  0x00001111   1.17
0x00000000  0x00000000   1.25
-----------------------------
               geo.mean  2.06

[2] test_find_next_bit, X86 (skylake)

 [ 3913.477422] Start testing find_bit() with random-filled bitmap
 [ 3913.477847] find_next_bit: 160868 cycles, 16484 iterations
 [ 3913.477933] find_next_zero_bit: 169542 cycles, 16285 iterations
 [ 3913.478036] find_last_bit: 201638 cycles, 16483 iterations
 [ 3913.480214] find_first_bit: 4353244 cycles, 16484 iterations
 [ 3913.480216] Start testing find_next_and_bit() with random-filled
 bitmap
 [ 3913.481074] find_next_and_bit: 89604 cycles, 8216 iterations
 [ 3913.481075] Start testing find_bit() with sparse bitmap
 [ 3913.481078] find_next_bit: 2536 cycles, 66 iterations
 [ 3913.481252] find_next_zero_bit: 344404 cycles, 32703 iterations
 [ 3913.481255] find_last_bit: 2006 cycles, 66 iterations
 [ 3913.481265] find_first_bit: 17488 cycles, 66 iterations
 [ 3913.481266] Start testing find_next_and_bit() with sparse bitmap
 [ 3913.481272] find_next_and_bit: 764 cycles, 1 iterations

[3] test_find_next_bit, arm (v7 odroid XU3).

[  267.206928] Start testing find_bit() with random-filled bitmap
[  267.214752] find_next_bit: 4474 cycles, 16419 iterations
[  267.221850] find_next_zero_bit: 5976 cycles, 16350 iterations
[  267.229294] find_last_bit: 4209 cycles, 16419 iterations
[  267.279131] find_first_bit: 1032991 cycles, 16420 iterations
[  267.286265] Start testing find_next_and_bit() with random-filled
bitmap
[  267.302386] find_next_and_bit: 2290 cycles, 8140 iterations
[  267.309422] Start testing find_bit() with sparse bitmap
[  267.316054] find_next_bit: 191 cycles, 66 iterations
[  267.322726] find_next_zero_bit: 8758 cycles, 32703 iterations
[  267.329803] find_last_bit: 84 cycles, 66 iterations
[  267.336169] find_first_bit: 4118 cycles, 66 iterations
[  267.342627] Start testing find_next_and_bit() with sparse bitmap
[  267.356919] find_next_and_bit: 91 cycles, 1 iterations

[courbet@google.com: v6]
  Link: http://lkml.kernel.org/r/20171129095715.23430-1-courbet@google.com
[geert@linux-m68k.org: m68k/bitops: always include &lt;asm-generic/bitops/find.h&gt;]
  Link: http://lkml.kernel.org/r/1512556816-28627-1-git-send-email-geert@linux-m68k.org
Link: http://lkml.kernel.org/r/20171128131334.23491-1-courbet@google.com
Signed-off-by: Clement Courbet &lt;courbet@google.com&gt;
Signed-off-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Yury Norov &lt;ynorov@caviumnetworks.com&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
