<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/lib/percpu_counter.c, branch v6.6.132</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2023-08-25T15:06:53+00:00</updated>
<entry>
<title>pcpcntr: add group allocation/free</title>
<updated>2023-08-25T15:06:53+00:00</updated>
<author>
<name>Mateusz Guzik</name>
<email>mjguzik@gmail.com</email>
</author>
<published>2023-08-23T05:06:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c439d5e8a0deb7310b5bb4e5f2fe47c40ff5297f'/>
<id>urn:sha1:c439d5e8a0deb7310b5bb4e5f2fe47c40ff5297f</id>
<content type='text'>
Allocations and frees are globally serialized on the pcpu lock (and the
CPU hotplug lock if enabled, which is the case on Debian).

At least one frequent consumer allocates 4 back-to-back counters (and
frees them in the same manner), exacerbating the problem.

While this does not fully remedy scalability issues, it is a step
towards that goal and provides immediate relief.

Signed-off-by: Mateusz Guzik &lt;mjguzik@gmail.com&gt;
Reviewed-by: Dennis Zhou &lt;dennis@kernel.org&gt;
Reviewed-by: Vegard Nossum &lt;vegard.nossum@oracle.com&gt;
Link: https://lore.kernel.org/r/20230823050609.2228718-2-mjguzik@gmail.com
[Dennis: reflowed a few lines]
Signed-off-by: Dennis Zhou &lt;dennis@kernel.org&gt;
</content>
</entry>
<entry>
<title>pcpcntr: remove percpu_counter_sum_all()</title>
<updated>2023-03-19T17:02:04+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2023-03-16T00:31:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e9b60c7f97130795c7aa81a649ae4b93a172a277'/>
<id>urn:sha1:e9b60c7f97130795c7aa81a649ae4b93a172a277</id>
<content type='text'>
percpu_counter_sum_all() is now redundant as the race condition it
was invented to handle is now dealt with by percpu_counter_sum()
directly and all users of percpu_counter_sum_all() have been
removed.

Remove it.

This effectively reverts the changes made in f689054aace2
("percpu_counter: add percpu_counter_sum_all interface") except for
the cpumask iteration that fixes percpu_counter_sum() made earlier
in this series.

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Signed-off-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
</content>
</entry>
<entry>
<title>pcpcntrs: fix dying cpu summation race</title>
<updated>2023-03-19T17:02:04+00:00</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2023-03-16T00:31:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8b57b11cca88f397035a95b9e12b03511847b0e8'/>
<id>urn:sha1:8b57b11cca88f397035a95b9e12b03511847b0e8</id>
<content type='text'>
In commit f689054aace2 ("percpu_counter: add percpu_counter_sum_all
interface") a race condition between a cpu dying and
percpu_counter_sum() iterating online CPUs was identified. The
solution was to iterate all possible CPUs for summation via
percpu_counter_sum_all().

We recently had a percpu_counter_sum() call in XFS trip over this
same race condition and it fired a debug assert because the
filesystem was unmounting and the counter *should* be zero just
before we destroy it. That was reported here:

https://lore.kernel.org/linux-kernel/20230314090649.326642-1-yebin@huaweicloud.com/

likely as a result of running generic/648 which exercises
filesystems in the presence of CPU online/offline events.

The solution to use percpu_counter_sum_all() is an awful one. We
use percpu counters and percpu_counter_sum() for accurate and
reliable threshold detection for space management, so a summation
race condition during these operations can result in overcommit of
available space and that may result in filesystem shutdowns.

As percpu_counter_sum_all() iterates all possible CPUs rather than
just those online or even those present, the mask can include CPUs
that aren't even installed in the machine, or in the case of
machines that can hot-plug CPU capable nodes, even have physical
sockets present in the machine.

Fundamentally, this race condition is caused by the CPU being
offlined being removed from the cpu_online_mask before the notifier
that cleans up per-cpu state is run. Hence percpu_counter_sum() will
not sum the count for a cpu currently being taken offline,
regardless of whether the notifier has run or not. This is
the root cause of the bug.

The percpu counter notifier iterates all the registered counters,
locks the counter and moves the percpu count to the global sum.
This is serialised against other operations that move the percpu
counter to the global sum as well as percpu_counter_sum() operations
that sum the percpu counts while holding the counter lock.

Hence the notifier is safe to run concurrently with sum operations,
and the only thing we actually need to care about is that
percpu_counter_sum() iterates dying CPUs. That's trivial to do,
and when there are no CPUs dying, it has no addition overhead except
for a cpumask_or() operation.

This change makes percpu_counter_sum() always do the right thing in
the presence of CPU hot unplug events and makes
percpu_counter_sum_all() unnecessary. This, in turn, means that
filesystems like XFS, ext4, and btrfs don't have to work out when
they should use percpu_counter_sum() vs percpu_counter_sum_all() in
their space accounting algorithms

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Signed-off-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
</content>
</entry>
<entry>
<title>lib/percpu_counter: percpu_counter_add_batch() overflow/underflow</title>
<updated>2023-02-03T06:50:01+00:00</updated>
<author>
<name>Manfred Spraul</name>
<email>manfred@colorfullife.com</email>
</author>
<published>2022-12-16T15:04:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=805afd8300998948d16bdba0358dcfeb202a70d5'/>
<id>urn:sha1:805afd8300998948d16bdba0358dcfeb202a70d5</id>
<content type='text'>
Patch series "various irq handling fixes/docu updates".

If an interrupt happens between __this_cpu_read(*fbc-&gt;counters) and
this_cpu_add(*fbc-&gt;counters, amount), and that interrupt modifies the
per_cpu_counter, then the this_cpu_add() after the interrupt returns may
under/overflow.

Link: https://lkml.kernel.org/r/20221216150155.200389-1-manfred@colorfullife.com
Link: https://lkml.kernel.org/r/20221216150441.200533-1-manfred@colorfullife.com
Signed-off-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Cc: "Sun, Jiebin" &lt;jiebin.sun@intel.com&gt;
Cc: &lt;1vier1@web.de&gt;
Cc: Alexander Sverdlin &lt;alexander.sverdlin@siemens.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>percpu_counter: add percpu_counter_sum_all interface</title>
<updated>2022-11-30T23:58:40+00:00</updated>
<author>
<name>Shakeel Butt</name>
<email>shakeelb@google.com</email>
</author>
<published>2022-11-09T01:20:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f689054aace2ff13af2e9a44a74fbba650ca31ba'/>
<id>urn:sha1:f689054aace2ff13af2e9a44a74fbba650ca31ba</id>
<content type='text'>
The percpu_counter is used for scenarios where performance is more
important than the accuracy.  For percpu_counter users, who want more
accurate information in their slowpath, percpu_counter_sum is provided
which traverses all the online CPUs to accumulate the data.  The reason it
only needs to traverse online CPUs is because percpu_counter does
implement CPU offline callback which syncs the local data of the offlined
CPU.

However there is a small race window between the online CPUs traversal of
percpu_counter_sum and the CPU offline callback.  The offline callback has
to traverse all the percpu_counters on the system to flush the CPU local
data which can be a lot.  During that time, the CPU which is going offline
has already been published as offline to all the readers.  So, as the
offline callback is running, percpu_counter_sum can be called for one
counter which has some state on the CPU going offline.  Since
percpu_counter_sum only traverses online CPUs, it will skip that specific
CPU and the offline callback might not have flushed the state for that
specific percpu_counter on that offlined CPU.

Normally this is not an issue because percpu_counter users can deal with
some inaccuracy for small time window.  However a new user i.e.  mm_struct
on the cleanup path wants to check the exact state of the percpu_counter
through check_mm().  For such users, this patch introduces
percpu_counter_sum_all() which traverses all possible CPUs and it is used
in fork.c:check_mm() to avoid the potential race.

This issue is exposed by the later patch "mm: convert mm's rss stats into
percpu_counter".

Link: https://lkml.kernel.org/r/20221109012011.881058-1-shakeelb@google.com
Signed-off-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Reported-by: Marek Szyprowski &lt;m.szyprowski@samsung.com&gt;
Tested-by: Marek Szyprowski &lt;m.szyprowski@samsung.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>lib/percpu_counter: tame kernel-doc compile warning</title>
<updated>2021-05-07T02:24:12+00:00</updated>
<author>
<name>Alex Shi</name>
<email>alexs@kernel.org</email>
</author>
<published>2021-05-07T01:03:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=db65a867fd40fb33d4a7d619e95f2b796e798999'/>
<id>urn:sha1:db65a867fd40fb33d4a7d619e95f2b796e798999</id>
<content type='text'>
commit 3e8f399da490 ("writeback: rework wb_[dec|inc]_stat family of
functions") add some function description of percpu_counter_add_batch.
but the double '*' in comments means a kernel-doc format comment which
isn't right.

Since the whole file of lib/percpu_counter.c has no any other kernel-doc
format comments, we'd better to remove this incomplete one to tame the
kernel-doc warning:

lib/percpu_counter.c:83: warning: Function parameter or member 'fbc' not described in 'percpu_counter_add_batch'
lib/percpu_counter.c:83: warning: Function parameter or member 'amount' not described in 'percpu_counter_add_batch'
lib/percpu_counter.c:83: warning: Function parameter or member 'batch' not described in 'percpu_counter_add_batch'

Link: https://lkml.kernel.org/r/20210405135505.132446-1-alexs@kernel.org
Signed-off-by: Alex Shi &lt;alexs@kernel.org&gt;
Cc: Nikolay Borisov &lt;nborisov@suse.com&gt;
Cc: Stephen Boyd &lt;swboyd@chromium.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>lib/percpu_counter.c: use helper macro abs()</title>
<updated>2020-10-16T18:11:20+00:00</updated>
<author>
<name>Miaohe Lin</name>
<email>linmiaohe@huawei.com</email>
</author>
<published>2020-10-16T03:11:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1d339638a954edce06ffcaa15d883d322e9e8291'/>
<id>urn:sha1:1d339638a954edce06ffcaa15d883d322e9e8291</id>
<content type='text'>
Use helper macro abs() to simplify the "x &gt;= t || x &lt;= -t" cmp.

Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: https://lkml.kernel.org/r/20200927122746.5964-1-linmiaohe@huawei.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>treewide: Make all debug_obj_descriptors const</title>
<updated>2020-09-24T19:56:25+00:00</updated>
<author>
<name>Stephen Boyd</name>
<email>swboyd@chromium.org</email>
</author>
<published>2020-08-15T00:40:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f9e62f318fd706a54b7ce9b28e5c7e49bbde8788'/>
<id>urn:sha1:f9e62f318fd706a54b7ce9b28e5c7e49bbde8788</id>
<content type='text'>
This should make it harder for the kernel to corrupt the debug object
descriptor, used to call functions to fixup state and track debug objects,
by moving the structure to read-only memory.

Signed-off-by: Stephen Boyd &lt;swboyd@chromium.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20200815004027.2046113-3-swboyd@chromium.org

</content>
</entry>
<entry>
<title>percpu_counter: add percpu_counter_sync()</title>
<updated>2020-08-07T18:33:26+00:00</updated>
<author>
<name>Feng Tang</name>
<email>feng.tang@intel.com</email>
</author>
<published>2020-08-07T06:23:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0a4954a850b0c4d0a5d18b1a55d6e5a653e362b5'/>
<id>urn:sha1:0a4954a850b0c4d0a5d18b1a55d6e5a653e362b5</id>
<content type='text'>
percpu_counter's accuracy is related to its batch size.  For a
percpu_counter with a big batch, its deviation could be big, so when the
counter's batch is runtime changed to a smaller value for better accuracy,
there could also be requirment to reduce the big deviation.

So add a percpu-counter sync function to be run on each CPU.

Reported-by: kernel test robot &lt;rong.a.chen@intel.com&gt;
Signed-off-by: Feng Tang &lt;feng.tang@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Qian Cai &lt;cai@lca.pw&gt;
Cc: Andi Kleen &lt;andi.kleen@intel.com&gt;
Cc: Huang Ying &lt;ying.huang@intel.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Haiyang Zhang &lt;haiyangz@microsoft.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: "K. Y. Srinivasan" &lt;kys@microsoft.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Tim Chen &lt;tim.c.chen@intel.com&gt;
Link: http://lkml.kernel.org/r/1594389708-60781-4-git-send-email-feng.tang@intel.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>notifier: Remove notifier header file wherever not used</title>
<updated>2018-08-30T10:56:40+00:00</updated>
<author>
<name>Mukesh Ojha</name>
<email>mojha@codeaurora.org</email>
</author>
<published>2018-08-24T12:33:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=13ba17bee18e321b073b49a88dcab10881f757da'/>
<id>urn:sha1:13ba17bee18e321b073b49a88dcab10881f757da</id>
<content type='text'>
The conversion of the hotplug notifiers to a state machine left the
notifier.h includes around in some places. Remove them.

Signed-off-by: Mukesh Ojha &lt;mojha@codeaurora.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/1535114033-4605-1-git-send-email-mojha@codeaurora.org

</content>
</entry>
</feed>
