diff options
author | Jesper Dangaard Brouer <hawk@kernel.org> | 2024-04-16 20:51:26 +0300 |
---|---|---|
committer | Tejun Heo <tj@kernel.org> | 2024-04-17 01:10:42 +0300 |
commit | fc29e04ae1ad4c99422c0b8ae4b43cfe99c70429 (patch) | |
tree | 0e253e08ac5881d265f82bf64faa2c8c5c1b558e /include/linux/cgroup.h | |
parent | 15b8b9ab5081d8dce9aa27a594ba4db2c29cefc0 (diff) | |
download | linux-fc29e04ae1ad4c99422c0b8ae4b43cfe99c70429.tar.xz |
cgroup/rstat: add cgroup_rstat_lock helpers and tracepoints
This commit enhances the ability to troubleshoot the global
cgroup_rstat_lock by introducing wrapper helper functions for the lock
along with associated tracepoints.
Although global, the cgroup_rstat_lock helper APIs and tracepoints take
arguments such as cgroup pointer and cpu_in_loop variable. This
adjustment is made because flushing occurs per cgroup despite the lock
being global. Hence, when troubleshooting, it's important to identify the
relevant cgroup. The cpu_in_loop variable is necessary because the global
lock may be released within the main flushing loop that traverses CPUs.
In the tracepoints, the cpu_in_loop value is set to -1 when acquiring the
main lock; otherwise, it denotes the CPU number processed last.
The new feature in this patchset is detecting when lock is contended. The
tracepoints are implemented with production in mind. For minimum overhead
attach to cgroup:cgroup_rstat_lock_contended, which only gets activated
when trylock detects lock is contended. A quick production check for
issues could be done via this perf commands:
perf record -g -e cgroup:cgroup_rstat_lock_contended
Next natural question would be asking how long time do lock contenders
wait for obtaining the lock. This can be answered by measuring the time
between cgroup:cgroup_rstat_lock_contended and cgroup:cgroup_rstat_locked
when args->contended is set. Like this bpftrace script:
bpftrace -e '
tracepoint:cgroup:cgroup_rstat_lock_contended {@start[tid]=nsecs}
tracepoint:cgroup:cgroup_rstat_locked {
if (args->contended) {
@wait_ns=hist(nsecs-@start[tid]); delete(@start[tid]);}}
interval:s:1 {time("%H:%M:%S "); print(@wait_ns); }'
Extending with time spend holding the lock will be more expensive as this
also looks at all the non-contended cases.
Like this bpftrace script:
bpftrace -e '
tracepoint:cgroup:cgroup_rstat_lock_contended {@start[tid]=nsecs}
tracepoint:cgroup:cgroup_rstat_locked { @locked[tid]=nsecs;
if (args->contended) {
@wait_ns=hist(nsecs-@start[tid]); delete(@start[tid]);}}
tracepoint:cgroup:cgroup_rstat_unlock {
@locked_ns=hist(nsecs-@locked[tid]); delete(@locked[tid]);}
interval:s:1 {time("%H:%M:%S "); print(@wait_ns);print(@locked_ns); }'
Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Diffstat (limited to 'include/linux/cgroup.h')
-rw-r--r-- | include/linux/cgroup.h | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index 34aaf0e87def..2150ca60394b 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -690,7 +690,7 @@ static inline void cgroup_path_from_kernfs_id(u64 id, char *buf, size_t buflen) void cgroup_rstat_updated(struct cgroup *cgrp, int cpu); void cgroup_rstat_flush(struct cgroup *cgrp); void cgroup_rstat_flush_hold(struct cgroup *cgrp); -void cgroup_rstat_flush_release(void); +void cgroup_rstat_flush_release(struct cgroup *cgrp); /* * Basic resource stats. |