<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/tools/perf/util/Build, branch linux-6.9.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-6.9.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-6.9.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-03-04T06:51:55+00:00</updated>
<entry>
<title>perf threads: Move threads to its own files</title>
<updated>2024-03-04T06:51:55+00:00</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2024-03-01T05:36:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=93bb5b0d9394cbf49b76823c48ed8b815a5d899c'/>
<id>urn:sha1:93bb5b0d9394cbf49b76823c48ed8b815a5d899c</id>
<content type='text'>
Move threads out of machine and into its own file.

Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Acked-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Yang Jihong &lt;yangjihong1@huawei.com&gt;
Cc: Oliver Upton &lt;oliver.upton@linux.dev&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20240301053646.1449657-6-irogers@google.com
</content>
</entry>
<entry>
<title>perf: util: use capstone disasm engine to show assembly instructions</title>
<updated>2024-02-21T02:06:48+00:00</updated>
<author>
<name>Changbin Du</name>
<email>changbin.du@huawei.com</email>
</author>
<published>2024-02-17T07:40:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8f0ec15ff66243896ff3e534696c6af7ff013901'/>
<id>urn:sha1:8f0ec15ff66243896ff3e534696c6af7ff013901</id>
<content type='text'>
Currently, the instructions of samples are shown as raw hex strings
which are hard to read. x86 has a special option '--xed' to disassemble
the hex string via the Intel XED tool.

Use capstone as the disassembler engine to show more readable
instructions. libcapstone is selected because it provides more
instruction details. Perf falls back to raw instructions if libcapstone
is not available.

The advantages compared to the XED tool:
 * Supports arm, arm64, x86-32 and x86_64 (more could be supported);
   XED supports only x86_64.
 * Immediate address operands are shown as symbol+offs.
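
A minimal sketch of the fallback behavior described above, written in
Python for brevity rather than perf's C. It assumes the optional capstone
Python bindings may or may not be installed; format_raw() and show_insn()
are hypothetical names, and the hex layout is illustrative, not perf's
exact output.

```python
# Sketch only: perf itself is C and links libcapstone directly.
try:
    from capstone import Cs, CS_ARCH_X86, CS_MODE_64
    HAVE_CAPSTONE = True
except ImportError:
    HAVE_CAPSTONE = False


def format_raw(code):
    """Fallback: render the instruction bytes as space-separated hex."""
    return " ".join(format(b, "02x") for b in code)


def show_insn(code, addr=0):
    """Return disassembly if capstone is usable, else the raw hex bytes."""
    if HAVE_CAPSTONE:
        md = Cs(CS_ARCH_X86, CS_MODE_64)
        lines = [f"{i.mnemonic} {i.op_str}".strip() for i in md.disasm(code, addr)]
        if lines:
            return "; ".join(lines)
    return format_raw(code)


# x86_64 bytes for a typical function prologue.
print(show_insn(b"\x55\x48\x89\xe5"))
```

The optional import mirrors the build-time feature check: callers always
get a printable string, whichever path is taken.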

Signed-off-by: Changbin Du &lt;changbin.du@huawei.com&gt;
Reviewed-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: changbin.du@gmail.com
Cc: Thomas Richter &lt;tmricht@linux.ibm.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20240217074046.4100789-3-changbin.du@huawei.com
</content>
</entry>
<entry>
<title>perf annotate-data: Add find_data_type() to get type from memory access</title>
<updated>2023-12-24T01:39:18+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2023-12-13T00:13:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b9c87f536c6f28c75ace8a014646faad00f0e1ec'/>
<id>urn:sha1:b9c87f536c6f28c75ace8a014646faad00f0e1ec</id>
<content type='text'>
find_data_type() gets a data type from the memory access at the given
address (IP) using a register and an offset.

It requires DWARF debug info in the DSO and searches the list of
variables and function parameters in the scope.

In pseudocode, it basically does the following:

  find_data_type(dso, ip, reg, offset)
  {
      pc = map__rip_2objdump(ip);
      CU = dwarf_addrdie(dso-&gt;dwarf, pc);
      scopes = die_get_scopes(CU, pc);
      for_each_scope(S, scopes) {
          V = die_find_variable_by_reg(S, pc, reg);
          if (V &amp;&amp; V.type == pointer_type) {
              T = die_get_real_type(V);
              if (offset &lt; T.size)
                  return T;
          }
      }
      return NULL;
  }
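
The scope walk in the pseudocode can be modeled without DWARF as a toy
Python sketch. DataType, Variable and the list-of-lists scopes layout are
hypothetical stand-ins for the DIE lookups, not perf's data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataType:
    name: str
    size: int


@dataclass
class Variable:
    reg: int                      # register holding the variable at this pc
    pointee: Optional[DataType]   # set only when the variable is a pointer


def find_data_type(scopes, reg, offset):
    """Visit scopes in the order given; return the pointed-to type containing offset."""
    for scope in scopes:
        for var in scope:
            # Only pointer variables living in the requested register qualify.
            if var.reg != reg or var.pointee is None:
                continue
            # The offset must fall inside the pointed-to type.
            if offset in range(var.pointee.size):
                return var.pointee
    return None


task = DataType("struct task", 64)
scopes = [
    [Variable(reg=3, pointee=None)],   # first scope: reg 3 is not a pointer
    [Variable(reg=3, pointee=task)],   # next scope: reg 3 points to struct task
]
print(find_data_type(scopes, reg=3, offset=16))
```

A non-pointer match in one scope does not end the search; the loop simply
moves on to the next scope, as in the pseudocode.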

Committer notes:

The 'size' variable in check_variable() is 64-bit, so use PRIu64 and
inttypes.h to debug it.

Ditto at find_data_type_die().

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231213001323.718046-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>Merge remote-tracking branch 'torvalds/master' into perf-tools-next</title>
<updated>2023-12-19T00:37:07+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2023-12-19T00:37:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ab1c247094e323177a578b38f0325bf79f0317ac'/>
<id>urn:sha1:ab1c247094e323177a578b38f0325bf79f0317ac</id>
<content type='text'>
To pick up fixes that went through perf-tools for v6.7 and to get in sync
with upstream to check for drift in the copies of headers, etc.

Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf build: Ensure sysreg-defs Makefile respects output dir</title>
<updated>2023-11-22T19:17:53+00:00</updated>
<author>
<name>Oliver Upton</name>
<email>oliver.upton@linux.dev</email>
</author>
<published>2023-11-21T19:29:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a29ee6aea7030786a63fde0d6d83a8f477b060fb'/>
<id>urn:sha1:a29ee6aea7030786a63fde0d6d83a8f477b060fb</id>
<content type='text'>
Currently the sysreg-defs are written out to the source tree
unconditionally, ignoring the specified output directory. Correct the
build rule to emit the header to the output directory. Opportunistically
reorganize the rules to avoid interleaving with the set of beauty make
rules.

Reported-by: Ian Rogers &lt;irogers@google.com&gt;
Signed-off-by: Oliver Upton &lt;oliver.upton@linux.dev&gt;
Link: https://lore.kernel.org/r/20231121192956.919380-3-oliver.upton@linux.dev
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf tools: Add util/debuginfo.[ch] files</title>
<updated>2023-11-10T12:03:54+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2023-11-09T23:59:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6f1b6291cf73cb3223f6fb9ec16862a5fe7ed957'/>
<id>urn:sha1:6f1b6291cf73cb3223f6fb9ec16862a5fe7ed957</id>
<content type='text'>
Split debuginfo data structure and related functions into a separate
file so that it can be used by other components than the probe-finder.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231110000012.3538610-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'perf-tools-for-v6.7-1-2023-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools</title>
<updated>2023-11-03T18:17:38+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-11-03T18:17:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7ab89417ed235f56d84c7893d38d4905e38d2692'/>
<id>urn:sha1:7ab89417ed235f56d84c7893d38d4905e38d2692</id>
<content type='text'>
Pull perf tools updates from Namhyung Kim:
 "Build:

   - Compile BPF programs by default if clang (&gt;= 12.0.1) is available
     to enable more features like kernel lock contention, off-cpu
     profiling, kwork, sample filtering and so on.

     This can be disabled by passing BUILD_BPF_SKEL=0 to make.

   - Produce better error messages for bison on debug build (make
     DEBUG=1) by defining YYDEBUG symbol internally.

  perf record:

   - Track sideband events (like FORK/MMAP) from all CPUs even if perf
     record targets a subset of CPUs only (using -C option). Otherwise
     it may lose information about events that happened on a CPU
     outside the target list.

   - Fix checking raw sched_switch tracepoint argument using system BTF.
     This affects off-cpu profiling which attaches a BPF program to the
     raw tracepoint.

  perf lock contention:

   - Add --lock-cgroup option to see contention by cgroups. This should
     be used with BPF only (using -b option).

       $ sudo perf lock con -ab --lock-cgroup -- sleep 1
        contended   total wait     max wait     avg wait   cgroup

              835     14.06 ms     41.19 us     16.83 us   /system.slice/led.service
               25    122.38 us     13.77 us      4.89 us   /
               44     23.73 us      3.87 us       539 ns   /user.slice/user-657345.slice/session-c4.scope
                1       491 ns       491 ns       491 ns   /system.slice/connectd.service

   - Add -G/--cgroup-filter option to see contention only for given
     cgroups.

     This can be useful when you have identified a cgroup with the above
     command and want to investigate it further. It also works with
     other output options like -t/--threads and -l/--lock-addr.

       $ sudo perf lock con -ab -G /user.slice/user-657345.slice/session-c4.scope -- sleep 1
        contended   total wait     max wait     avg wait         type   caller

                8     77.11 us     17.98 us      9.64 us     spinlock   futex_wake+0xc8
                2     24.56 us     14.66 us     12.28 us     spinlock   tick_do_update_jiffies64+0x25
                1      4.97 us      4.97 us      4.97 us     spinlock   futex_q_lock+0x2a

   - Use per-cpu array for better spinlock tracking. This is to improve
     performance of the BPF program and to avoid nested contention on a
     lock in the BPF hash map.

   - Update callstack check for PowerPC. To find a representative caller
     of a lock, it needs to look up the call stacks. It ends the lookup
     when it sees 0 in the call stack buffer. However, PowerPC call
     stacks can have 0 values at the beginning, so skip them when valid
     entries are expected to follow.
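
     The zero handling in the PowerPC item above can be sketched as
     follows. This is a Python illustration only: valid_entries() is a
     hypothetical helper, and the real code also picks a representative
     caller by symbol, which is not modeled here.

```python
def valid_entries(stack):
    """Return call stack entries, ignoring leading zeros and stopping at a later zero."""
    i = 0
    # Skip zero entries at the start of the buffer (seen on PowerPC).
    while i != len(stack) and stack[i] == 0:
        i += 1
    entries = []
    # After the leading zeros, a zero marks the end of the call stack.
    for addr in stack[i:]:
        if addr == 0:
            break
        entries.append(addr)
    return entries


print([hex(a) for a in valid_entries([0, 0, 0x10, 0x20, 0, 0x30])])
```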

  perf kwork:

   - Support 'sched' class (for -k option) so that it can see task
     scheduling events (using sched_switch tracepoint) as well as irq and
     workqueue items.

   - Add perf kwork top subcommand to show more accurate cpu utilization
     with the sched class above. It works both with recorded data (using
     perf kwork record command) and with BPF (using -b option). Unlike
     the perf top command, it does not support interactive mode (yet).

       $ sudo perf kwork top -b -k sched
       Starting trace, Hit &lt;Ctrl+C&gt; to stop and report
       ^C
       Total  : 160702.425 ms, 8 cpus
       %Cpu(s):  36.00% id,   0.00% hi,   0.00% si
       %Cpu0   [||||||||||||||||||              61.66%]
       %Cpu1   [||||||||||||||||||              61.27%]
       %Cpu2   [|||||||||||||||||||             66.40%]
       %Cpu3   [||||||||||||||||||              61.28%]
       %Cpu4   [||||||||||||||||||              61.82%]
       %Cpu5   [|||||||||||||||||||||||         77.41%]
       %Cpu6   [||||||||||||||||||              61.73%]
       %Cpu7   [||||||||||||||||||              63.25%]

             PID     SPID    %CPU           RUNTIME  COMMMAND
         -------------------------------------------------------------
               0        0   38.72       8089.463 ms  [swapper/1]
               0        0   38.71       8084.547 ms  [swapper/3]
               0        0   38.33       8007.532 ms  [swapper/0]
               0        0   38.26       7992.985 ms  [swapper/6]
               0        0   38.17       7971.865 ms  [swapper/4]
               0        0   36.74       7447.765 ms  [swapper/7]
               0        0   33.59       6486.942 ms  [swapper/2]
               0        0   22.58       3771.268 ms  [swapper/5]
            9545     9351    2.48        447.136 ms  sched-messaging
            9574     9351    2.09        418.583 ms  sched-messaging
            9724     9351    2.05        372.407 ms  sched-messaging
            9531     9351    2.01        368.804 ms  sched-messaging
            9512     9351    2.00        362.250 ms  sched-messaging
            9514     9351    1.95        357.767 ms  sched-messaging
            9538     9351    1.86        384.476 ms  sched-messaging
            9712     9351    1.84        386.490 ms  sched-messaging
            9723     9351    1.83        380.021 ms  sched-messaging
            9722     9351    1.82        382.738 ms  sched-messaging
            9517     9351    1.81        354.794 ms  sched-messaging
            9559     9351    1.79        344.305 ms  sched-messaging
            9725     9351    1.77        365.315 ms  sched-messaging
       &lt;SNIP&gt;

   - Add hard/soft-irq statistics to perf kwork top. This will show the
     total CPU utilization with IRQ stats like below:

       $ sudo perf kwork top -b -k sched,irq,softirq
       Starting trace, Hit &lt;Ctrl+C&gt; to stop and report
       ^C
       Total  :  12554.889 ms, 8 cpus
       %Cpu(s):  96.23% id,   0.10% hi,   0.19% si      &lt;---- here
       %Cpu0   [|                                4.60%]
       %Cpu1   [|                                4.59%]
       %Cpu2   [                                 2.73%]
       %Cpu3   [|                                3.81%]
       &lt;SNIP&gt;

  perf bench:

   - Add -G/--cgroups option to perf bench sched pipe. The pipe bench is
     good for measuring context switch overhead. With this option, it
     puts the reader and writer tasks in separate cgroups to force
     context switches between two different cgroups.

     The tasks also need to be pinned to a single CPU (CPU affinity) to
     accurately measure the impact of cgroup context switches.

       $ sudo perf stat -e context-switches,cgroup-switches -- \
       &gt; taskset -c 0 perf bench sched pipe -l 100000
       # Running 'sched/pipe' benchmark:
       # Executed 100000 pipe operations between two processes

            Total time: 0.307 [sec]

              3.078180 usecs/op
                324867 ops/sec

        Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000':

                  200,026      context-switches
                       63      cgroup-switches

              0.321637922 seconds time elapsed

     You can see a small number of cgroup-switches because both the
     writer and reader tasks are in the same cgroup.

       $ sudo mkdir /sys/fs/cgroup/{AAA,BBB}

       $ sudo perf stat -e context-switches,cgroup-switches -- \
       &gt; taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB
       # Running 'sched/pipe' benchmark:
       # Executed 100000 pipe operations between two processes

            Total time: 0.351 [sec]

              3.512990 usecs/op
                284657 ops/sec

        Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB':

                  200,020      context-switches
                  200,019      cgroup-switches

              0.365034567 seconds time elapsed

     Now context-switches and cgroup-switches are almost the same, and
     you can see the pipe operation took a little longer.

   - Kill child processes when perf bench sched messaging exits
     abnormally. Otherwise the children would be left doing unnecessary
     work.
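
     As a rough cross-check of the two sched pipe runs above, the sketch
     below recomputes ops/sec from the reported usecs/op and compares
     cgroup-switches with context-switches. The dictionary layout and
     the ops_per_sec() helper are illustrative assumptions, and
     truncating to an integer is a guess that happens to reproduce the
     printed figures.

```python
# Figures copied from the two perf stat runs quoted above.
runs = {
    "same cgroup": {"usecs_per_op": 3.078180, "ctx": 200026, "cgrp": 63},
    "AAA,BBB": {"usecs_per_op": 3.512990, "ctx": 200020, "cgrp": 200019},
}


def ops_per_sec(usecs_per_op):
    """Operations per second, truncated to an integer."""
    return int(1_000_000 / usecs_per_op)


for name, r in runs.items():
    # Fraction of context switches that also crossed a cgroup boundary.
    ratio = r["cgrp"] / r["ctx"]
    print(name, ops_per_sec(r["usecs_per_op"]), f"{ratio:.4f}")

# Relative per-operation cost of putting the two tasks in separate cgroups.
slowdown = runs["AAA,BBB"]["usecs_per_op"] / runs["same cgroup"]["usecs_per_op"] - 1
print(f"{slowdown:.1%}")
```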

  perf test:

   - Fix various shellcheck issues on the tests written in shell script.

   - Skip tests when condition is not satisfied:
      - object code reading test for non-text section addresses.
      - CoreSight test if cs_etm// event is not available.
      - lock contention test if not enough CPUs.

  Event parsing:

   - Make PMU alias name loading lazy to reduce the startup time in the
     event parsing code for perf record, stat and others in the general
     case.

   - Lazily compute PMU default config. In the same sense, delay PMU
     initialization until it's really needed to reduce the startup cost.

   - Fix event term values that are raw events. The event specification
     can have several terms, including the event name, which sometimes
     clashes with the raw event encoding: an 'r' followed by hex digits.

     For example, an event named 'read' should be processed as a normal
     event, but it was mistaken for a raw encoding and caused a
     failure.

       $ perf stat -e 'uncore_imc_free_running/event=read/' -a sleep 1
       event syntax error: '..nning/event=read/'
                                         \___ parser error
       Run 'perf list' for a list of valid events

        Usage: perf stat [&lt;options&gt;] [&lt;command&gt;]

           -e, --event &lt;event&gt; event selector. use 'perf list' to list available events
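
     Why 'read' clashes can be seen with a small Python sketch. The
     pattern is an assumption derived from the description above ('r'
     followed by hex digits); it is not the rule used by perf's lexer,
     and looks_like_raw() is a hypothetical helper.

```python
import re

# Assumed shape of a raw event encoding: 'r' followed only by hex digits.
RAW_EVENT = re.compile(r"r[0-9a-fA-F]+")


def looks_like_raw(term):
    """True if the whole term matches the assumed raw encoding shape."""
    return RAW_EVENT.fullmatch(term) is not None


for name in ("read", "r1a8", "write"):
    print(name, looks_like_raw(name))
```

     Since 'e', 'a' and 'd' are all hex digits, 'read' matches the raw
     shape, while 'write' does not.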

  Event metrics:

   - Add "Compat" regex to match event with multiple identifiers.

   - Usual updates for Intel, Power10, Arm telemetry/CMN and AmpereOne.

  Misc:

   - Assorted memory leak fixes and footprint reduction.

   - Add "bpf_skeletons" to perf version --build-options so that users
     can check whether their perf tools have BPF support easily.

   - Fix unaligned access in Intel-PT packet decoder found by
     undefined-behavior sanitizer.

   - Avoid frequency mode for the dummy event. Surprisingly, it impacted
     kernel timer tick handler performance by forcing iteration over all
     PMU events.

   - Update bash shell completion for events and metrics"

* tag 'perf-tools-for-v6.7-1-2023-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (187 commits)
  perf vendor events intel: Update tsx_cycles_per_elision metrics
  perf vendor events intel: Update bonnell version number to v5
  perf vendor events intel: Update westmereex events to v4
  perf vendor events intel: Update meteorlake events to v1.06
  perf vendor events intel: Update knightslanding events to v16
  perf vendor events intel: Add typo fix for ivybridge FP
  perf vendor events intel: Update a spelling in haswell/haswellx
  perf vendor events intel: Update emeraldrapids to v1.01
  perf vendor events intel: Update alderlake/alderlake events to v1.23
  perf build: Disable BPF skeletons if clang version is &lt; 12.0.1
  perf callchain: Fix spelling mistake "statisitcs" -&gt; "statistics"
  perf report: Fix spelling mistake "heirachy" -&gt; "hierarchy"
  perf python: Fix binding linkage due to rename and move of evsel__increase_rlimit()
  perf tests: test_arm_coresight: Simplify source iteration
  perf vendor events intel: Add tigerlake two metrics
  perf vendor events intel: Add broadwellde two metrics
  perf vendor events intel: Fix broadwellde tma_info_system_dram_bw_use metric
  perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit
  perf callchain: Minor layout changes to callchain_list
  perf callchain: Make brtype_stat in callchain_list optional
  ...
</content>
</entry>
<entry>
<title>perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit</title>
<updated>2023-10-25T20:39:58+00:00</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2023-10-24T22:23:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=56e144fe98260a0f8a17326993ceb576ef859ed5'/>
<id>urn:sha1:56e144fe98260a0f8a17326993ceb576ef859ed5</id>
<content type='text'>
Fix a leak where mem_info__put() wouldn't release the maps/map used by
perf mem. Add exit functions and use them elsewhere that the maps and
map are released.

Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Cc: Ravi Bangoria &lt;ravi.bangoria@amd.com&gt;
Cc: Sandipan Das &lt;sandipan.das@amd.com&gt;
Cc: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Cc: German Gomez &lt;german.gomez@arm.com&gt;
Cc: James Clark &lt;james.clark@arm.com&gt;
Cc: Nick Terrell &lt;terrelln@fb.com&gt;
Cc: Sean Christopherson &lt;seanjc@google.com&gt;
Cc: Changbin Du &lt;changbin.du@huawei.com&gt;
Cc: liuwenyu &lt;liuwenyu7@huawei.com&gt;
Cc: Yang Jihong &lt;yangjihong1@huawei.com&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: Yanteng Si &lt;siyanteng@loongson.cn&gt;
Cc: Liam Howlett &lt;liam.howlett@oracle.com&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Link: https://lore.kernel.org/r/20231024222353.3024098-12-irogers@google.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf build: Generate arm64's sysreg-defs.h and add to include path</title>
<updated>2023-10-18T23:36:25+00:00</updated>
<author>
<name>Oliver Upton</name>
<email>oliver.upton@linux.dev</email>
</author>
<published>2023-10-11T19:57:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e2bdd172e6652c2f5554d125a5048bc9f9b0dfa3'/>
<id>urn:sha1:e2bdd172e6652c2f5554d125a5048bc9f9b0dfa3</id>
<content type='text'>
Start generating sysreg-defs.h in anticipation of updating sysreg.h to a
version that needs the generated output.

Acked-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Acked-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Marc Zyngier &lt;maz@kernel.org&gt;
Link: https://lore.kernel.org/r/20231011195740.3349631-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton &lt;oliver.upton@linux.dev&gt;
</content>
</entry>
<entry>
<title>perf kwork top: Implements BPF-based cpu usage statistics</title>
<updated>2023-09-12T20:31:59+00:00</updated>
<author>
<name>Yang Jihong</name>
<email>yangjihong1@huawei.com</email>
</author>
<published>2023-08-12T08:49:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8c98420987cda501924b7a34f2413b982517d244'/>
<id>urn:sha1:8c98420987cda501924b7a34f2413b982517d244</id>
<content type='text'>
Use BPF to collect statistics on the CPU usage based on perf BPF skeletons.

Example usage:

  # perf kwork top -h

   Usage: perf kwork top [&lt;options&gt;]

      -b, --use-bpf         Use BPF to measure task cpu usage
      -C, --cpu &lt;cpu&gt;       list of cpus to profile
      -i, --input &lt;file&gt;    input file name
      -n, --name &lt;name&gt;     event name to profile
      -s, --sort &lt;key[,key2...]&gt;
                            sort by key(s): rate, runtime, tid
          --time &lt;str&gt;      Time span for analysis (start,stop)

  #
  # perf kwork -k sched top -b
  Starting trace, Hit &lt;Ctrl+C&gt; to stop and report
  ^C
  Total  : 160702.425 ms, 8 cpus
  %Cpu(s):  36.00% id,   0.00% hi,   0.00% si
  %Cpu0   [||||||||||||||||||              61.66%]
  %Cpu1   [||||||||||||||||||              61.27%]
  %Cpu2   [|||||||||||||||||||             66.40%]
  %Cpu3   [||||||||||||||||||              61.28%]
  %Cpu4   [||||||||||||||||||              61.82%]
  %Cpu5   [|||||||||||||||||||||||         77.41%]
  %Cpu6   [||||||||||||||||||              61.73%]
  %Cpu7   [||||||||||||||||||              63.25%]

        PID     SPID    %CPU           RUNTIME  COMMMAND
    -------------------------------------------------------------
          0        0   38.72       8089.463 ms  [swapper/1]
          0        0   38.71       8084.547 ms  [swapper/3]
          0        0   38.33       8007.532 ms  [swapper/0]
          0        0   38.26       7992.985 ms  [swapper/6]
          0        0   38.17       7971.865 ms  [swapper/4]
          0        0   36.74       7447.765 ms  [swapper/7]
          0        0   33.59       6486.942 ms  [swapper/2]
          0        0   22.58       3771.268 ms  [swapper/5]
       9545     9351    2.48        447.136 ms  sched-messaging
       9574     9351    2.09        418.583 ms  sched-messaging
       9724     9351    2.05        372.407 ms  sched-messaging
       9531     9351    2.01        368.804 ms  sched-messaging
       9512     9351    2.00        362.250 ms  sched-messaging
       9514     9351    1.95        357.767 ms  sched-messaging
       9538     9351    1.86        384.476 ms  sched-messaging
       9712     9351    1.84        386.490 ms  sched-messaging
       9723     9351    1.83        380.021 ms  sched-messaging
       9722     9351    1.82        382.738 ms  sched-messaging
       9517     9351    1.81        354.794 ms  sched-messaging
       9559     9351    1.79        344.305 ms  sched-messaging
       9725     9351    1.77        365.315 ms  sched-messaging
  &lt;SNIP&gt;

  # perf kwork -k sched top -b -n perf
  Starting trace, Hit &lt;Ctrl+C&gt; to stop and report
  ^C
  Total  : 151563.332 ms, 8 cpus
  %Cpu(s):  26.49% id,   0.00% hi,   0.00% si
  %Cpu0   [                                 0.01%]
  %Cpu1   [                                 0.00%]
  %Cpu2   [                                 0.00%]
  %Cpu3   [                                 0.00%]
  %Cpu4   [                                 0.00%]
  %Cpu5   [                                 0.00%]
  %Cpu6   [                                 0.00%]
  %Cpu7   [                                 0.00%]

        PID     SPID    %CPU           RUNTIME  COMMMAND
    -------------------------------------------------------------
       9754     9754    0.01          2.303 ms  perf

  #
  # perf kwork -k sched top -b -C 2,3,4
  Starting trace, Hit &lt;Ctrl+C&gt; to stop and report
  ^C
  Total  :  48016.721 ms, 3 cpus
  %Cpu(s):  27.82% id,   0.00% hi,   0.00% si
  %Cpu2   [||||||||||||||||||||||          74.68%]
  %Cpu3   [|||||||||||||||||||||           71.06%]
  %Cpu4   [|||||||||||||||||||||           70.91%]

        PID     SPID    %CPU           RUNTIME  COMMMAND
    -------------------------------------------------------------
          0        0   29.08       4734.998 ms  [swapper/4]
          0        0   28.93       4710.029 ms  [swapper/3]
          0        0   25.31       3912.363 ms  [swapper/2]
      10248    10158    1.62        264.931 ms  sched-messaging
      10253    10158    1.62        265.136 ms  sched-messaging
      10158    10158    1.60        263.013 ms  bash
      10360    10158    1.49        243.639 ms  sched-messaging
      10413    10158    1.48        238.604 ms  sched-messaging
      10531    10158    1.47        234.067 ms  sched-messaging
      10400    10158    1.47        240.631 ms  sched-messaging
      10355    10158    1.47        230.586 ms  sched-messaging
      10377    10158    1.43        234.835 ms  sched-messaging
      10526    10158    1.42        232.045 ms  sched-messaging
      10298    10158    1.41        222.396 ms  sched-messaging
      10410    10158    1.38        221.853 ms  sched-messaging
      10364    10158    1.38        226.042 ms  sched-messaging
      10480    10158    1.36        213.633 ms  sched-messaging
      10370    10158    1.36        223.620 ms  sched-messaging
      10553    10158    1.34        217.169 ms  sched-messaging
      10291    10158    1.34        211.516 ms  sched-messaging
      10251    10158    1.34        218.813 ms  sched-messaging
      10522    10158    1.33        218.498 ms  sched-messaging
      10288    10158    1.33        216.787 ms  sched-messaging
  &lt;SNIP&gt;

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Signed-off-by: Yang Jihong &lt;yangjihong1@huawei.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ravi Bangoria &lt;ravi.bangoria@amd.com&gt;
Cc: Sandipan Das &lt;sandipan.das@amd.com&gt;
Link: https://lore.kernel.org/r/20230812084917.169338-15-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
</feed>
