<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/tools/perf/util/stat-display.c, branch v6.12.80</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-08-16T22:35:18+00:00</updated>
<entry>
<title>perf stat: Display iostat headers correctly</title>
<updated>2024-08-16T22:35:18+00:00</updated>
<author>
<name>Yicong Yang</name>
<email>yangyicong@hisilicon.com</email>
</author>
<published>2024-08-02T06:58:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2615639352420e6e3115952c5b8f46846e1c6d0e'/>
<id>urn:sha1:2615639352420e6e3115952c5b8f46846e1c6d0e</id>
<content type='text'>
Currently we'll only print metric headers for metric leader in
aggregration mode. This will make `perf iostat` header not shown
since it'll aggregrated globally but don't have metric events:

  root@ubuntu204:/home/yang/linux/tools/perf# ./perf stat --iostat --timeout 1000
   Performance counter stats for 'system wide':
      port
  0000:00                    0                    0                    0                    0
  0000:80                    0                    0                    0                    0
  [...]

Fix this by excluding the iostat in the check of printing metric
headers. Then we can see the headers:

  root@ubuntu204:/home/yang/linux/tools/perf# ./perf stat --iostat --timeout 1000
   Performance counter stats for 'system wide':
      port             Inbound Read(MB)    Inbound Write(MB)    Outbound Read(MB)   Outbound Write(MB)
  0000:00                    0                    0                    0                    0
  0000:80                    0                    0                    0                    0
  [...]

Fixes: 193a9e30207f5477 ("perf stat: Don't display metric header for non-leader uncore events")
Signed-off-by: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Acked-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jonathan Cameron &lt;jonathan.cameron@huawei.com&gt;
Cc: Junhao He &lt;hejunhao3@huawei.com&gt;
Cc: linuxarm@huawei.com
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Shameerali Kolothum Thodi &lt;shameerali.kolothum.thodi@huawei.com&gt;
Cc: Zeng Tao &lt;prime.zeng@hisilicon.com&gt;
Link: https://lore.kernel.org/r/20240802065800.48774-1-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf stat: Use field separator in the metric header</title>
<updated>2024-06-28T17:58:09+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2024-06-28T00:06:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b195701e9f0f78f30d500ada246b3693f26e11ee'/>
<id>urn:sha1:b195701e9f0f78f30d500ada246b3693f26e11ee</id>
<content type='text'>
It didn't use the passed field separator (using -x option) when it
prints the metric headers and always put "," between the fields.

Before:
  $ sudo ./perf stat -a -x : --per-core -M tma_core_bound --metric-only true
  core,cpus,%  tma_core_bound:     &lt;&lt;&lt;--- here: "core,cpus," but ":" expected
  S0-D0-C0:2:10.5:
  S0-D0-C1:2:14.8:
  S0-D0-C2:2:9.9:
  S0-D0-C3:2:13.2:

After:
  $ sudo ./perf stat -a -x : --per-core -M tma_core_bound --metric-only true
  core:cpus:%  tma_core_bound:
  S0-D0-C0:2:10.5:
  S0-D0-C1:2:15.0:
  S0-D0-C2:2:16.5:
  S0-D0-C3:2:12.5:

Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Link: https://lore.kernel.org/r/20240628000604.1296808-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf stat: Fix a segfault with --per-cluster --metric-only</title>
<updated>2024-06-28T17:56:03+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2024-06-28T00:06:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=caa463bb79a82ac5fce05a0dcf7893d94c84fc5e'/>
<id>urn:sha1:caa463bb79a82ac5fce05a0dcf7893d94c84fc5e</id>
<content type='text'>
The new --per-cluster option was added recently but it forgot to update
the aggr_header fields which are used for --metric-only option.  And it
resulted in a segfault due to NULL string in fputs().

Fixes: cbc917a1b03b ("perf stat: Support per-cluster aggregation")
Reviewed-by: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Tested-by: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Link: https://lore.kernel.org/r/20240628000604.1296808-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf stat: Don't display metric header for non-leader uncore events</title>
<updated>2024-05-11T16:03:13+00:00</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2024-05-10T05:13:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=193a9e30207f54777ff42d0d8be8389edc522277'/>
<id>urn:sha1:193a9e30207f54777ff42d0d8be8389edc522277</id>
<content type='text'>
On an Intel tigerlake laptop a metric like:

    {
        "BriefDescription": "Test",
        "MetricExpr": "imc_free_running@data_read@ + imc_free_running@data_write@",
        "MetricGroup": "Test",
        "MetricName": "Test",
        "ScaleUnit": "6.103515625e-5MiB"
    },

Will have 4 events:

  uncore_imc_free_running_0/data_read/
  uncore_imc_free_running_0/data_write/
  uncore_imc_free_running_1/data_read/
  uncore_imc_free_running_1/data_write/

If aggregration is disabled with metric-only 2 column headers are
needed:

  $ perf stat -M test --metric-only -A -a sleep 1

   Performance counter stats for 'system wide':

                    MiB  Test            MiB  Test
  CPU0                 1821.0               1820.5

But when not, the counts aggregated in the metric leader and only 1
column should be shown:

  $ perf stat -M test --metric-only -a sleep 1
   Performance counter stats for 'system wide':

              MiB  Test
                5909.4

         1.001258915 seconds time elapsed

Achieve this by skipping events that aren't metric leaders when
printing column headers and aggregation isn't disabled.

The bug is long standing, the fixes tag is set to a refactor as that
is as far back as is reasonable to backport.

Fixes: 088519f318be3a41 ("perf stat: Move the display functions to stat-display.c")
Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kaige Ye &lt;ye@kaige.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Link: https://lore.kernel.org/r/20240510051309.2452468-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf stat: Fix metric-only aggregation index</title>
<updated>2024-02-22T16:57:29+00:00</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2024-02-21T07:07:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bafd4e75c1ac5a9da0aec5c7c52c7a72613a0cf3'/>
<id>urn:sha1:bafd4e75c1ac5a9da0aec5c7c52c7a72613a0cf3</id>
<content type='text'>
Aggregation index was being computed using the evsel's cpumap which
may have a different (typically the same or fewer) entries.

Before:
```
$ perf stat --metric-only -A -M memory_bandwidth_total -a sleep 1

 Performance counter stats for 'system wide':

       MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total
CPU0                            12.8                           0.0                          12.9                          12.7                           0.0                          12.6
CPU1

       1.007806367 seconds time elapsed
```

After:
```
$ perf stat --metric-only -A -M memory_bandwidth_total -a sleep 1

 Performance counter stats for 'system wide':

       MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total
CPU0                            15.4                           0.0                          15.3                          15.0                           0.0                          14.9
CPU18                            0.0                           0.0                          13.5                           5.2                           0.0                          11.9

       1.007858736 seconds time elapsed
```

Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;                                  |
Acked-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Kaige Ye &lt;ye@kaige.org&gt;
Cc: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: John Garry &lt;john.g.garry@oracle.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20240221070754.4163916-3-irogers@google.com
</content>
</entry>
<entry>
<title>perf stat: Avoid metric-only segv</title>
<updated>2024-02-13T21:48:09+00:00</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2024-02-09T20:49:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2543947c77e0e224bda86b4e7220c2f6714da463'/>
<id>urn:sha1:2543947c77e0e224bda86b4e7220c2f6714da463</id>
<content type='text'>
Cycles is recognized as part of a hard coded metric in stat-shadow.c,
it may call print_metric_only with a NULL fmt string leading to a
segfault. Handle the NULL fmt explicitly.

Fixes: 088519f318be ("perf stat: Move the display functions to stat-display.c")
Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Reviewed-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Cc: James Clark &lt;james.clark@arm.com&gt;
Cc: Kaige Ye &lt;ye@kaige.org&gt;
Cc: John Garry &lt;john.g.garry@oracle.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20240209204947.3873294-4-irogers@google.com
</content>
</entry>
<entry>
<title>perf stat: Support per-cluster aggregation</title>
<updated>2024-02-09T22:59:53+00:00</updated>
<author>
<name>Yicong Yang</name>
<email>yangyicong@hisilicon.com</email>
</author>
<published>2024-02-08T02:40:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cbc917a1b03bce85f385c1e640c9dcb02ffb9ab0'/>
<id>urn:sha1:cbc917a1b03bce85f385c1e640c9dcb02ffb9ab0</id>
<content type='text'>
Some platforms have 'cluster' topology and CPUs in the cluster will
share resources like L3 Cache Tag (for HiSilicon Kunpeng SoC) or L2
cache (for Intel Jacobsville). Currently parsing and building cluster
topology have been supported since [1].

perf stat has already supported aggregation for other topologies like
die or socket, etc. It'll be useful to aggregate per-cluster to find
problems like L3T bandwidth contention.

This patch add support for "--per-cluster" option for per-cluster
aggregation. Also update the docs and related test. The output will
be like:

[root@localhost tmp]# perf stat -a -e LLC-load --per-cluster -- sleep 5

 Performance counter stats for 'system wide':

S56-D0-CLS158    4      1,321,521,570      LLC-load
S56-D0-CLS594    4        794,211,453      LLC-load
S56-D0-CLS1030    4             41,623      LLC-load
S56-D0-CLS1466    4             41,646      LLC-load
S56-D0-CLS1902    4             16,863      LLC-load
S56-D0-CLS2338    4             15,721      LLC-load
S56-D0-CLS2774    4             22,671      LLC-load
[...]

On a legacy system without cluster or cluster support, the output will
be look like:
[root@localhost perf]# perf stat -a -e cycles --per-cluster -- sleep 1

 Performance counter stats for 'system wide':

S56-D0-CLS0   64         18,011,485      cycles
S7182-D0-CLS0   64         16,548,835      cycles

Note that this patch doesn't mix the cluster information in the outputs
of --per-core to avoid breaking any tools/scripts using it.

Note that perf recently supports "--per-cache" aggregation, but it's not
the same with the cluster although cluster CPUs may share some cache
resources. For example on my machine all clusters within a die share the
same L3 cache:
$ cat /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list
0-31
$ cat /sys/devices/system/cpu/cpu0/topology/cluster_cpus_list
0-3

[1] commit c5e22feffdd7 ("topology: Represent clusters of CPUs within a die")

Tested-by: Jie Zhan &lt;zhanjie9@hisilicon.com&gt;
Reviewed-by: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Signed-off-by: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Cc: james.clark@arm.com
Cc: 21cnbao@gmail.com
Cc: prime.zeng@hisilicon.com
Cc: Jonathan.Cameron@huawei.com
Cc: fanghao11@huawei.com
Cc: linuxarm@huawei.com
Cc: tim.c.chen@intel.com
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20240208024026.2691-1-yangyicong@huawei.com
</content>
</entry>
<entry>
<title>perf stat: Combine the -A/--no-aggr and --no-merge options</title>
<updated>2023-12-14T21:24:38+00:00</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2023-12-14T06:02:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6f33e6fa29d0366d6e5b3ea2930dbc0b648151fe'/>
<id>urn:sha1:6f33e6fa29d0366d6e5b3ea2930dbc0b648151fe</id>
<content type='text'>
The -A or --no-aggr option disables aggregation of core events:

  $ perf stat -A -e cycles,data_total -a true

   Performance counter stats for 'system wide':

  CPU0            1,287,665      cycles
  CPU1            1,831,681      cycles
  CPU2           27,345,998      cycles
  CPU3            1,964,799      cycles
  CPU4              236,174      cycles
  CPU5            3,302,825      cycles
  CPU6            9,201,446      cycles
  CPU7            1,403,043      cycles
  CPU0               110.90 MiB  data_total

         0.008961761 seconds time elapsed

The --no-merge option disables the aggregation of uncore events:

  $ perf stat --no-merge -e cycles,data_total -a true

   Performance counter stats for 'system wide':

          38,482,778      cycles
               15.04 MiB  data_total [uncore_imc_free_running_1]
               15.00 MiB  data_total [uncore_imc_free_running_0]

         0.005915155 seconds time elapsed

Having two options confuses users who generally don't appreciate the
difference in PMUs. Keep all the options but make it so they all
disable aggregation both of core and uncore events:

  $ perf stat -A -e cycles,data_total -a true

   Performance counter stats for 'system wide':

  CPU0               85,878      cycles
  CPU1               88,179      cycles
  CPU2               60,872      cycles
  CPU3            3,265,567      cycles
  CPU4               82,357      cycles
  CPU5               83,383      cycles
  CPU6               84,156      cycles
  CPU7              220,803      cycles
  CPU0                 2.38 MiB  data_total [uncore_imc_free_running_0]
  CPU0                 2.38 MiB  data_total [uncore_imc_free_running_1]

         0.001397205 seconds time elapsed

Update the relevant 'perf stat' man page information.

Reviewed-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Athira Jajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: Changbin Du &lt;changbin.du@huawei.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Clark &lt;james.clark@arm.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: John Garry &lt;john.g.garry@oracle.com&gt;
Cc: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Cc: Kaige Ye &lt;ye@kaige.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20231214060256.2094017-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf stat-display: Check if snprintf()'s fmt argument is NULL</title>
<updated>2023-08-21T13:54:22+00:00</updated>
<author>
<name>Kaige Ye</name>
<email>ye@kaige.org</email>
</author>
<published>2023-08-04T02:09:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=58a8d2edd57b35307d81ec6c304697707e6067c1'/>
<id>urn:sha1:58a8d2edd57b35307d81ec6c304697707e6067c1</id>
<content type='text'>
It is undefined behavior to pass NULL as snprintf()'s fmt argument.
Here is an example to trigger the problem:

  $ perf stat --metric-only -x, -e instructions -- sleep 1
  insn per cycle,
  Segmentation fault (core dumped)

With this patch:

  $ perf stat --metric-only -x, -e instructions -- sleep 1
  insn per cycle,
  ,

Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Signed-off-by: Kaige Ye &lt;ye@kaige.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/01CA7674B690CA24+20230804020907.144562-2-ye@kaige.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf stat: Don't display zero tool counts</title>
<updated>2023-08-08T17:33:57+00:00</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2023-08-01T20:54:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=487ae3b42d1040b4cd5ff9754e7516b409204029'/>
<id>urn:sha1:487ae3b42d1040b4cd5ff9754e7516b409204029</id>
<content type='text'>
Andi reported (see link below) a regression when printing the
'duration_time' tool event, where it gets printed as "not counted" for
most of the CPUs, fix it by skipping zero counts for tool events.

Reported-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Tested-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: Claire Jensen &lt;cjense@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/all/ZMlrzcVrVi1lTDmn@tassilo/
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
</feed>
