<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/tools/perf/arch, branch v6.7.3</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.7.3</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.7.3'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2023-11-22T18:57:47+00:00</updated>
<entry>
<title>tools/perf: Update tools's copy of mips syscall table</title>
<updated>2023-11-22T18:57:47+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2023-11-21T22:56:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=027905fe5baec70a00e00890e982d035d6c8b6b3'/>
<id>urn:sha1:027905fe5baec70a00e00890e982d035d6c8b6b3</id>
<content type='text'>
tldr; Just FYI, I'm carrying this on the perf tools tree.

Full explanation:

There used to be no copies, with tools/ code using kernel headers
directly. From time to time tools/perf/ broke due to legitimate kernel
hacking. At some point Linus complained about such direct usage. Then we
adopted the current model.

The way these headers are used in perf are not restricted to just
including them to compile something.

There are sometimes used in scripts that convert defines into string
tables, etc, so some change may break one of these scripts, or new MSRs
may use some different #define pattern, etc.

E.g.:

  $ ls -1 tools/perf/trace/beauty/*.sh | head -5
  tools/perf/trace/beauty/arch_errno_names.sh
  tools/perf/trace/beauty/drm_ioctl.sh
  tools/perf/trace/beauty/fadvise.sh
  tools/perf/trace/beauty/fsconfig.sh
  tools/perf/trace/beauty/fsmount.sh
  $
  $ tools/perf/trace/beauty/fadvise.sh
  static const char *fadvise_advices[] = {
        [0] = "NORMAL",
        [1] = "RANDOM",
        [2] = "SEQUENTIAL",
        [3] = "WILLNEED",
        [4] = "DONTNEED",
        [5] = "NOREUSE",
  };
  $

The tools/perf/check-headers.sh script, part of the tools/ build
process, points out changes in the original files.

So its important not to touch the copies in tools/ when doing changes in
the original kernel headers, that will be done later, when
check-headers.sh inform about the change to the perf tools hackers.

Cc: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt;
Cc: linux-mips@vger.kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20231121225650.390246-14-namhyung@kernel.org
</content>
</entry>
<entry>
<title>tools/perf: Update tools's copy of s390 syscall table</title>
<updated>2023-11-22T18:57:47+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2023-11-21T22:56:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d3968c974a2453de9289bdb9d34130b2ce323628'/>
<id>urn:sha1:d3968c974a2453de9289bdb9d34130b2ce323628</id>
<content type='text'>
tldr; Just FYI, I'm carrying this on the perf tools tree.

Full explanation:

There used to be no copies, with tools/ code using kernel headers
directly. From time to time tools/perf/ broke due to legitimate kernel
hacking. At some point Linus complained about such direct usage. Then we
adopted the current model.

The way these headers are used in perf are not restricted to just
including them to compile something.

There are sometimes used in scripts that convert defines into string
tables, etc, so some change may break one of these scripts, or new MSRs
may use some different #define pattern, etc.

E.g.:

  $ ls -1 tools/perf/trace/beauty/*.sh | head -5
  tools/perf/trace/beauty/arch_errno_names.sh
  tools/perf/trace/beauty/drm_ioctl.sh
  tools/perf/trace/beauty/fadvise.sh
  tools/perf/trace/beauty/fsconfig.sh
  tools/perf/trace/beauty/fsmount.sh
  $
  $ tools/perf/trace/beauty/fadvise.sh
  static const char *fadvise_advices[] = {
        [0] = "NORMAL",
        [1] = "RANDOM",
        [2] = "SEQUENTIAL",
        [3] = "WILLNEED",
        [4] = "DONTNEED",
        [5] = "NOREUSE",
  };
  $

The tools/perf/check-headers.sh script, part of the tools/ build
process, points out changes in the original files.

So its important not to touch the copies in tools/ when doing changes in
the original kernel headers, that will be done later, when
check-headers.sh inform about the change to the perf tools hackers.

Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@linux.ibm.com&gt;
Cc: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Cc: linux-s390@vger.kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20231121225650.390246-13-namhyung@kernel.org
</content>
</entry>
<entry>
<title>tools/perf: Update tools's copy of powerpc syscall table</title>
<updated>2023-11-22T18:57:47+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2023-11-21T22:56:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3483d2440538aa2575c3cddac0b7f42d488570cf'/>
<id>urn:sha1:3483d2440538aa2575c3cddac0b7f42d488570cf</id>
<content type='text'>
tldr; Just FYI, I'm carrying this on the perf tools tree.

Full explanation:

There used to be no copies, with tools/ code using kernel headers
directly. From time to time tools/perf/ broke due to legitimate kernel
hacking. At some point Linus complained about such direct usage. Then we
adopted the current model.

The way these headers are used in perf are not restricted to just
including them to compile something.

There are sometimes used in scripts that convert defines into string
tables, etc, so some change may break one of these scripts, or new MSRs
may use some different #define pattern, etc.

E.g.:

  $ ls -1 tools/perf/trace/beauty/*.sh | head -5
  tools/perf/trace/beauty/arch_errno_names.sh
  tools/perf/trace/beauty/drm_ioctl.sh
  tools/perf/trace/beauty/fadvise.sh
  tools/perf/trace/beauty/fsconfig.sh
  tools/perf/trace/beauty/fsmount.sh
  $
  $ tools/perf/trace/beauty/fadvise.sh
  static const char *fadvise_advices[] = {
        [0] = "NORMAL",
        [1] = "RANDOM",
        [2] = "SEQUENTIAL",
        [3] = "WILLNEED",
        [4] = "DONTNEED",
        [5] = "NOREUSE",
  };
  $

The tools/perf/check-headers.sh script, part of the tools/ build
process, points out changes in the original files.

So its important not to touch the copies in tools/ when doing changes in
the original kernel headers, that will be done later, when
check-headers.sh inform about the change to the perf tools hackers.

Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20231121225650.390246-12-namhyung@kernel.org
</content>
</entry>
<entry>
<title>tools/perf: Update tools's copy of x86 syscall table</title>
<updated>2023-11-22T18:57:47+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2023-11-21T22:56:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b3b11aed147af8f7c3d79c5b0b7474505d1dde7b'/>
<id>urn:sha1:b3b11aed147af8f7c3d79c5b0b7474505d1dde7b</id>
<content type='text'>
tldr; Just FYI, I'm carrying this on the perf tools tree.

Full explanation:

There used to be no copies, with tools/ code using kernel headers
directly. From time to time tools/perf/ broke due to legitimate kernel
hacking. At some point Linus complained about such direct usage. Then we
adopted the current model.

The way these headers are used in perf are not restricted to just
including them to compile something.

There are sometimes used in scripts that convert defines into string
tables, etc, so some change may break one of these scripts, or new MSRs
may use some different #define pattern, etc.

E.g.:

  $ ls -1 tools/perf/trace/beauty/*.sh | head -5
  tools/perf/trace/beauty/arch_errno_names.sh
  tools/perf/trace/beauty/drm_ioctl.sh
  tools/perf/trace/beauty/fadvise.sh
  tools/perf/trace/beauty/fsconfig.sh
  tools/perf/trace/beauty/fsmount.sh
  $
  $ tools/perf/trace/beauty/fadvise.sh
  static const char *fadvise_advices[] = {
        [0] = "NORMAL",
        [1] = "RANDOM",
        [2] = "SEQUENTIAL",
        [3] = "WILLNEED",
        [4] = "DONTNEED",
        [5] = "NOREUSE",
  };
  $

The tools/perf/check-headers.sh script, part of the tools/ build
process, points out changes in the original files.

So its important not to touch the copies in tools/ when doing changes in
the original kernel headers, that will be done later, when
check-headers.sh inform about the change to the perf tools hackers.

Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: x86@kernel.org
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/r/20231121225650.390246-11-namhyung@kernel.org
</content>
</entry>
<entry>
<title>Merge tag 'perf-tools-for-v6.7-1-2023-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools</title>
<updated>2023-11-03T18:17:38+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-11-03T18:17:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7ab89417ed235f56d84c7893d38d4905e38d2692'/>
<id>urn:sha1:7ab89417ed235f56d84c7893d38d4905e38d2692</id>
<content type='text'>
Pull perf tools updates from Namhyung Kim:
 "Build:

   - Compile BPF programs by default if clang (&gt;= 12.0.1) is available
     to enable more features like kernel lock contention, off-cpu
     profiling, kwork, sample filtering and so on.

     This can be disabled by passing BUILD_BPF_SKEL=0 to make.

   - Produce better error messages for bison on debug build (make
     DEBUG=1) by defining YYDEBUG symbol internally.

  perf record:

   - Track sideband events (like FORK/MMAP) from all CPUs even if perf
     record targets a subset of CPUs only (using -C option). Otherwise
     it may lose some information happened on a CPU out of the target
     list.

   - Fix checking raw sched_switch tracepoint argument using system BTF.
     This affects off-cpu profiling which attaches a BPF program to the
     raw tracepoint.

  perf lock contention:

   - Add --lock-cgroup option to see contention by cgroups. This should
     be used with BPF only (using -b option).

       $ sudo perf lock con -ab --lock-cgroup -- sleep 1
        contended   total wait     max wait     avg wait   cgroup

              835     14.06 ms     41.19 us     16.83 us   /system.slice/led.service
               25    122.38 us     13.77 us      4.89 us   /
               44     23.73 us      3.87 us       539 ns   /user.slice/user-657345.slice/session-c4.scope
                1       491 ns       491 ns       491 ns   /system.slice/connectd.service

   - Add -G/--cgroup-filter option to see contention only for given
     cgroups.

     This can be useful when you identified a cgroup in the above
     command and want to investigate more on it. It also works with
     other output options like -t/--threads and -l/--lock-addr.

       $ sudo perf lock con -ab -G /user.slice/user-657345.slice/session-c4.scope -- sleep 1
        contended   total wait     max wait     avg wait         type   caller

                8     77.11 us     17.98 us      9.64 us     spinlock   futex_wake+0xc8
                2     24.56 us     14.66 us     12.28 us     spinlock   tick_do_update_jiffies64+0x25
                1      4.97 us      4.97 us      4.97 us     spinlock   futex_q_lock+0x2a

   - Use per-cpu array for better spinlock tracking. This is to improve
     performance of the BPF program and to avoid nested contention on a
     lock in the BPF hash map.

   - Update callstack check for PowerPC. To find a representative caller
     of a lock, it needs to look up the call stacks. It ends the lookup
     when it sees 0 in the call stack buffer. However, PowerPC call
     stacks can have 0 values in the beginning so skip them when it
     expects valid call stacks after.

  perf kwork:

   - Support 'sched' class (for -k option) so that it can see task
     scheduling event (using sched_switch tracepoint) as well as irq and
     workqueue items.

   - Add perf kwork top subcommand to show more accurate cpu utilization
     with sched class above. It works both with a recorded data (using
     perf kwork record command) and BPF (using -b option). Unlike perf
     top command, it does not support interactive mode (yet).

       $ sudo perf kwork top -b -k sched
       Starting trace, Hit &lt;Ctrl+C&gt; to stop and report
       ^C
       Total  : 160702.425 ms, 8 cpus
       %Cpu(s):  36.00% id,   0.00% hi,   0.00% si
       %Cpu0   [||||||||||||||||||              61.66%]
       %Cpu1   [||||||||||||||||||              61.27%]
       %Cpu2   [|||||||||||||||||||             66.40%]
       %Cpu3   [||||||||||||||||||              61.28%]
       %Cpu4   [||||||||||||||||||              61.82%]
       %Cpu5   [|||||||||||||||||||||||         77.41%]
       %Cpu6   [||||||||||||||||||              61.73%]
       %Cpu7   [||||||||||||||||||              63.25%]

             PID     SPID    %CPU           RUNTIME  COMMMAND
         -------------------------------------------------------------
               0        0   38.72       8089.463 ms  [swapper/1]
               0        0   38.71       8084.547 ms  [swapper/3]
               0        0   38.33       8007.532 ms  [swapper/0]
               0        0   38.26       7992.985 ms  [swapper/6]
               0        0   38.17       7971.865 ms  [swapper/4]
               0        0   36.74       7447.765 ms  [swapper/7]
               0        0   33.59       6486.942 ms  [swapper/2]
               0        0   22.58       3771.268 ms  [swapper/5]
            9545     9351    2.48        447.136 ms  sched-messaging
            9574     9351    2.09        418.583 ms  sched-messaging
            9724     9351    2.05        372.407 ms  sched-messaging
            9531     9351    2.01        368.804 ms  sched-messaging
            9512     9351    2.00        362.250 ms  sched-messaging
            9514     9351    1.95        357.767 ms  sched-messaging
            9538     9351    1.86        384.476 ms  sched-messaging
            9712     9351    1.84        386.490 ms  sched-messaging
            9723     9351    1.83        380.021 ms  sched-messaging
            9722     9351    1.82        382.738 ms  sched-messaging
            9517     9351    1.81        354.794 ms  sched-messaging
            9559     9351    1.79        344.305 ms  sched-messaging
            9725     9351    1.77        365.315 ms  sched-messaging
       &lt;SNIP&gt;

   - Add hard/soft-irq statistics to perf kwork top. This will show the
     total CPU utilization with IRQ stats like below:

       $ sudo perf kwork top -b -k sched,irq,softirq
       Starting trace, Hit &lt;Ctrl+C&gt; to stop and report
       ^C
       Total  :  12554.889 ms, 8 cpus
       %Cpu(s):  96.23% id,   0.10% hi,   0.19% si      &lt;---- here
       %Cpu0   [|                                4.60%]
       %Cpu1   [|                                4.59%]
       %Cpu2   [                                 2.73%]
       %Cpu3   [|                                3.81%]
       &lt;SNIP&gt;

  perf bench:

   - Add -G/--cgroups option to perf bench sched pipe. The pipe bench is
     good to measure context switch overhead. With this option, it puts
     the reader and writer tasks in separate cgroups to enforce context
     switch between two different cgroups.

     Also it needs to set CPU affinity of the tasks in a CPU to
     accurately measure the impact of cgroup context switches.

       $ sudo perf stat -e context-switches,cgroup-switches -- \
       &gt; taskset -c 0 perf bench sched pipe -l 100000
       # Running 'sched/pipe' benchmark:
       # Executed 100000 pipe operations between two processes

            Total time: 0.307 [sec]

              3.078180 usecs/op
                324867 ops/sec

        Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000':

                  200,026      context-switches
                       63      cgroup-switches

              0.321637922 seconds time elapsed

     You can see small number of cgroup-switches because both write and
     read tasks are in the same cgroup.

       $ sudo mkdir /sys/fs/cgroup/{AAA,BBB}

       $ sudo perf stat -e context-switches,cgroup-switches -- \
       &gt; taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB
       # Running 'sched/pipe' benchmark:
       # Executed 100000 pipe operations between two processes

            Total time: 0.351 [sec]

              3.512990 usecs/op
                284657 ops/sec

        Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB':

                  200,020      context-switches
                  200,019      cgroup-switches

              0.365034567 seconds time elapsed

     Now context-switches and cgroup-switches are almost same. And you
     can see the pipe operation took little more.

   - Kill child processes when perf bench sched messaging exited
     abnormally. Otherwise it'd leave the child doing unnecessary work.

  perf test:

   - Fix various shellcheck issues on the tests written in shell script.

   - Skip tests when condition is not satisfied:
      - object code reading test for non-text section addresses.
      - CoreSight test if cs_etm// event is not available.
      - lock contention test if not enough CPUs.

  Event parsing:

   - Make PMU alias name loading lazy to reduce the startup time in the
     event parsing code for perf record, stat and others in the general
     case.

   - Lazily compute PMU default config. In the same sense, delay PMU
     initialization until it's really needed to reduce the startup cost.

   - Fix event term values that are raw events. The event specification
     can have several terms including event name. But sometimes it
     clashes with raw event encoding which starts with 'r' and has
     hex-digits.

     For example, an event named 'read' should be processed as a normal
     event but it was mis-treated as a raw encoding and caused a
     failure.

       $ perf stat -e 'uncore_imc_free_running/event=read/' -a sleep 1
       event syntax error: '..nning/event=read/'
                                         \___ parser error
       Run 'perf list' for a list of valid events

        Usage: perf stat [&lt;options&gt;] [&lt;command&gt;]

           -e, --event &lt;event&gt; event selector. use 'perf list' to list available events

  Event metrics:

   - Add "Compat" regex to match event with multiple identifiers.

   - Usual updates for Intel, Power10, Arm telemetry/CMN and AmpereOne.

  Misc:

   - Assorted memory leak fixes and footprint reduction.

   - Add "bpf_skeletons" to perf version --build-options so that users
     can check whether their perf tools have BPF support easily.

   - Fix unaligned access in Intel-PT packet decoder found by
     undefined-behavior sanitizer.

   - Avoid frequency mode for the dummy event. Surprisingly it'd impact
     kernel timer tick handler performance by force iterating all PMU
     events.

   - Update bash shell completion for events and metrics"

* tag 'perf-tools-for-v6.7-1-2023-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (187 commits)
  perf vendor events intel: Update tsx_cycles_per_elision metrics
  perf vendor events intel: Update bonnell version number to v5
  perf vendor events intel: Update westmereex events to v4
  perf vendor events intel: Update meteorlake events to v1.06
  perf vendor events intel: Update knightslanding events to v16
  perf vendor events intel: Add typo fix for ivybridge FP
  perf vendor events intel: Update a spelling in haswell/haswellx
  perf vendor events intel: Update emeraldrapids to v1.01
  perf vendor events intel: Update alderlake/alderlake events to v1.23
  perf build: Disable BPF skeletons if clang version is &lt; 12.0.1
  perf callchain: Fix spelling mistake "statisitcs" -&gt; "statistics"
  perf report: Fix spelling mistake "heirachy" -&gt; "hierarchy"
  perf python: Fix binding linkage due to rename and move of evsel__increase_rlimit()
  perf tests: test_arm_coresight: Simplify source iteration
  perf vendor events intel: Add tigerlake two metrics
  perf vendor events intel: Add broadwellde two metrics
  perf vendor events intel: Fix broadwellde tma_info_system_dram_bw_use metric
  perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit
  perf callchain: Minor layout changes to callchain_list
  perf callchain: Make brtype_stat in callchain_list optional
  ...
</content>
</entry>
<entry>
<title>Merge tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic</title>
<updated>2023-11-02T01:28:33+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-11-02T01:28:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1e0c505e13162a2abe7c984309cfe2ae976b428d'/>
<id>urn:sha1:1e0c505e13162a2abe7c984309cfe2ae976b428d</id>
<content type='text'>
Pull ia64 removal and asm-generic updates from Arnd Bergmann:

 - The ia64 architecture gets its well-earned retirement as planned,
   now that there is one last (mostly) working release that will be
   maintained as an LTS kernel.

 - The architecture specific system call tables are updated for the
   added map_shadow_stack() syscall and to remove references to the
   long-gone sys_lookup_dcookie() syscall.

* tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  hexagon: Remove unusable symbols from the ptrace.h uapi
  asm-generic: Fix spelling of architecture
  arch: Reserve map_shadow_stack() syscall number for all architectures
  syscalls: Cleanup references to sys_lookup_dcookie()
  Documentation: Drop or replace remaining mentions of IA64
  lib/raid6: Drop IA64 support
  Documentation: Drop IA64 from feature descriptions
  kernel: Drop IA64 support from sig_fault handlers
  arch: Remove Itanium (IA-64) architecture
</content>
</entry>
<entry>
<title>tools/perf/arch/powerpc: Fix the CPU ID const char* value by adding 0x prefix</title>
<updated>2023-10-17T19:40:51+00:00</updated>
<author>
<name>Athira Rajeev</name>
<email>atrajeev@linux.vnet.ibm.com</email>
</author>
<published>2023-10-09T05:00:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f6a66ff98ac1cfbfb83ec2445f5652ae065a1bf0'/>
<id>urn:sha1:f6a66ff98ac1cfbfb83ec2445f5652ae065a1bf0</id>
<content type='text'>
Simple expression parser test fails in powerpc as below:

    4: Simple expression parser
    test child forked, pid 170385
    Using CPUID 004e2102
    division by zero
    syntax error
    syntax error
    FAILED tests/expr.c:65 parse test failed
    test child finished with -1
    Simple expression parser: FAILED!

This is observed after commit:
'commit 9d5da30e4ae9 ("perf jevents: Add a new expression builtin strcmp_cpuid_str()")'

With this commit, a new expression builtin strcmp_cpuid_str
got added. This function takes an 'ID' type value, which is
a string. So expression parse for strcmp_cpuid_str expects
const char * as cpuid value type. In case of powerpc, CPU IDs
are numbers. Hence it doesn't get interpreted correctly by
bison parser. Example in case of power9, cpuid string returns
as: 004e2102

cpuid of string type is expected in two cases:
1. char *get_cpuid_str(struct perf_pmu *pmu __maybe_unused);

   Testcase "tests/expr.c" uses "perf_pmu__getcpuid" which calls
   get_cpuid_str to get the cpuid string.

2. cpuid field in  :struct pmu_events_map

   struct pmu_events_map {
           const char *arch;
	   const char *cpuid;

   Here cpuid field is used in "perf_pmu__find_events_table"
   function as "strcmp_cpuid_str(map-&gt;cpuid, cpuid)". The
   value for cpuid field is picked from mapfile.csv.

Fix the mapfile.csv and get_cpuid_str function to prefix
cpuid with 0x so that it gets correctly interpreted by
the bison parser

Signed-off-by: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Tested-by: Disha Goel&lt;disgoel@linux.ibm.com&gt;
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20231009050052.64935-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf cs-etm: Respect timestamp option</title>
<updated>2023-10-17T19:40:51+00:00</updated>
<author>
<name>Leo Yan</name>
<email>leo.yan@linaro.org</email>
</author>
<published>2023-10-14T07:41:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=78efa7b411450a534e9d326cdf5578f681c73988'/>
<id>urn:sha1:78efa7b411450a534e9d326cdf5578f681c73988</id>
<content type='text'>
When users pass the option '--timestamp' or '-T' in the record command,
all events will set the PERF_SAMPLE_TIME bit in the attribution.  In
this case, the AUX event will record the kernel timestamp, but it
doesn't mean Arm CoreSight enables timestamp packets in its hardware
tracing.

If the option '--timestamp' or '-T' is set, this patch always enables
Arm CoreSight timestamp, as a result, the bit 28 in event's config is to
be set.

Before:

  # perf record -e cs_etm// --per-thread --timestamp -- ls
  # perf script --header-only
  ...
  # event : name = cs_etm//, , id = { 69 }, type = 12, size = 136,
  config = 0, { sample_period, sample_freq } = 1,
  sample_type = IP|TID|TIME|CPU|IDENTIFIER, read_format = ID|LOST,
  disabled = 1, enable_on_exec = 1, sample_id_all = 1, exclude_guest = 1
  ...

After:

  # perf record -e cs_etm// --per-thread --timestamp -- ls
  # perf script --header-only
  ...
  # event : name = cs_etm//, , id = { 49 }, type = 12, size = 136,
  config = 0x10000000, { sample_period, sample_freq } = 1,
  sample_type = IP|TID|TIME|CPU|IDENTIFIER, read_format = ID|LOST,
  disabled = 1, enable_on_exec = 1, sample_id_all = 1, exclude_guest = 1
  ...

Signed-off-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Reviewed-by: James Clark &lt;james.clark@arm.com&gt;
Acked-by: Suzuki K Poulose &lt;suzuki.poulose@arm.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: Mike Leach &lt;mike.leach@linaro.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: John Garry &lt;john.g.garry@oracle.com&gt;
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20231014074159.1667880-3-leo.yan@linaro.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf cs-etm: Validate timestamp tracing in per-thread mode</title>
<updated>2023-10-17T19:40:51+00:00</updated>
<author>
<name>Leo Yan</name>
<email>leo.yan@linaro.org</email>
</author>
<published>2023-10-14T07:41:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f8ccc2d5cc6516b019bcf8e361ae2a380cb36019'/>
<id>urn:sha1:f8ccc2d5cc6516b019bcf8e361ae2a380cb36019</id>
<content type='text'>
So far, it's impossible to validate timestamp trace in Arm CoreSight when
the perf is in the per-thread mode.  E.g. for the command:

  perf record -e cs_etm/timestamp/ --per-thread -- ls

The command enables config 'timestamp' for 'cs_etm' event in the
per-thread mode.  In this case, the function cs_etm_validate_config()
directly bails out and skips validation.

Given profiled process can be scheduled on any CPUs in the per-thread
mode, this patch validates timestamp tracing for all CPUs when detect
the CPU map is empty.

Signed-off-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Reviewed-by: James Clark &lt;james.clark@arm.com&gt;
Acked-by: Suzuki K Poulose &lt;suzuki.poulose@arm.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: Mike Leach &lt;mike.leach@linaro.org&gt;
Cc: John Garry &lt;john.g.garry@oracle.com&gt;
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20231014074159.1667880-2-leo.yan@linaro.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf pmu: Lazily compute default config</title>
<updated>2023-10-17T19:40:50+00:00</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2023-10-12T17:56:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0197da7affab502cd6e25da616ad038b169a7a77'/>
<id>urn:sha1:0197da7affab502cd6e25da616ad038b169a7a77</id>
<content type='text'>
The default config is computed during creation of the PMU and may do
things like scanning sysfs, when the PMU may just be used as part of
scanning. Change default_config to perf_event_attr_init_default, a
callback that is used when a default config needs initializing. This
avoids holding onto the memory for a perf_event_attr and copying.

On a tigerlake laptop running the pmu-scan benchmark:

Before:
Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 28.780 usec (+- 0.503 usec)
  Average PMU scanning took: 283.480 usec (+- 18.471 usec)
Number of openat syscalls: 30,227

After:
Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 27.880 usec (+- 0.169 usec)
  Average PMU scanning took: 245.260 usec (+- 15.758 usec)
Number of openat syscalls: 28,914

Over 3 runs it is a nearly 12% reduction in execution time and a 4.3%
of openat calls.

Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Reviewed-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ravi Bangoria &lt;ravi.bangoria@amd.com&gt;
Cc: James Clark &lt;james.clark@arm.com&gt;
Cc: Suzuki K Poulose &lt;suzuki.poulose@arm.com&gt;
Cc: Yang Jihong &lt;yangjihong1@huawei.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Mike Leach &lt;mike.leach@linaro.org&gt;
Cc: Jing Zhang &lt;renyu.zj@linux.alibaba.com&gt;
Cc: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Cc: Thomas Richter &lt;tmricht@linux.ibm.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: John Garry &lt;john.g.garry@oracle.com&gt;
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20231012175645.1849503-8-irogers@google.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
</content>
</entry>
</feed>
