<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/perf/arm_pmu.c, branch linux-7.1.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-7.1.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-7.1.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-02-22T01:09:51+00:00</updated>
<entry>
<title>Convert 'alloc_obj' family to use the new default GFP_KERNEL argument</title>
<updated>2026-02-22T01:09:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T00:37:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43'/>
<id>urn:sha1:bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43</id>
<content type='text'>
This was done entirely with mindless brute force, using

    git grep -l '\&lt;k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>treewide: Replace kmalloc with kmalloc_obj for non-scalar types</title>
<updated>2026-02-21T09:02:28+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-21T07:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=69050f8d6d075dc01af7a5f2f550a8067510366f'/>
<id>urn:sha1:69050f8d6d075dc01af7a5f2f550a8067510366f</id>
<content type='text'>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux</title>
<updated>2025-12-03T01:03:55+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-12-03T01:03:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=44fc84337b6eae580a51cf6f7ca6a22ef1349556'/>
<id>urn:sha1:44fc84337b6eae580a51cf6f7ca6a22ef1349556</id>
<content type='text'>
Pull arm64 updates from Catalin Marinas:
 "These are the arm64 updates for 6.19.

  The biggest part is the Arm MPAM driver under drivers/resctrl/.
  There's a patch touching mm/ to handle spurious faults for huge pmd
  (similar to the pte version). The corresponding arm64 part allows us
  to avoid the TLB maintenance if a (huge) page is reused after a write
  fault. There's EFI refactoring to allow runtime services with
  preemption enabled and the rest is the usual perf/PMU updates and
  several cleanups/typos.

  Summary:

  Core features:

   - Basic Arm MPAM (Memory system resource Partitioning And Monitoring)
     driver under drivers/resctrl/ which makes use of the fs/rectrl/ API

  Perf and PMU:

   - Avoid cycle counter on multi-threaded CPUs

   - Extend CSPMU device probing and add additional filtering support
     for NVIDIA implementations

   - Add support for the PMUs on the NoC S3 interconnect

   - Add additional compatible strings for new Cortex and C1 CPUs

   - Add support for data source filtering to the SPE driver

   - Add support for i.MX8QM and "DB" PMU in the imx PMU driver

  Memory managemennt:

   - Avoid broadcast TLBI if page reused in write fault

   - Elide TLB invalidation if the old PTE was not valid

   - Drop redundant cpu_set_*_tcr_t0sz() macros

   - Propagate pgtable_alloc() errors outside of __create_pgd_mapping()

   - Propagate return value from __change_memory_common()

  ACPI and EFI:

   - Call EFI runtime services without disabling preemption

   - Remove unused ACPI function

  Miscellaneous:

   - ptrace support to disable streaming on SME-only systems

   - Improve sysreg generation to include a 'Prefix' descriptor

   - Replace __ASSEMBLY__ with __ASSEMBLER__

   - Align register dumps in the kselftest zt-test

   - Remove some no longer used macros/functions

   - Various spelling corrections"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits)
  arm64/mm: Document why linear map split failure upon vm_reset_perms is not problematic
  arm64/pageattr: Propagate return value from __change_memory_common
  arm64/sysreg: Remove unused define ARM64_FEATURE_FIELD_BITS
  KVM: arm64: selftests: Consider all 7 possible levels of cache
  KVM: arm64: selftests: Remove ARM64_FEATURE_FIELD_BITS and its last user
  arm64: atomics: lse: Remove unused parameters from ATOMIC_FETCH_OP_AND macros
  Documentation/arm64: Fix the typo of register names
  ACPI: GTDT: Get rid of acpi_arch_timer_mem_init()
  perf: arm_spe: Add support for filtering on data source
  perf: Add perf_event_attr::config4
  perf/imx_ddr: Add support for PMU in DB (system interconnects)
  perf/imx_ddr: Get and enable optional clks
  perf/imx_ddr: Move ida_alloc() from ddr_perf_init() to ddr_perf_probe()
  dt-bindings: perf: fsl-imx-ddr: Add compatible string for i.MX8QM, i.MX8QXP and i.MX8DXL
  arm64: remove duplicate ARCH_HAS_MEM_ENCRYPT
  arm64: mm: use untagged address to calculate page index
  MAINTAINERS: new entry for MPAM Driver
  arm_mpam: Add kunit tests for props_mismatch()
  arm_mpam: Add kunit test for bitmap reset
  arm_mpam: Add helper to reset saved mbwu state
  ...
</content>
</entry>
<entry>
<title>perf: arm_pmuv3: Don't use PMCCNTR_EL0 on SMT cores</title>
<updated>2025-11-03T13:28:48+00:00</updated>
<author>
<name>Yicong Yang</name>
<email>yangyicong@hisilicon.com</email>
</author>
<published>2025-09-22T03:30:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c3d78c34ad009a7cce57ae5b5c93e1bd03bb31a3'/>
<id>urn:sha1:c3d78c34ad009a7cce57ae5b5c93e1bd03bb31a3</id>
<content type='text'>
CPU_CYCLES is expected to count the logical CPU (PE) clock. Currently it's
preferred to use PMCCNTR_EL0 for counting CPU_CYCLES, but it'll count
processor clock rather than the PE clock (ARM DDI0487 L.b D13.1.3) if
one of the SMT siblings is not idle on a multi-threaded implementation.
So don't use it on SMT cores.

Introduce topology_core_has_smt() for knowing the SMT implementation and
cached it in arm_pmu::has_smt during allocation.

When counting cycles on SMT CPU 2-3 and CPU 3 is idle, without this
patch we'll get:
[root@client1 tmp]# perf stat -e cycles -A -C 2-3 -- stress-ng -c 1
--taskset 2 --timeout 1
[...]
 Performance counter stats for 'CPU(s) 2-3':

CPU2           2880457316      cycles
CPU3           2880459810      cycles
       1.254688470 seconds time elapsed

With this patch the idle state of CPU3 is observed as expected:
[root@client1 ~]#  perf stat -e cycles -A -C 2-3 -- stress-ng -c 1
--taskset 2 --timeout 1
[...]
 Performance counter stats for 'CPU(s) 2-3':

CPU2           2558580492      cycles
CPU3               305749      cycles
       1.113626410 seconds time elapsed

Signed-off-by: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf: arm_pmu: Kill last use of per-CPU cpu_armpmu pointer</title>
<updated>2025-10-27T16:16:37+00:00</updated>
<author>
<name>Marc Zyngier</name>
<email>maz@kernel.org</email>
</author>
<published>2025-10-20T12:29:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fa9d2777387346645a40ab37cfb0c37b3ef40cc9'/>
<id>urn:sha1:fa9d2777387346645a40ab37cfb0c37b3ef40cc9</id>
<content type='text'>
Having removed the use of the cpu_armpmu per-CPU variable from the
interrupt handling, the only user left is the BRBE scheduler hook.

It is easy to drop the use of this variable by following the pointer to the
generic PMU structure, and get the arm_pmu structure from there.

Perform the conversion and kill cpu_armpmu altogether.

Suggested-by: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Marc Zyngier &lt;maz@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Will Deacon &lt;will@kernel.org&gt;
Link: https://patch.msgid.link/20251020122944.3074811-27-maz@kernel.org
</content>
</entry>
<entry>
<title>perf: arm_pmu: Request specific affinities for per CPU NMIs/interrupts</title>
<updated>2025-10-27T16:16:35+00:00</updated>
<author>
<name>Will Deacon</name>
<email>will@kernel.org</email>
</author>
<published>2025-10-20T12:29:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=54b350fa8e965dc59622698e2a18d6bf73944bf4'/>
<id>urn:sha1:54b350fa8e965dc59622698e2a18d6bf73944bf4</id>
<content type='text'>
Let the PMU driver request both NMIs and normal interrupts with an affinity mask
matching the PMU affinity.

Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Marc Zyngier &lt;maz@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Will Deacon &lt;will@kernel.org&gt;
Link: https://patch.msgid.link/20251020122944.3074811-19-maz@kernel.org
</content>
</entry>
<entry>
<title>genirq: Update request_percpu_nmi() to take an affinity</title>
<updated>2025-10-27T16:16:35+00:00</updated>
<author>
<name>Marc Zyngier</name>
<email>maz@kernel.org</email>
</author>
<published>2025-10-20T12:29:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b9c6aa9efc71dae656f9f913d1250ea08cd6e10f'/>
<id>urn:sha1:b9c6aa9efc71dae656f9f913d1250ea08cd6e10f</id>
<content type='text'>
Continue spreading the notion of affinity to the per CPU interrupt request
code by updating the call sites that use request_percpu_nmi() (all two of
them) to take an affinity pointer. This pointer is firmly NULL for now.

Signed-off-by: Marc Zyngier &lt;maz@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Will Deacon &lt;will@kernel.org&gt;
Link: https://patch.msgid.link/20251020122944.3074811-16-maz@kernel.org
</content>
</entry>
<entry>
<title>perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)</title>
<updated>2025-07-08T16:58:49+00:00</updated>
<author>
<name>Rob Herring (Arm)</name>
<email>robh@kernel.org</email>
</author>
<published>2025-06-11T18:01:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=58074a0fce66c6c97b35ce8a28ed4e7b780f9a8f'/>
<id>urn:sha1:58074a0fce66c6c97b35ce8a28ed4e7b780f9a8f</id>
<content type='text'>
The ARMv9.2 architecture introduces the optional Branch Record Buffer
Extension (BRBE), which records information about branches as they are
executed into set of branch record registers. BRBE is similar to x86's
Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
(BHRB).

BRBE supports filtering by exception level and can filter just the
source or target address if excluded to avoid leaking privileged
addresses. The h/w filter would be sufficient except when there are
multiple events with disjoint filtering requirements. In this case, BRBE
is configured with a union of all the events' desired branches, and then
the recorded branches are filtered based on each event's filter. For
example, with one event capturing kernel events and another event
capturing user events, BRBE will be configured to capture both kernel
and user branches. When handling event overflow, the branch records have
to be filtered by software to only include kernel or user branch
addresses for that event. In contrast, x86 simply configures LBR using
the last installed event which seems broken.

It is possible on x86 to configure branch filter such that no branches
are ever recorded (e.g. -j save_type). For BRBE, events with a
configuration that will result in no samples are rejected.

Recording branches in KVM guests is not supported like x86. However,
perf on x86 allows requesting branch recording in guests. The guest
events are recorded, but the resulting branches are all from the host.
For BRBE, events with branch recording and "exclude_host" set are
rejected. Requiring "exclude_guest" to be set did not work. The default
for the perf tool does set "exclude_guest" if no exception level
options are specified. However, specifying kernel or user events
defaults to including both host and guest. In this case, only host
branches are recorded.

BRBE can support some additional exception branch types compared to
x86. On x86, all exceptions other than syscalls are recorded as IRQ.
With BRBE, it is possible to better categorize these exceptions. One
limitation relative to x86 is we cannot distinguish a syscall return
from other exception returns. So all exception returns are recorded as
ERET type. The FIQ branch type is omitted as the only FIQ user is Apple
platforms which don't support BRBE. The debug branch types are omitted
as there is no clear need for them.

BRBE records are invalidated whenever events are reconfigured, a new
task is scheduled in, or after recording is paused (and the records
have been recorded for the event). The architecture allows branch
records to be invalidated by the PE under implementation defined
conditions. It is expected that these conditions are rare.

Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Co-developed-by: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Signed-off-by: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Co-developed-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Tested-by: James Clark &lt;james.clark@linaro.org&gt;
Signed-off-by: Rob Herring (Arm) &lt;robh@kernel.org&gt;
tested-by: Adam Young &lt;admiyo@os.amperecomputing.com&gt;
Acked-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Link: https://lore.kernel.org/r/20250611-arm-brbe-v19-v23-4-e7775563036e@kernel.org
[will: Fix sparse warnings about mixed declarations and code.
       Fix C99 comment syntax.]
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf: arm_pmu: Don't disable counter in armpmu_add()</title>
<updated>2025-03-01T05:08:10+00:00</updated>
<author>
<name>Mark Rutland</name>
<email>mark.rutland@arm.com</email>
</author>
<published>2025-02-18T20:39:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dcca27bc1eccb9abc2552aab950b18a9742fb8e7'/>
<id>urn:sha1:dcca27bc1eccb9abc2552aab950b18a9742fb8e7</id>
<content type='text'>
Currently armpmu_add() tries to handle a newly-allocated counter having
a stale associated event, but this should not be possible, and if this
were to happen the current mitigation is insufficient and potentially
expensive. It would be better to warn if we encounter the impossible
case.

Calls to pmu::add() and pmu::del() are serialized by the core perf code,
and armpmu_del() clears the relevant slot in pmu_hw_events::events[]
before clearing the bit in pmu_hw_events::used_mask such that the
counter can be reallocated. Thus when armpmu_add() allocates a counter
index from pmu_hw_events::used_mask, it should not be possible to observe
a stale even in pmu_hw_events::events[] unless either
pmu_hw_events::used_mask or pmu_hw_events::events[] have been corrupted.

If this were to happen, we'd end up with two events with the same
event-&gt;hw.idx, which would clash with each other during reprogramming,
deletion, etc, and produce bogus results. Add a WARN_ON_ONCE() for this
case so that we can detect if this ever occurs in practice.

That possiblity aside, there's no need to call arm_pmu::disable(event)
for the new event. The PMU reset code initialises the counter in a
disabled state, and armpmu_del() will disable the counter before it can
be reused. Remove the redundant disable.

Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Rob Herring (Arm) &lt;robh@kernel.org&gt;
Reviewed-by: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Tested-by: James Clark &lt;james.clark@linaro.org&gt;
Link: https://lore.kernel.org/r/20250218-arm-brbe-v19-v20-2-4e9922fc2e8e@kernel.org
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf: arm_pmu: Remove event index to counter remapping</title>
<updated>2024-08-16T12:09:11+00:00</updated>
<author>
<name>Rob Herring (Arm)</name>
<email>robh@kernel.org</email>
</author>
<published>2024-07-31T16:51:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bf5ffc8c80e0cf5205849cd0c9c3cb261d2beee6'/>
<id>urn:sha1:bf5ffc8c80e0cf5205849cd0c9c3cb261d2beee6</id>
<content type='text'>
Xscale and Armv6 PMUs defined the cycle counter at 0 and event counters
starting at 1 and had 1:1 event index to counter numbering. On Armv7 and
later, this changed the cycle counter to 31 and event counters start at
0. The drivers for Armv7 and PMUv3 kept the old event index numbering
and introduced an event index to counter conversion. The conversion uses
masking to convert from event index to a counter number. This operation
relies on having at most 32 counters so that the cycle counter index 0
can be transformed to counter number 31.

Armv9.4 adds support for an additional fixed function counter
(instructions) which increases possible counters to more than 32, and
the conversion won't work anymore as a simple subtract and mask. The
primary reason for the translation (other than history) seems to be to
have a contiguous mask of counters 0-N. Keeping that would result in
more complicated index to counter conversions. Instead, store a mask of
available counters rather than just number of events. That provides more
information in addition to the number of events.

No (intended) functional changes.

Acked-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Rob Herring (Arm) &lt;robh@kernel.org&gt;
Tested-by: James Clark &lt;james.clark@linaro.org&gt;
Link: https://lore.kernel.org/r/20240731-arm-pmu-3-9-icntr-v3-1-280a8d7ff465@kernel.org
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
</entry>
</feed>
