path: root/drivers/perf
Age | Commit message | Author | Files | Lines
2025-07-14 | drivers/perf: hisi: Support PMUs with no interrupt | Yicong Yang | 1 | -3/+8
Some PMUs have no interrupt to indicate counter overflow, but the Uncore PMU core assumes that every PMU has one. Handle this case in the core. The existing PMUs won't be affected. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-7-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 | drivers/perf: hisi: Relax the event number check of v2 PMUs | Junhao He | 4 | -6/+6
The supported event number range of each Uncore PMU is provided by its driver in hisi_pmu::check_event, and out-of-range events are rejected. A later hardware version with an expanded event number range needs to register the PMU with an updated hisi_pmu::check_event even if that is the only change, which means the expanded events cannot be used until the driver is updated. However, unsupported events simply won't be counted by the hardware, so we can relax the event number check to allow use of the expanded events. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-6-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 | drivers/perf: hisi: Add support for HiSilicon SLLC v3 PMU driver | Junhao He | 1 | -0/+40
SLLC v3 PMU has the following changes compared to the previous version: a) update the register layout b) update the definition of SRCID_CTRL and TGTID_CTRL registers. To be compatible with v2, we use the maximum width (11 bits) and mask off the extra bits for each version. c) remove latency events (driver does not need to be adapted). SLLC v3 PMU is identified with HID HISI0264. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-5-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 | drivers/perf: hisi: Use ACPI driver_data to retrieve SLLC PMU information | Junhao He | 1 | -60/+118
Make use of struct acpi_device_id::driver_data for version-specific information rather than reading the version register. This simplifies the probe process and makes the driver easier to extend. Factor out the SLLC register definitions into struct hisi_sllc_pmu_regs. No functional changes intended. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-4-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
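The table-driven approach described in this commit can be illustrated with a small user-space sketch. It is not the driver code: `struct dev_id`, `struct pmu_regs`, `lookup_regs()`, the placeholder HIDs and all register offsets are hypothetical and exist only to show how per-version data can hang off an ID table entry instead of being derived from a version register at probe time.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-version register layout; the offsets are made up. */
struct pmu_regs {
	unsigned int int_mask;
	unsigned int int_status;
	unsigned int version;
};

/* Hypothetical ID table entry carrying a per-entry data pointer. */
struct dev_id {
	const char *hid;
	const struct pmu_regs *data;
};

static const struct pmu_regs regs_v2 = { 0x1c10, 0x1c14, 2 };
static const struct pmu_regs regs_v3 = { 0x3834, 0x3838, 3 };

/* Placeholder HIDs, not real ACPI IDs. */
static const struct dev_id id_table[] = {
	{ "EXAMPLE0002", &regs_v2 },
	{ "EXAMPLE0003", &regs_v3 },
	{ NULL, NULL }, /* sentinel */
};

/* Return the version-specific data for a HID, or NULL if it is unknown. */
static const struct pmu_regs *lookup_regs(const char *hid)
{
	for (const struct dev_id *id = id_table; id->hid; id++) {
		if (strcmp(id->hid, hid) == 0)
			return id->data;
	}
	return NULL;
}
```

With this shape, supporting a new hardware revision in the sketch means adding one table row and one `struct pmu_regs` instance, with no new branching in the probe path.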
2025-07-14 | drivers/perf: hisi: Add support for HiSilicon DDRC v3 PMU driver | Junhao He | 1 | -0/+24
The HiSilicon DDRC v3 PMU has a different interrupt register offset from v2. Add device information for the v3 PMU with ACPI HID HISI0235. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 | drivers/perf: hisi: Simplify the probe process for each DDRC version | Junhao He | 2 | -188/+142
Versions 1 and 2 of the DDRC PMU also use different HIDs. Make use of struct acpi_device_id::driver_data for version-specific information rather than reading the version register. This simplifies the probe process and makes the driver easier to extend. To support this, extend struct hisi_pmu_dev_info with version-specific counter bits and event range. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250619125557.57372-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 | perf/arm-ni: Support sharing IRQs within an NI instance | Shouping Wang | 1 | -27/+55
NI-700 has a distinct PMU interrupt output for each Clock Domain, however some integrations may still combine these together externally. The initial driver didn't attempt to support this, in anticipation of a more general solution for IRQ sharing between system PMU instances, but that's still a way off, so let's make this intermediate step for now to at least allow sharing IRQs within an individual NI instance. Now that CPU affinity and migration are cleaned up, it's fairly straightforward to adopt similar logic to arm-cmn, to identify CDs with a common interrupt and loop over them directly in the handler. Signed-off-by: Shouping Wang <allen.wang@hj-micro.com> [ rm: Rework for affinity handling, cosmetics, new commit message ] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/f62db639d3b54c959ec477db7b8ccecbef1ca310.1752256072.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
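The idea of looping over every clock domain that shares an interrupt can be sketched in plain C. This is only an illustration under stated assumptions: `struct cd`, its fields and `handle_irq()` are hypothetical names, and the real handler reads and clears hardware overflow status rather than a flag in memory.

```c
#include <stddef.h>

/* Hypothetical clock-domain state; the fields are illustrative only. */
struct cd {
	int irq;        /* interrupt line this domain is wired to */
	int overflowed; /* stands in for a hardware overflow status bit */
	int handled;    /* how many overflows were serviced */
};

/*
 * Handler for one IRQ line: service every clock domain wired to that
 * line, since a combined interrupt does not say which domain raised it.
 * Returns the number of domains serviced (0 would map to "not ours").
 */
static int handle_irq(struct cd *cds, size_t n, int irq)
{
	int serviced = 0;

	for (size_t i = 0; i < n; i++) {
		if (cds[i].irq != irq || !cds[i].overflowed)
			continue;
		cds[i].overflowed = 0;
		cds[i].handled++;
		serviced++;
	}
	return serviced;
}
```

Domains on other interrupt lines are left untouched, so a dedicated per-domain IRQ is simply the case where the loop matches a single entry.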
2025-07-14 | perf/arm-ni: Consolidate CPU affinity handling | Robin Murphy | 1 | -40/+34
Since overflow interrupts from the individual PMUs are infrequent and unlikely to coincide, and we make no attempt to balance them across CPUs anyway, there's really not much point tracking a separate CPU affinity per PMU. Move the CPU affinity and hotplug migration up to the NI instance level. Tested-by: Shouping Wang <allen.wang@hj-micro.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/00b622872006c2f0c89485e343b1cb8caaa79c47.1752256072.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 | perf/cxlpmu: Fix typos in cxl_pmu.c comments and documentation | Alok Tiwari | 1 | -3/+3
Fix several minor typo errors in comments: - Remove duplicated word "a" in "a a VID / GroupID". - Correct "Opcopdes" to "Opcodes" in CXL spec reference. - Fix spelling of "implemnted" to "implemented". Improves code readability and documentation consistency. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://lore.kernel.org/r/20250624194350.109790-4-alok.a.tiwari@oracle.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 | perf/cxlpmu: Remove unintended newline from IRQ name format string | Alok Tiwari | 1 | -1/+1
The IRQ name format string used in devm_kasprintf() mistakenly included a newline character "\n". This could lead to confusing log output or misformatted names in sysfs or debug messages. This fix removes the newline to ensure proper IRQ naming. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://lore.kernel.org/r/20250624194350.109790-3-alok.a.tiwari@oracle.com Signed-off-by: Will Deacon <will@kernel.org>
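A user-space sketch of the same kind of fix, using `snprintf()` in place of `devm_kasprintf()`. The helper name `format_irq_name()` and the `"%s_overflow"` pattern are assumptions made for illustration; the point is only that a name string should not end in `"\n"`.

```c
#include <stdio.h>

/*
 * Build an IRQ name for a device. The format string deliberately has no
 * trailing newline: the result is used as a name, not as a log line.
 * The "_overflow" suffix is an illustrative choice.
 */
static int format_irq_name(char *buf, size_t len, const char *dev_name)
{
	return snprintf(buf, len, "%s_overflow", dev_name);
}
```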
2025-07-14 | perf/cxlpmu: Fix devm_kcalloc() argument order in cxl_pmu_probe() | Alok Tiwari | 1 | -2/+2
The previous code mistakenly swapped the count and size parameters. This fix corrects the argument order in devm_kcalloc() to follow the conventional count, size form, avoiding potential confusion or bugs. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://lore.kernel.org/r/20250624194350.109790-2-alok.a.tiwari@oracle.com Signed-off-by: Will Deacon <will@kernel.org>
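The conventional count-then-size order can be shown with standard `calloc()`, which takes its arguments in the same order. `struct counter_state` and `alloc_counters()` are hypothetical names used only for this sketch.

```c
#include <stdlib.h>

/* Hypothetical per-counter bookkeeping. */
struct counter_state {
	unsigned long long prev;
	unsigned int idx;
};

/*
 * Allocate zeroed state for n counters.
 * Argument order: number of elements first, size of one element second.
 */
static struct counter_state *alloc_counters(size_t n)
{
	return calloc(n, sizeof(struct counter_state));
}
```

Because the two arguments are multiplied, swapping them in `calloc()` requests the same number of bytes, which fits the commit's description of the problem as a source of confusion rather than a functional failure.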
2025-07-08 | perf: arm_spe: Relax period restriction | Leo Yan | 1 | -7/+11
The minimum interval specified by the PMSIDR_EL1.Interval field is a hardware recommendation. However, this value is set by the hardware designer before production. It is not an actual hardware limitation, but tools currently have no way to test shorter periods. This change relaxes the limitation by allowing any non-zero period and simplifies the code with clamp_t(). The downside is that small periods may increase the risk of AUX ring buffer overruns. When an overrun occurs, the perf core layer will trigger an irq work to disable the event and wake up the tool in user space to read the trace data. After the tool finishes reading, it will re-enable the AUX event. Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20250627163028.3503122-1-leo.yan@arm.com Signed-off-by: Will Deacon <will@kernel.org>
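A minimal sketch of clamping a requested period into a permitted range, in the spirit of the `clamp_t()` usage mentioned above. The bounds here are assumptions: `EXAMPLE_MIN_PERIOD` and `EXAMPLE_MAX_PERIOD` are made-up values, and the real limits follow from the hardware register layout rather than from this example.

```c
#include <stdint.h>

/* Hypothetical bounds; the real ones depend on the hardware field layout. */
#define EXAMPLE_MIN_PERIOD UINT64_C(1)
#define EXAMPLE_MAX_PERIOD UINT64_C(0xffffff00)

/* User-space stand-in for the kernel's clamp_t(u64, v, lo, hi). */
static uint64_t clamp_u64(uint64_t v, uint64_t lo, uint64_t hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/*
 * Accept any period inside [min, max] unchanged, instead of rounding
 * short periods up to a recommended minimum interval.
 */
static uint64_t fixup_period(uint64_t requested)
{
	return clamp_u64(requested, EXAMPLE_MIN_PERIOD, EXAMPLE_MAX_PERIOD);
}
```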
2025-07-08 | perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE) | Rob Herring (Arm) | 6 | -5/+982
The ARMv9.2 architecture introduces the optional Branch Record Buffer Extension (BRBE), which records information about branches as they are executed into set of branch record registers. BRBE is similar to x86's Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer (BHRB). BRBE supports filtering by exception level and can filter just the source or target address if excluded to avoid leaking privileged addresses. The h/w filter would be sufficient except when there are multiple events with disjoint filtering requirements. In this case, BRBE is configured with a union of all the events' desired branches, and then the recorded branches are filtered based on each event's filter. For example, with one event capturing kernel events and another event capturing user events, BRBE will be configured to capture both kernel and user branches. When handling event overflow, the branch records have to be filtered by software to only include kernel or user branch addresses for that event. In contrast, x86 simply configures LBR using the last installed event which seems broken. It is possible on x86 to configure branch filter such that no branches are ever recorded (e.g. -j save_type). For BRBE, events with a configuration that will result in no samples are rejected. Recording branches in KVM guests is not supported like x86. However, perf on x86 allows requesting branch recording in guests. The guest events are recorded, but the resulting branches are all from the host. For BRBE, events with branch recording and "exclude_host" set are rejected. Requiring "exclude_guest" to be set did not work. The default for the perf tool does set "exclude_guest" if no exception level options are specified. However, specifying kernel or user events defaults to including both host and guest. In this case, only host branches are recorded. BRBE can support some additional exception branch types compared to x86. On x86, all exceptions other than syscalls are recorded as IRQ. 
With BRBE, it is possible to better categorize these exceptions. One limitation relative to x86 is we cannot distinguish a syscall return from other exception returns. So all exception returns are recorded as ERET type. The FIQ branch type is omitted as the only FIQ user is Apple platforms which don't support BRBE. The debug branch types are omitted as there is no clear need for them. BRBE records are invalidated whenever events are reconfigured, a new task is scheduled in, or after recording is paused (and the records have been recorded for the event). The architecture allows branch records to be invalidated by the PE under implementation defined conditions. It is expected that these conditions are rare. Cc: Catalin Marinas <catalin.marinas@arm.com> Co-developed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Co-developed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> tested-by: Adam Young <admiyo@os.amperecomputing.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20250611-arm-brbe-v19-v23-4-e7775563036e@kernel.org [will: Fix sparse warnings about mixed declarations and code. Fix C99 comment syntax.] Signed-off-by: Will Deacon <will@kernel.org>
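The union-then-refilter scheme described in this commit message (program the hardware with the union of all events' filters, then filter the captured records per event in software) can be sketched as follows. All names here are hypothetical: `FILTER_USER`, `FILTER_KERNEL`, `struct branch`, `union_filter()` and `filter_records()` are illustrative and are not taken from the driver, which works on real branch record registers and a richer set of filter bits.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical filter bits. */
#define FILTER_USER   (1u << 0)
#define FILTER_KERNEL (1u << 1)

/* Hypothetical captured branch record. */
struct branch {
	unsigned long from;
	unsigned long to;
	bool kernel; /* true if the branch was taken in kernel mode */
};

/* The hardware is configured with the union of every event's filter. */
static unsigned int union_filter(const unsigned int *filters, size_t n)
{
	unsigned int u = 0;

	for (size_t i = 0; i < n; i++)
		u |= filters[i];
	return u;
}

/*
 * On overflow, keep only the records that match one event's own filter.
 * Copies matching records to out and returns how many were kept.
 */
static size_t filter_records(const struct branch *in, size_t n,
			     unsigned int filter, struct branch *out)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned int kind = in[i].kernel ? FILTER_KERNEL : FILTER_USER;

		if (filter & kind)
			out[kept++] = in[i];
	}
	return kept;
}
```

In the kernel-plus-user example from the message, the union covers both modes, and each event then sees only its own subset of the shared records.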
2025-07-04 | perf/arm: Add missing .suppress_bind_attrs | Robin Murphy | 2 | -0/+2
PMU drivers should set .suppress_bind_attrs so that userspace is denied the opportunity to pull the driver out from underneath an in-use PMU (with predictably unpleasant consequences). Somehow both the CMN and NI drivers have managed to miss this; put that right. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Link: https://lore.kernel.org/r/acd48c341b33b96804a3969ee00b355d40c546e2.1751465293.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-04 | perf/arm-cmn: Reduce stack usage during discovery | Robin Murphy | 1 | -7/+8
Arnd reports that Clang's aggressive inlining of arm_cmn_discover() can lead to stack frame size warnings, and while we could simply prevent such inlining to hide the issue, it seems more productive to actually heed the warning and do something about the overall stack footprint. The xp_region array is already rather large, and CMN_MAX_XPS might only grow larger in future, however it only serves as a convenience to save repeating the first level's worth of register reads in the second pass of discovery. There's no performance concern here, and it only takes a small tweak to the flow to re-extract the offsets instead of stashing them, so let's just do that and save several hundred bytes of stack. Reported-by: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/e7dd41bf0f1b098e2e4b01ef91318a4b272abff8.1751046159.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-04 | perf: imx9_perf: make the read-only array mask static const | Colin Ian King | 1 | -3/+5
Don't populate the read-only array mask on the stack at run time, instead make it static const. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20250611133917.170888-1-colin.i.king@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
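The pattern can be sketched in a few lines of standalone C. The function name `mask_for()` and the table contents are hypothetical; only the `static const` placement reflects what the commit describes.

```c
#include <stddef.h>

/*
 * Look up a bit mask by index. Declaring the table static const lets the
 * compiler place it in read-only data once, rather than populating a
 * local array on the stack each time the function runs.
 */
static unsigned int mask_for(size_t i)
{
	static const unsigned int mask[] = { 0x1, 0x3, 0x7, 0xf }; /* made-up values */

	return i < sizeof(mask) / sizeof(mask[0]) ? mask[i] : 0;
}
```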
2025-07-04 | perf/arm-cmn: Broaden module description for wider interconnect support | Zhiyuan Dai | 1 | -2/+2
The current MODULE_DESCRIPTION only mentions CMN-600, but this driver now supports several Arm mesh interconnects including CMN-650, CMN-700, CI-700, and CMN-S3. Update the MODULE_DESCRIPTION to reflect the expanded scope. Signed-off-by: Zhiyuan Dai <daizhiyuan@phytium.com.cn> Link: https://lore.kernel.org/r/20250522032122.949373-1-daizhiyuan@phytium.com.cn Signed-off-by: Will Deacon <will@kernel.org>
2025-07-04 | perf/arm-ni: Set initial IRQ affinity | Robin Murphy | 1 | -0/+2
While we do request our IRQs with the right flags to stop their affinity changing unexpectedly, we forgot to actually set it to start with. Oops. Cc: stable@vger.kernel.org Fixes: 4d5a7680f2b4 ("perf: Add driver for Arm NI-700 interconnect PMU") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Shouping Wang <allen.wang@hj-micro.com> Link: https://lore.kernel.org/r/614ced9149ee8324e58930862bd82cbf46228d27.1747149165.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-29 | Merge tag 'arm64-upstream' of ↵ | Linus Torvalds | 4 | -32/+30
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "The headline feature is the re-enablement of support for Arm's Scalable Matrix Extension (SME) thanks to a bumper crop of fixes from Mark Rutland. If matrices aren't your thing, then Ryan's page-table optimisation work is much more interesting. Summary: ACPI, EFI and PSCI: - Decouple Arm's "Software Delegated Exception Interface" (SDEI) support from the ACPI GHES code so that it can be used by platforms booted with device-tree - Remove unnecessary per-CPU tracking of the FPSIMD state across EFI runtime calls - Fix a node refcount imbalance in the PSCI device-tree code CPU Features: - Ensure register sanitisation is applied to fields in ID_AA64MMFR4 - Expose AIDR_EL1 to userspace via sysfs, primarily so that KVM guests can reliably query the underlying CPU types from the VMM - Re-enabling of SME support (CONFIG_ARM64_SME) as a result of fixes to our context-switching, signal handling and ptrace code Entry code: - Hook up TIF_NEED_RESCHED_LAZY so that CONFIG_PREEMPT_LAZY can be selected Memory management: - Prevent BSS exports from being used by the early PI code - Propagate level and stride information to the low-level TLB invalidation routines when operating on hugetlb entries - Use the page-table contiguous hint for vmap() mappings with VM_ALLOW_HUGE_VMAP where possible - Optimise vmalloc()/vmap() page-table updates to use "lazy MMU mode" and hook this up on arm64 so that the trailing DSB (used to publish the updates to the hardware walker) can be deferred until the end of the mapping operation - Extend mmap() randomisation for 52-bit virtual addresses (on par with 48-bit addressing) and remove limited support for randomisation of the linear map Perf and PMUs: - Add support for probing the CMN-S3 driver using ACPI - Minor driver fixes to the CMN, Arm-NI and amlogic PMU drivers Selftests: - Fix FPSIMD and SME tests to align with the freshly re-enabled SME support - Fix 
default setting of the OUTPUT variable so that tests are installed in the right location vDSO: - Replace raw counter access from inline assembly code with a call to the the __arch_counter_get_cntvct() helper function Miscellaneous: - Add some missing header inclusions to the CCA headers - Rework rendering of /proc/cpuinfo to follow the x86-approach and avoid repeated buffer expansion (the user-visible format remains identical) - Remove redundant selection of CONFIG_CRC32 - Extend early error message when failing to map the device-tree blob" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (83 commits) arm64: cputype: Add cputype definition for HIP12 arm64: el2_setup.h: Make __init_el2_fgt labels consistent, again perf/arm-cmn: Add CMN S3 ACPI binding arm64/boot: Disallow BSS exports to startup code arm64/boot: Move global CPU override variables out of BSS arm64/boot: Move init_pgdir[] and init_idmap_pgdir[] into __pi_ namespace perf/arm-cmn: Initialise cmn->cpu earlier kselftest/arm64: Set default OUTPUT path when undefined arm64: Update comment regarding values in __boot_cpu_mode arm64: mm: Drop redundant check in pmd_trans_huge() arm64/mm: Re-organise setting up FEAT_S1PIE registers PIRE0_EL1 and PIR_EL1 arm64/mm: Permit lazy_mmu_mode to be nested arm64/mm: Disable barrier batching in interrupt contexts arm64/cpuinfo: only show one cpu's info in c_show() arm64/mm: Batch barriers when updating kernel mappings mm/vmalloc: Enter lazy mmu mode while manipulating vmalloc ptes arm64/mm: Support huge pte-mapped pages in vmap mm/vmalloc: Gracefully unmap huge ptes mm/vmalloc: Warn on improper use of vunmap_range() arm64/mm: Hoist barriers out of set_ptes_anysz() loop ...
2025-05-21 | perf/apple_m1: Remove driver-specific throttle support | Kan Liang | 1 | -2/+1
The throttle support has been added in the generic code. Remove the driver-specific throttle support. Besides the throttle, perf_event_overflow may return true because of event_limit. It already does an inatomic event disable. The pmu->stop is not required either. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250520181644.2673067-10-kan.liang@linux.intel.com
2025-05-21 | perf/arm: Remove driver-specific throttle support | Kan Liang | 4 | -10/+5
The throttle support has been added in the generic code. Remove the driver-specific throttle support. Besides the throttle, perf_event_overflow may return true because of event_limit. It already does an inatomic event disable. The pmu->stop is not required either. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Leo Yan <leo.yan@arm.com> Link: https://lore.kernel.org/r/20250520181644.2673067-9-kan.liang@linux.intel.com
2025-05-19 | perf/arm-cmn: Add CMN S3 ACPI binding | Robin Murphy | 1 | -0/+1
An ACPI binding for CMN S3 was not yet finalised when the driver support was originally written, but v1.2 of DEN0093 "ACPI for Arm Components" has at last been published; support ACPI systems using the proper HID. Cc: stable@vger.kernel.org Fixes: 0dc2f4963f7e ("perf/arm-cmn: Support CMN S3") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/7dafe147f186423020af49d7037552ee59c60e97.1747652164.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-16 | perf/arm-cmn: Initialise cmn->cpu earlier | Robin Murphy | 1 | -1/+1
For all the complexity of handling affinity for CPU hotplug, what we've apparently managed to overlook is that arm_cmn_init_irqs() has in fact always been setting the *initial* affinity of all IRQs to CPU 0, not the CPU we subsequently choose for event scheduling. Oh dear. Cc: stable@vger.kernel.org Fixes: 0ba64770a2f2 ("perf: Add Arm CMN-600 PMU driver") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/b12fccba6b5b4d2674944f59e4daad91cd63420b.1747069914.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-09 | perf/amlogic: Replace smp_processor_id() with raw_smp_processor_id() in ↵ | Anand Moon | 1 | -1/+1
meson_ddr_pmu_create() The Amlogic DDR PMU driver meson_ddr_pmu_create() function incorrectly uses smp_processor_id(), which assumes disabled preemption. This leads to kernel warnings during module loading because meson_ddr_pmu_create() can be called in a preemptible context. Following kernel warning and stack trace: [ 31.745138] [ T2289] BUG: using smp_processor_id() in preemptible [00000000] code: (udev-worker)/2289 [ 31.745154] [ T2289] caller is debug_smp_processor_id+0x28/0x38 [ 31.745172] [ T2289] CPU: 4 UID: 0 PID: 2289 Comm: (udev-worker) Tainted: GW 6.14.0-0-MANJARO-ARM #1 59519addcbca6ba8de735e151fd7b9e97aac7ff0 [ 31.745181] [ T2289] Tainted: [W]=WARN [ 31.745183] [ T2289] Hardware name: Hardkernel ODROID-N2Plus (DT) [ 31.745188] [ T2289] Call trace: [ 31.745191] [ T2289] show_stack+0x28/0x40 (C) [ 31.745199] [ T2289] dump_stack_lvl+0x4c/0x198 [ 31.745205] [ T2289] dump_stack+0x20/0x50 [ 31.745209] [ T2289] check_preemption_disabled+0xec/0xf0 [ 31.745213] [ T2289] debug_smp_processor_id+0x28/0x38 [ 31.745216] [ T2289] meson_ddr_pmu_create+0x200/0x560 [meson_ddr_pmu_g12 8095101c49676ad138d9961e3eddaee10acca7bd] [ 31.745237] [ T2289] g12_ddr_pmu_probe+0x20/0x38 [meson_ddr_pmu_g12 8095101c49676ad138d9961e3eddaee10acca7bd] [ 31.745246] [ T2289] platform_probe+0x98/0xe0 [ 31.745254] [ T2289] really_probe+0x144/0x3f8 [ 31.745258] [ T2289] __driver_probe_device+0xb8/0x180 [ 31.745261] [ T2289] driver_probe_device+0x54/0x268 [ 31.745264] [ T2289] __driver_attach+0x11c/0x288 [ 31.745267] [ T2289] bus_for_each_dev+0xfc/0x160 [ 31.745274] [ T2289] driver_attach+0x34/0x50 [ 31.745277] [ T2289] bus_add_driver+0x160/0x2b0 [ 31.745281] [ T2289] driver_register+0x78/0x120 [ 31.745285] [ T2289] __platform_driver_register+0x30/0x48 [ 31.745288] [ T2289] init_module+0x30/0xfe0 [meson_ddr_pmu_g12 8095101c49676ad138d9961e3eddaee10acca7bd] [ 31.745298] [ T2289] do_one_initcall+0x11c/0x438 [ 31.745303] [ T2289] do_init_module+0x68/0x228 [ 31.745311] [ T2289] 
load_module+0x118c/0x13a8 [ 31.745315] [ T2289] __arm64_sys_finit_module+0x274/0x390 [ 31.745320] [ T2289] invoke_syscall+0x74/0x108 [ 31.745326] [ T2289] el0_svc_common+0x90/0xf8 [ 31.745330] [ T2289] do_el0_svc+0x2c/0x48 [ 31.745333] [ T2289] el0_svc+0x60/0x150 [ 31.745337] [ T2289] el0t_64_sync_handler+0x80/0x118 [ 31.745341] [ T2289] el0t_64_sync+0x1b8/0x1c0 Changes replaces smp_processor_id() with raw_smp_processor_id() to ensure safe CPU ID retrieval in preemptible contexts. Cc: Jiucheng Xu <jiucheng.xu@amlogic.com> Fixes: 2016e2113d35 ("perf/amlogic: Add support for Amlogic meson G12 SoC DDR PMU driver") Signed-off-by: Anand Moon <linux.amoon@gmail.com> Link: https://lore.kernel.org/r/20250407063206.5211-1-linux.amoon@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-09 | perf/arm-cmn: Fix REQ2/SNP2 mixup | Robin Murphy | 1 | -4/+4
Somehow the encodings for REQ2/SNP2 channels in XP events got mixed up... Unmix them. CC: stable@vger.kernel.org Fixes: 23760a014417 ("perf/arm-cmn: Add CMN-700 support") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/087023e9737ac93d7ec7a841da904758c254cb01.1746717400.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-04-17 | perf: Do not enable by default during compile testing | Krzysztof Kozlowski | 1 | -1/+1
Enabling compile testing should not automatically enable all drivers; it should only make it possible to choose to compile them. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20250417074650.81561-1-krzysztof.kozlowski@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2025-04-17 | perf: arm-ni: Fix missing platform_set_drvdata() | Hongbo Yao | 1 | -0/+1
Add missing platform_set_drvdata in arm_ni_probe(), otherwise calling platform_get_drvdata() in remove returns NULL. Fixes: 4d5a7680f2b4 ("perf: Add driver for Arm NI-700 interconnect PMU") Signed-off-by: Hongbo Yao <andy.xu@hj-micro.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20250401054248.3985814-1-andy.xu@hj-micro.com Signed-off-by: Will Deacon <will@kernel.org>
2025-04-17 | perf: arm-ni: Unregister PMUs on probe failure | Hongbo Yao | 1 | -18/+21
When a resource allocation fails in one clock domain of an NI device, we need to properly roll back all previously registered perf PMUs in other clock domains of the same device. Otherwise, it can lead to kernel panics. Calling arm_ni_init+0x0/0xff8 [arm_ni] @ 2374 arm-ni ARMHCB70:00: Failed to request PMU region 0x1f3c13000 arm-ni ARMHCB70:00: probe with driver arm-ni failed with error -16 list_add corruption: next->prev should be prev (fffffd01e9698a18), but was 0000000000000000. (next=ffff10001a0decc8). pstate: 6340009 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--) pc : list_add_valid_or_report+0x7c/0xb8 lr : list_add_valid_or_report+0x7c/0xb8 Call trace: __list_add_valid_or_report+0x7c/0xb8 perf_pmu_register+0x22c/0x3a0 arm_ni_probe+0x554/0x70c [arm_ni] platform_probe+0x70/0xe8 really_probe+0xc6/0x4d8 driver_probe_device+0x48/0x170 __driver_attach+0x8e/0x1c0 bus_for_each_dev+0x64/0xf0 driver_add+0x138/0x260 bus_add_driver+0x68/0x138 __platform_driver_register+0x2c/0x40 arm_ni_init+0x14/0x2a [arm_ni] do_init_module+0x36/0x298 ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Oops - BUG: Fatal exception SMP: stopping secondary CPUs Fixes: 4d5a7680f2b4 ("perf: Add driver for Arm NI-700 interconnect PMU") Signed-off-by: Hongbo Yao <andy.xu@hj-micro.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20250403070918.4153839-1-andy.xu@hj-micro.com Signed-off-by: Will Deacon <will@kernel.org>
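The rollback requirement described in this commit (undo every earlier registration when a later clock domain fails) follows a common unwind pattern, sketched here in user-space C. `struct cd`, `cd_register()`, `cd_unregister()` and the `fail_at` parameter are hypothetical stand-ins for the real registration calls; the error value -16 simply mirrors the one shown in the log above.

```c
#include <stddef.h>

/* Hypothetical clock-domain state. */
struct cd {
	int registered;
};

/* Stand-in for PMU registration; fails at index fail_at to simulate an error. */
static int cd_register(struct cd *cd, int idx, int fail_at)
{
	if (idx == fail_at)
		return -16; /* same error code as in the log above */
	cd->registered = 1;
	return 0;
}

/* Stand-in for PMU unregistration. */
static void cd_unregister(struct cd *cd)
{
	cd->registered = 0;
}

/*
 * Register n clock domains. If one fails, unregister everything that was
 * registered before it, in reverse order, and propagate the error.
 */
static int probe_all(struct cd *cds, int n, int fail_at)
{
	for (int i = 0; i < n; i++) {
		int err = cd_register(&cds[i], i, fail_at);

		if (err) {
			while (i-- > 0)
				cd_unregister(&cds[i]);
			return err;
		}
	}
	return 0;
}
```

Without the unwind loop, the earlier entries would stay registered while the memory backing them is released by the failed probe, which is the kind of stale state the list corruption report points at.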
2025-04-17 | perf/arm-cmn: Remove CMN-600 DTC domain special case | Robin Murphy | 1 | -7/+0
The special case for trying to infer the DTC domain for DTC-adjacent nodes on CMN-600 is fragile and buggy - currently resulting in subtly messed up DTC counter allocation - and the theoretical benefit it offers to a tiny minority of use-cases arguably doesn't outweigh the inconsistency it offers to others anyway. Just get rid of it. Fixes: ab33c66fd8f1 ("perf/arm-cmn: Enable per-DTC counter allocation") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/67985e39f53b56385d79a4f1264cf7f9cacedb58.1742308248.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-03-29 | Merge tag 'pci-v6.15-changes' of ↵ | Linus Torvalds | 1 | -22/+3
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Enable Configuration RRS SV, which makes device readiness visible, early instead of during child bus scanning (Bjorn Helgaas) - Log debug messages about reset methods being used (Bjorn Helgaas) - Avoid reset when it has been disabled via sysfs (Nishanth Aravamudan) - Add common pci-ep-bus.yaml schema for exporting several peripherals of a single PCI function via devicetree (Andrea della Porta) - Create DT nodes for PCI host bridges to enable loading device tree overlays to create platform devices for PCI devices that have several features that require multiple drivers (Herve Codina) Resource management: - Enlarge devres table[] to accommodate bridge windows, ROM, IOV BARs, etc., and validate BAR index in devres interfaces (Philipp Stanner) - Fix typo that repeatedly distributed resources to a bridge instead of iterating over subordinate bridges, which resulted in too little space to assign some BARs (Kai-Heng Feng) - Relax bridge window tail sizing for optional resources, e.g., IOV BARs, to avoid failures when removing and re-adding devices (Ilpo Järvinen) - Allow drivers to enable devices even if we haven't assigned optional IOV resources to them (Ilpo Järvinen) - Rework handling of optional resources (IOV BARs, ROMs) to reduce failures if we can't allocate them (Ilpo Järvinen) - Fix a NULL dereference in the SR-IOV VF creation error path (Shay Drory) - Fix s390 mmio_read/write syscalls, which didn't cause page faults in some cases, which broke vfio-pci lazy mapping on first access (Niklas Schnelle) - Add pdev->non_mappable_bars to replace CONFIG_VFIO_PCI_MMAP, which was disabled only for s390 (Niklas Schnelle) - Support mmap of PCI resources on s390 except for ISM devices (Niklas Schnelle) ASPM: - Delay pcie_link_state deallocation to avoid dangling pointers that cause invalid references during hot-unplug (Daniel Stodden) Power management: - Allow PCI 
bridges to go to D3Hot when suspending on all non-x86 systems (Manivannan Sadhasivam) Power control: - Create pwrctrl devices in pci_scan_device() to make it more symmetric with pci_pwrctrl_unregister() and make pwrctrl devices for PCI bridges possible (Manivannan Sadhasivam) - Unregister pwrctrl devices in pci_destroy_dev() so DOE, ASPM, etc. can still access devices after pci_stop_dev() (Manivannan Sadhasivam) - If there's a pwrctrl device for a PCI device, skip scanning it because the pwrctrl core will rescan the bus after the device is powered on (Manivannan Sadhasivam) - Add a pwrctrl driver for PCI slots based on voltage regulators described via devicetree (Manivannan Sadhasivam) Bandwidth control: - Add set_pcie_speed.sh to TEST_PROGS to fix issue when executing the set_pcie_cooling_state.sh test case (Yi Lai) - Avoid a NULL pointer dereference when we run out of bus numbers to assign for a bridge secondary bus (Lukas Wunner) Hotplug: - Drop superfluous pci_hotplug_slot_list, try_module_get() calls, and NULL pointer checks (Lukas Wunner) - Drop shpchp module init/exit logging, replace shpchp dbg() with ctrl_dbg(), and remove unused dbg(), err(), info(), warn() wrappers (Ilpo Järvinen) - Drop 'shpchp_debug' module parameter in favor of standard dynamic debugging (Ilpo Järvinen) - Drop unused cpcihp .get_power(), .set_power() function pointers (Guilherme Giacomo Simoes) - Disable hotplug interrupts in portdrv only when pciehp is not enabled to avoid issuing two hotplug commands too close together (Feng Tang) - Skip pciehp 'device replaced' check if the device has been removed to address a deadlock when resuming after a device was removed during system sleep (Lukas Wunner) - Don't enable pciehp hotplug interupt when resuming in poll mode (Ilpo Järvinen) Virtualization: - Fix bugs in 'pci=config_acs=' kernel command line parameter (Tushar Dave) DOE: - Expose supported DOE features via sysfs (Alistair Francis) - Allow DOE support to be enabled even if CXL isn't 
enabled (Alistair Francis) Endpoint framework: - Convert PCI device data so pci-epf-test works correctly on big-endian endpoint systems (Niklas Cassel) - Add BAR_RESIZABLE type to endpoint framework and add DWC core support for EPF drivers to set BAR_RESIZABLE type and size (Niklas Cassel) - Fix pci-epf-test double free that causes an oops if the host reboots and PERST# deassertion restarts endpoint BAR allocation (Christian Bruel) - Fix endpoint BAR testing so tests can skip disabled BARs instead of reporting them as failures (Niklas Cassel) - Widen endpoint test BAR size variable to accommodate BARs larger than INT_MAX (Niklas Cassel) - Remove unused tools 'pci' build target left over after moving tests to tools/testing/selftests/pci_endpoint (Jianfeng Liu) Altera PCIe controller driver: - Add DT binding and driver support for Agilex family (P-Tile, F-Tile, R-Tile) (Matthew Gerlach and D M, Sharath Kumar) AMD MDB PCIe controller driver: - Add DT binding and driver for AMD MDB (Multimedia DMA Bridge) (Thippeswamy Havalige) Broadcom STB PCIe controller driver: - Add BCM2712 MSI-X DT binding and interrupt controller drivers and add softdep on irq_bcm2712_mip driver to ensure that it is loaded first (Stanimir Varbanov) - Expand inbound window map to 64GB so it can accommodate BCM2712 (Stanimir Varbanov) - Add BCM2712 support and DT updates (Stanimir Varbanov) - Apply link speed restriction before bringing link up, not after (Jim Quinlan) - Update Max Link Speed in Link Capabilities via the internal writable register, not the read-only config register (Jim Quinlan) - Handle regulator_bulk_get() error to avoid panic when we call regulator_bulk_free() later (Jim Quinlan) - Disable regulators only when removing the bus immediately below a Root Port because we don't support regulators deeper in the hierarchy (Jim Quinlan) - Make const read-only arrays static (Colin Ian King) Cadence PCIe endpoint driver: - Correct MSG TLP generation so endpoints can generate INTx messages 
(Hans Zhang) Freescale i.MX6 PCIe controller driver: - Identify the second controller on i.MX8MQ based on devicetree 'linux,pci-domain' instead of DBI 'reg' address (Richard Zhu) - Remove imx_pcie_cpu_addr_fixup() since dwc core can now derive the ATU input address (using parent_bus_offset) from devicetree (Frank Li) Freescale Layerscape PCIe controller driver: - Drop deprecated 'num-ib-windows' and 'num-ob-windows' and unnecessary 'status' from example (Krzysztof Kozlowski) - Correct the syscon_regmap_lookup_by_phandle_args("fsl,pcie-scfg") arg_count to fix probe failure on LS1043A (Ioana Ciornei) HiSilicon STB PCIe controller driver: - Call phy_exit() to clean up if histb_pcie_probe() fails (Christophe JAILLET) Intel Gateway PCIe controller driver: - Remove intel_pcie_cpu_addr() since dwc core can now derive the ATU input address (using parent_bus_offset) from devicetree (Frank Li) Intel VMD host bridge driver: - Convert vmd_dev.cfg_lock from spinlock_t to raw_spinlock_t so pci_ops.read() will never sleep, even on PREEMPT_RT where spinlock_t becomes a sleepable lock, to avoid calling a sleeping function from invalid context (Ryo Takakura) MediaTek PCIe Gen3 controller driver: - Remove leftover mac_reset assert for Airoha EN7581 SoC (Lorenzo Bianconi) - Add EN7581 PBUS controller 'mediatek,pbus-csr' DT property and program host bridge memory aperture to this syscon node (Lorenzo Bianconi) Qualcomm PCIe controller driver: - Add qcom,pcie-ipq5332 binding (Varadarajan Narayanan) - Add qcom i.MX8QM and i.MX8QXP/DXP optional DMA interrupt (Alexander Stein) - Add optional dma-coherent DT property for Qualcomm SA8775P (Dmitry Baryshkov) - Make DT iommu property required for SA8775P and prohibited for SDX55 (Dmitry Baryshkov) - Add DT IOMMU and DMA-related properties for Qualcomm SM8450 (Dmitry Baryshkov) - Add endpoint DT properties for SAR2130P and enable endpoint mode in driver (Dmitry Baryshkov) - Describe endpoint BAR0 and BAR2 as 64-bit only and BAR1 and BAR3 as 
RESERVED (Manivannan Sadhasivam) Rockchip DesignWare PCIe controller driver: - Describe rk3568 and rk3588 BARs as Resizable, not Fixed (Niklas Cassel) Synopsys DesignWare PCIe controller driver: - Add debugfs-based Silicon Debug, Error Injection, Statistical Counter support for DWC (Shradha Todi) - Add debugfs property to expose LTSSM status of DWC PCIe link (Hans Zhang) - Add Rockchip support for DWC debugfs features (Niklas Cassel) - Add dw_pcie_parent_bus_offset() to look up the parent bus address of a specified 'reg' property and return the offset from the CPU physical address (Frank Li) - Use dw_pcie_parent_bus_offset() to derive CPU -> ATU addr offset via 'reg[config]' for host controllers and 'reg[addr_space]' for endpoint controllers (Frank Li) - Apply struct dw_pcie.parent_bus_offset in ATU users to remove use of .cpu_addr_fixup() when programming ATU (Frank Li) TI J721E PCIe driver: - Correct the 'link down' interrupt bit for J784S4 (Siddharth Vadapalli) TI Keystone PCIe controller driver: - Describe AM65x BARs 2 and 5 as Resizable (not Fixed) and reduce alignment requirement from 1MB to 64KB (Niklas Cassel) Xilinx Versal CPM PCIe controller driver: - Free IRQ domain in probe error path to avoid leaking it (Thippeswamy Havalige) - Add DT .compatible "xlnx,versal-cpm5nc-host" and driver support for Versal Net CPM5NC Root Port controller (Thippeswamy Havalige) - Add driver support for CPM5_HOST1 (Thippeswamy Havalige) Miscellaneous: - Convert fsl,mpc83xx-pcie binding to YAML (J. 
Neuschäfer) - Use for_each_available_child_of_node_scoped() to simplify apple, kirin, mediatek, mt7621, tegra drivers (Zhang Zekun)" * tag 'pci-v6.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (197 commits) PCI: layerscape: Fix arg_count to syscon_regmap_lookup_by_phandle_args() PCI: j721e: Fix the value of .linkdown_irq_regfield for J784S4 misc: pci_endpoint_test: Add support for PCITEST_IRQ_TYPE_AUTO PCI: endpoint: pci-epf-test: Expose supported IRQ types in CAPS register PCI: dw-rockchip: Endpoint mode cannot raise INTx interrupts PCI: endpoint: Add intx_capable to epc_features struct dt-bindings: PCI: Add common schema for devices accessible through PCI BARs PCI: intel-gw: Remove intel_pcie_cpu_addr() PCI: imx6: Remove imx_pcie_cpu_addr_fixup() PCI: dwc: Use parent_bus_offset to remove need for .cpu_addr_fixup() PCI: dwc: ep: Ensure proper iteration over outbound map windows PCI: dwc: ep: Use devicetree 'reg[addr_space]' to derive CPU -> ATU addr offset PCI: dwc: ep: Consolidate devicetree handling in dw_pcie_ep_get_resources() PCI: dwc: ep: Call epc_create() early in dw_pcie_ep_init() PCI: dwc: Use devicetree 'reg[config]' to derive CPU -> ATU addr offset PCI: dwc: Add dw_pcie_parent_bus_offset() checking and debug PCI: dwc: Add dw_pcie_parent_bus_offset() PCI/bwctrl: Fix NULL pointer dereference on bus number exhaustion PCI: xilinx-cpm: Add cpm_csr register mapping for CPM5_HOST1 variant PCI: brcmstb: Make const read-only arrays static ...
2025-03-26Merge tag 'lsm-pr-20250323' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm updates from Paul Moore: - Various minor updates to the LSM Rust bindings Changes include marking trivial Rust bindings as inlines and comment tweaks to better reflect the LSM hooks. - Add LSM/SELinux access controls to io_uring_allowed() Similar to the io_uring_disabled sysctl, add a LSM hook to io_uring_allowed() to enable LSMs a simple way to enforce security policy on the use of io_uring. This pull request includes SELinux support for this new control using the io_uring/allowed permission. - Remove an unused parameter from the security_perf_event_open() hook The perf_event_attr struct parameter was not used by any currently supported LSMs, remove it from the hook. - Add an explicit MAINTAINERS entry for the credentials code We've seen problems in the past where patches to the credentials code sent by non-maintainers would often languish on the lists for multiple months as there was no one explicitly tasked with the responsibility of reviewing and/or merging credentials related code. Considering that most of the code under security/ has a vested interest in ensuring that the credentials code is well maintained, I'm volunteering to look after the credentials code and Serge Hallyn has also volunteered to step up as an official reviewer. I posted the MAINTAINERS update as a RFC to LKML in hopes that someone else would jump up with an "I'll do it!", but beyond Serge it was all crickets. - Update Stephen Smalley's old email address to prevent confusion This includes a corresponding update to the mailmap file. 
* tag 'lsm-pr-20250323' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: mailmap: map Stephen Smalley's old email addresses lsm: remove old email address for Stephen Smalley MAINTAINERS: add Serge Hallyn as a credentials reviewer MAINTAINERS: add an explicit credentials entry cred,rust: mark Credential methods inline lsm,rust: reword "destroy" -> "release" in SecurityCtx lsm,rust: mark SecurityCtx methods inline perf: Remove unnecessary parameter of security check lsm: fix a missing security_uring_allowed() prototype io_uring,lsm,selinux: add LSM hooks for io_uring_setup() io_uring: refactor io_uring_allowed()
2025-03-26Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+35
Pull kvm updates from Paolo Bonzini: "ARM: - Nested virtualization support for VGICv3, giving the nested hypervisor control of the VGIC hardware when running an L2 VM - Removal of 'late' nested virtualization feature register masking, making the supported feature set directly visible to userspace - Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers - Paravirtual interface for discovering the set of CPU implementations where a VM may run, addressing a longstanding issue of guest CPU errata awareness in big-little systems and cross-implementation VM migration - Userspace control of the registers responsible for identifying a particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1), allowing VMs to be migrated cross-implementation - pKVM updates, including support for tracking stage-2 page table allocations in the protected hypervisor in the 'SecPageTable' stat - Fixes to vPMU, ensuring that userspace updates to the vPMU after KVM_RUN are reflected into the backing perf events LoongArch: - Remove unnecessary header include path - Assume constant PGD during VM context switch - Add perf events support for guest VM RISC-V: - Disable the kernel perf counter during configure - KVM selftests improvements for PMU - Fix warning at the time of KVM module removal x86: - Add support for aging of SPTEs without holding mmu_lock. Not taking mmu_lock allows multiple aging actions to run in parallel, and more importantly avoids stalling vCPUs. This includes an implementation of per-rmap-entry locking; aging the gfn is done with only a per-rmap single-bin spinlock taken, whereas locking an rmap for write requires taking both the per-rmap spinlock and the mmu_lock. Note that this decreases slightly the accuracy of accessed-page information, because changes to the SPTE outside aging might not use atomic operations even if they could race against a clear of the Accessed bit. 
This is deliberate because KVM and mm/ tolerate false positives/negatives for accessed information, and testing has shown that reducing the latency of aging is far more beneficial to overall system performance than providing "perfect" young/old information. - Defer runtime CPUID updates until KVM emulates a CPUID instruction, to coalesce updates when multiple pieces of vCPU state are changing, e.g. as part of a nested transition - Fix a variety of nested emulation bugs, and add VMX support for synthesizing nested VM-Exit on interception (instead of injecting #UD into L2) - Drop "support" for async page faults for protected guests that do not set SEND_ALWAYS (i.e. that only want async page faults at CPL3) - Bring a bit of sanity to x86's VM teardown code, which has accumulated a lot of cruft over the years. Particularly, destroy vCPUs before the MMU, despite the latter being a VM-wide operation - Add common secure TSC infrastructure for use within SNP and in the future TDX - Block KVM_CAP_SYNC_REGS if guest state is protected. 
It does not make sense to use the capability if the relevant registers are not available for reading or writing - Don't take kvm->lock when iterating over vCPUs in the suspend notifier to fix a largely theoretical deadlock - Use the vCPU's actual Xen PV clock information when starting the Xen timer, as the cached state in arch.hv_clock can be stale/bogus - Fix a bug where KVM could bleed PVCLOCK_GUEST_STOPPED across different PV clocks; restrict PVCLOCK_GUEST_STOPPED to kvmclock, as KVM's suspend notifier only accounts for kvmclock, and there's no evidence that the flag is actually supported by Xen guests - Clean up the per-vCPU "cache" of its reference pvclock, and instead only track the vCPU's TSC scaling (multiplier+shift) metadata (which is moderately expensive to compute, and rarely changes for modern setups) - Don't write to the Xen hypercall page on MSR writes that are initiated by the host (userspace or KVM) to fix a class of bugs where KVM can write to guest memory at unexpected times, e.g. during vCPU creation if userspace has set the Xen hypercall MSR index to collide with an MSR that KVM emulates - Restrict the Xen hypercall MSR index to the unofficial synthetic range to reduce the set of possible collisions with MSRs that are emulated by KVM (collisions can still happen as KVM emulates Hyper-V MSRs, which also reside in the synthetic range) - Clean up and optimize KVM's handling of Xen MSR writes and xen_hvm_config - Update Xen TSC leaves during CPUID emulation instead of modifying the CPUID entries when updating PV clocks; there is no guarantee PV clocks will be updated between TSC frequency changes and CPUID emulation, and guest reads of the TSC leaves should be rare, i.e. 
are not a hot path x86 (Intel): - Fix a bug where KVM unnecessarily reads XFD_ERR from hardware and thus modifies the vCPU's XFD_ERR on a #NM due to CR0.TS=1 - Pass XFD_ERR as the payload when injecting #NM, as a preparatory step for upcoming FRED virtualization support - Decouple the EPT entry RWX protection bit macros from the EPT Violation bits, both as a general cleanup and in anticipation of adding support for emulating Mode-Based Execution Control (MBEC) - Reject KVM_RUN if userspace manages to gain control and stuff invalid guest state while KVM is in the middle of emulating nested VM-Enter - Add a macro to handle KVM's sanity checks on entry/exit VMCS control pairs in anticipation of adding sanity checks for secondary exit controls (the primary field is out of bits) x86 (AMD): - Ensure the PSP driver is initialized when both the PSP and KVM modules are built-in (the initcall framework doesn't handle dependencies) - Use long-term pins when registering encrypted memory regions, so that the pages are migrated out of MIGRATE_CMA/ZONE_MOVABLE and don't lead to excessive fragmentation - Add macros and helpers for setting GHCB return/error codes - Add support for Idle HLT interception, which elides interception if the vCPU has a pending, unmasked virtual IRQ when HLT is executed - Fix a bug in INVPCID emulation where KVM fails to check for a non-canonical address - Don't attempt VMRUN for SEV-ES+ guests if the vCPU's VMSA is invalid, e.g. because the vCPU was "destroyed" via SNP's AP Creation hypercall - Reject SNP AP Creation if the requested SEV features for the vCPU don't match the VM's configured set of features Selftests: - Fix again the Intel PMU counters test; add a data load and do CLFLUSH{OPT} on the data instead of executing code. 
The theory is that modern Intel CPUs have learned new code prefetching tricks that bypass the PMU counters - Fix a flaw in the Intel PMU counters test where it asserts that an event is counting correctly without actually knowing what the event counts on the underlying hardware - Fix a variety of flaws, bugs, and false failures/passes in dirty_log_test, and improve its coverage by collecting all dirty entries on each iteration - Fix a few minor bugs related to handling of stats FDs - Add infrastructure to make vCPU and VM stats FDs available to tests by default (open the FDs during VM/vCPU creation) - Relax an assertion on the number of HLT exits in the xAPIC IPI test when running on a CPU that supports AMD's Idle HLT (which elides interception of HLT if a virtual IRQ is pending and unmasked)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (216 commits) RISC-V: KVM: Optimize comments in kvm_riscv_vcpu_isa_disable_allowed RISC-V: KVM: Teardown riscv specific bits after kvm_exit LoongArch: KVM: Register perf callbacks for guest LoongArch: KVM: Implement arch-specific functions for guest perf LoongArch: KVM: Add stub for kvm_arch_vcpu_preempted_in_kernel() LoongArch: KVM: Remove PGD saving during VM context switch LoongArch: KVM: Remove unnecessary header include path KVM: arm64: Tear down vGIC on failed vCPU creation KVM: arm64: PMU: Reload when resetting KVM: arm64: PMU: Reload when user modifies registers KVM: arm64: PMU: Fix SET_ONE_REG for vPMC regs KVM: arm64: PMU: Assume PMU presence in pmu-emul.c KVM: arm64: PMU: Set raw values from user to PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu KVM: arm64: Factor out pKVM hyp vcpu creation to separate function KVM: arm64: Initialize HCRX_EL2 traps in pKVM KVM: arm64: Factor out setting HCRX_EL2 traps into separate function KVM: x86: block KVM_CAP_SYNC_REGS if guest state is protected KVM: x86: Add infrastructure for secure TSC KVM: x86: Push 
down setting vcpu.arch.user_set_tsc ...
2025-03-25Merge tag 'arm64-upstream' of ↵Linus Torvalds10-197/+190
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "Nothing major this time around. Apart from the usual perf/PMU updates, some page table cleanups, the notable features are average CPU frequency based on the AMUv1 counters, CONFIG_HOTPLUG_SMT and MOPS instructions (memcpy/memset) in the uaccess routines. Perf and PMUs: - Support for the 'Rainier' CPU PMU from Arm - Preparatory driver changes and cleanups that pave the way for BRBE support - Support for partial virtualisation of the Apple-M1 PMU - Support for the second event filter in Arm CSPMU designs - Minor fixes and cleanups (CMN and DWC PMUs) - Enable EL2 requirements for FEAT_PMUv3p9 Power, CPU topology: - Support for AMUv1-based average CPU frequency - Run-time SMT control wired up for arm64 (CONFIG_HOTPLUG_SMT). It adds a generic topology_is_primary_thread() function overridden by x86 and powerpc New(ish) features: - MOPS (memcpy/memset) support for the uaccess routines Security/confidential compute: - Fix the DMA address for devices used in Realms with Arm CCA. 
The CCA architecture uses the address bit to differentiate between shared and private addresses - Spectre-BHB: assume CPUs Linux doesn't know about vulnerable by default Memory management clean-ups: - Drop the P*D_TABLE_BIT definition in preparation for 128-bit PTEs - Some minor page table accessor clean-ups - PIE/POE (permission indirection/overlay) helpers clean-up Kselftests: - MTE: skip hugetlb tests if MTE is not supported on such mappings and use correct naming for sync/async tag checking modes Miscellaneous: - Add a PKEY_UNRESTRICTED definition as 0 to uapi (toolchain people request) - Sysreg updates for new register fields - CPU type info for some Qualcomm Kryo cores" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (72 commits) arm64: mm: Don't use %pK through printk perf/arm_cspmu: Fix missing io.h include arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists arm64: cputype: Add MIDR_CORTEX_A76AE arm64: errata: Add KRYO 2XX/3XX/4XX silver cores to Spectre BHB safe list arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB arm64: errata: Add QCOM_KRYO_4XX_GOLD to the spectre_bhb_k24_list arm64/sysreg: Enforce whole word match for open/close tokens arm64/sysreg: Fix unbalanced closing block arm64: Kconfig: Enable HOTPLUG_SMT arm64: topology: Support SMT control on ACPI based system arch_topology: Support SMT control for OF based system cpu/SMT: Provide a default topology_is_primary_thread() arm64/mm: Define PTDESC_ORDER perf/arm_cspmu: Add PMEVFILT2R support perf/arm_cspmu: Generalise event filtering perf/arm_cspmu: Move register definitons to header arm64/kernel: Always use level 2 or higher for early mappings arm64/mm: Drop PXD_TABLE_BIT arm64/mm: Check pmd_table() in pmd_trans_huge() ...
2025-03-17perf/arm_cspmu: Fix missing io.h includeRobin Murphy1-0/+1
Adding the writel() calls needs io.h, which apparently gets transitively included somewhere on arm64, but not elsewhere. Fixes: 6de0298a3925 ("perf/arm_cspmu: Generalise event filtering") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202503150649.Dol8RBSh-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202503152245.cAG4FMfi-lkp@intel.com/ Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/657935ca177024ad08d5ec6f85e8faf75f82cf65.1742212833.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-03-14perf/arm_cspmu: Add PMEVFILT2R supportRobin Murphy2-2/+8
Architecturally we have two filters for each regular event counter, so add generic support for the second one too. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/b11be3f23a72bc27088b115099c8fe865b70babc.1741190362.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-03-14perf/arm_cspmu: Generalise event filteringRobin Murphy4-40/+42
The notion of a single u32 filter value for any event doesn't scale well when the potential architectural scope is already two 64-bit values, and implementations may add custom stuff on the side too. Rather than try to thread arbitrary filter data through the common path, let's just make the set_ev_filter op self-contained in terms of parsing and configuring any and all filtering for the given event - splitting out a distinct op for cycles events which inherently differ - and let implementations override the whole thing if they want to do something different. This already allows the Ampere code to stop looking a bit hacky. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/c0cd4d4c12566dbf1b062ccd60241b3e0639f4cc.1741190362.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
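The self-contained filter op described above can be pictured with a small user-space sketch. This is only an illustration under stated assumptions: every name here (`fake_pmu`, `fake_pmu_ops`, `fake_event`, the `vendor_` override, the plain integer "registers", and the name `set_cc_filter` for the cycles op) is hypothetical and is not taken from the arm_cspmu driver. It shows a default op that parses the event itself, a separate op for cycles events, and an implementation overriding the whole event filter op.

```c
#include <assert.h>
#include <stdint.h>

/* All names are hypothetical; they only mirror the shape of the idea. */
struct fake_event {
	uint64_t filter;	/* first filter value requested for the event */
	uint64_t filter2;	/* extra, implementation-specific filter value */
};

struct fake_pmu;

struct fake_pmu_ops {
	/* Each op parses the event and programs every filter it needs. */
	void (*set_ev_filter)(struct fake_pmu *pmu, const struct fake_event *ev);
	void (*set_cc_filter)(struct fake_pmu *pmu, const struct fake_event *ev);
};

struct fake_pmu {
	struct fake_pmu_ops ops;
	uint32_t evfilt;	/* stand-in for an event filter register */
	uint32_t ccfilt;	/* stand-in for a cycle counter filter register */
	uint32_t impdef;	/* stand-in for an implementation-defined register */
};

static void default_set_ev_filter(struct fake_pmu *pmu, const struct fake_event *ev)
{
	pmu->evfilt = (uint32_t)ev->filter;
}

static void default_set_cc_filter(struct fake_pmu *pmu, const struct fake_event *ev)
{
	pmu->ccfilt = (uint32_t)ev->filter;
}

/* An implementation may replace the whole op and program extra state. */
static void vendor_set_ev_filter(struct fake_pmu *pmu, const struct fake_event *ev)
{
	pmu->evfilt = (uint32_t)ev->filter;
	pmu->impdef = (uint32_t)ev->filter2;
}

/* Fill in defaults for any op the implementation did not override. */
static void fake_pmu_init_ops(struct fake_pmu *pmu)
{
	if (!pmu->ops.set_ev_filter)
		pmu->ops.set_ev_filter = default_set_ev_filter;
	if (!pmu->ops.set_cc_filter)
		pmu->ops.set_cc_filter = default_set_cc_filter;
}

/* Returns 1 when both the default and the overridden paths behave as expected. */
int filter_demo(void)
{
	struct fake_event ev = { .filter = 0x12, .filter2 = 0x34 };
	struct fake_pmu generic = { 0 };
	struct fake_pmu vendor = { .ops = { .set_ev_filter = vendor_set_ev_filter } };

	fake_pmu_init_ops(&generic);
	fake_pmu_init_ops(&vendor);

	generic.ops.set_ev_filter(&generic, &ev);
	generic.ops.set_cc_filter(&generic, &ev);
	vendor.ops.set_ev_filter(&vendor, &ev);

	return generic.evfilt == 0x12 && generic.ccfilt == 0x12 &&
	       generic.impdef == 0 &&
	       vendor.evfilt == 0x12 && vendor.impdef == 0x34;
}
```

The common path in this sketch never sees filter data; it only calls the op, which is the property the commit message is after.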
2025-03-14perf/arm_cspmu: Move register definitons to headerRobin Murphy3-49/+50
Implementations may occasionally want to refer to register offsets, so for the sake of consistency move all of the register definitions to join the PMIIDR fields in the private header where they can be shared. As an example nicety, we can then define Ampere's imp-def filters in terms of the architectural PMIMPDEF range rather than open-coded offsets. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/5a3c796560665b51cb63fec0d473afd8f8d0a836.1741190362.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-03-14Merge branch 'perf/m1-guest-events' of ↵Will Deacon1-21/+45
git://git.kernel.org/pub/scm/linux/kernel/git/oupton/linux into for-next/perf Pull Apple-M1 PMU driver changes from Oliver Upton, which form a prefix of the series in the KVM/Arm tree that allows the PMU to be virtualised. Sort of, anyway. * 'perf/m1-guest-events' of git://git.kernel.org/pub/scm/linux/kernel/git/oupton/linux: drivers/perf: apple_m1: Support host/guest event filtering drivers/perf: apple_m1: Refactor event select/filter configuration
2025-03-11drivers/perf: apple_m1: Provide helper for mapping PMUv3 eventsOliver Upton1-0/+35
Apple M* parts carry some IMP DEF traps for guest accesses to PMUv3 registers, even though the underlying hardware doesn't implement PMUv3. This means it is possible to virtualize PMUv3 for KVM guests. Add a helper for mapping common PMUv3 event IDs onto hardware event IDs, keeping the implementation-specific crud in the PMU driver rather than KVM proper. Populate the pmceid_bitmap based on the supported events so KVM can provide synthetic PMCEID* values to the guest. Tested-by: Janne Grunau <j@jannau.net> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250305202641.428114-13-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-11drivers/perf: apple_m1: Support host/guest event filteringOliver Upton1-4/+16
The PMU appears to have a separate register for filtering 'guest' exception levels (i.e. EL1 and !ELIsInHost(EL0)) which has the same layout as PMCR1_EL1. Conveniently, there exists a VHE register alias (PMCR1_EL12) that can be used to configure it. Support guest events by programming the EL12 register with the intended guest kernel/userspace filters. Limit support for guest events to VHE (i.e. kernel running at EL2), as it avoids involving KVM to context switch PMU registers. VHE is the only supported mode on M* parts anyway, so this isn't an actual feature limitation. Tested-by: Janne Grunau <j@jannau.net> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250305202641.428114-3-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-11drivers/perf: apple_m1: Refactor event select/filter configurationOliver Upton1-20/+32
Supporting guest mode events will necessitate programming two event filters. Prepare by splitting up the programming of the event selector + event filter into separate headers. Opportunistically replace RMW patterns with sysreg_clear_set_s(). Tested-by: Janne Grunau <j@jannau.net> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250305202641.428114-2-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
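As a rough illustration of the read-modify-write cleanup mentioned above, the sketch below contrasts an open-coded RMW with a clear/set helper. It runs in user space against a plain variable; `fake_reg`, `rmw_open_coded()` and `reg_clear_set()` are hypothetical names, and the helper only imitates the general shape of a clear-then-set accessor, not the kernel's actual `sysreg_clear_set_s()` macro.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a system register; not a real sysreg accessor. */
static uint64_t fake_reg;

/* Open-coded read-modify-write, repeated at every call site. */
static void rmw_open_coded(uint64_t clear, uint64_t set)
{
	uint64_t val = fake_reg;	/* read */

	val &= ~clear;			/* modify: drop the old field */
	val |= set;			/* modify: install the new bits */
	fake_reg = val;			/* write */
}

/* Helper form: one call states which bits go away and which come in. */
static void reg_clear_set(uint64_t *reg, uint64_t clear, uint64_t set)
{
	uint64_t old = *reg;
	uint64_t updated = (old & ~clear) | set;

	if (updated != old)		/* skip the write when nothing changes */
		*reg = updated;
}

/* Returns 1 when both forms produce the same register value. */
int clear_set_demo(void)
{
	uint64_t a, b;

	fake_reg = 0xF0;
	rmw_open_coded(0x30, 0x01);
	a = fake_reg;

	fake_reg = 0xF0;
	reg_clear_set(&fake_reg, 0x30, 0x01);
	b = fake_reg;

	return a == 0xC1 && b == 0xC1;
}
```

Both forms compute `(old & ~clear) | set`; the helper simply keeps the mask and the new value together at the call site.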
2025-03-11KVM: arm64: Compute PMCEID from arm_pmu's event bitmapsOliver Upton0-0/+0
The PMUv3 driver populates a couple of bitmaps with the values of PMCEID{0,1}, from which the guest's PMCEID{0,1} can be derived. This is particularly convenient when virtualizing PMUv3 on IMP DEF hardware, as reading the nonexistent PMCEID registers leads to a rather unpleasant UNDEF. Tested-by: Janne Grunau <j@jannau.net> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250305202641.428114-4-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-03perf/dwc_pcie: Move common DWC struct definitions to 'pcie-dwc.h'Manivannan Sadhasivam1-22/+3
Move the common DWC struct definitions, which are shared across all the DesignWare PCIe IPs, to a new header file called 'pcie-dwc.h', so that other users e.g., debugfs, perf and sysfs can make use of them. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Shradha Todi <shradha.t@samsung.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Tested-by: Hrishikesh Deleep <hrishikesh.d@samsung.com> Link: https://lore.kernel.org/r/20250221131548.59616-2-shradha.t@samsung.com [kwilczynski: commit log, tidy up the new header file] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-01perf/dwc_pcie: fix duplicate pci_dev devicesYunhui Cui1-6/+12
During platform_device_register, wrongly using struct device pci_dev as platform_data caused a kmemdup copy of pci_dev. Worse still, accessing the duplicated device leads to list corruption as its mutex content (e.g., list, magic) remains the same as the original. Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Link: https://lore.kernel.org/r/20250220121716.50324-3-cuiyunhui@bytedance.com Signed-off-by: Will Deacon <will@kernel.org>
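The hazard described above, a byte-wise copy of an object that embeds self-referential state, can be reproduced in a few lines of user-space C. This is a sketch under assumptions, not the driver's code: `fake_dev`, `list_node`, the `magic` field and the lookup table are hypothetical stand-ins, and the lookup-by-identifier alternative is one possible way to avoid the copy rather than a description of the actual fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical device object with embedded self-referential state. */
struct fake_dev {
	uint32_t id;
	struct list_node wait_list;	/* empty list points at itself */
	void *magic;			/* points at the owning object */
};

static void fake_dev_init(struct fake_dev *d, uint32_t id)
{
	d->id = id;
	d->wait_list.next = &d->wait_list;
	d->wait_list.prev = &d->wait_list;
	d->magic = d;
}

static int list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

/* Alternative to copying: hand over an identifier and look the object up. */
#define MAX_DEVS 4
static struct fake_dev *dev_table[MAX_DEVS];

static int fake_dev_register(struct fake_dev *d)
{
	for (size_t i = 0; i < MAX_DEVS; i++) {
		if (!dev_table[i]) {
			dev_table[i] = d;
			return 0;
		}
	}
	return -1;
}

static struct fake_dev *fake_dev_lookup(uint32_t id)
{
	for (size_t i = 0; i < MAX_DEVS; i++) {
		if (dev_table[i] && dev_table[i]->id == id)
			return dev_table[i];
	}
	return NULL;
}

/* Returns 1 when the copy is shown to be corrupt and the lookup is not. */
int copy_hazard_demo(void)
{
	struct fake_dev orig, copy;

	fake_dev_init(&orig, 7);
	memcpy(&copy, &orig, sizeof(copy));	/* what a blind duplicate does */

	if (fake_dev_register(&orig))
		return 0;

	return list_is_empty(&orig.wait_list) &&
	       !list_is_empty(&copy.wait_list) &&	/* links still point into orig */
	       copy.magic == &orig &&			/* magic names the wrong owner */
	       fake_dev_lookup(7) == &orig;		/* lookup yields the real object */
}
```

The copy's list head and `magic` still point into the original, which is the kind of stale internal state the commit message blames for the list corruption.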
2025-03-01perf/dwc_pcie: fix some unreleased resourcesYunhui Cui1-11/+22
Release leaked resources, such as plat_dev and dev_info. Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Link: https://lore.kernel.org/r/20250220121716.50324-2-cuiyunhui@bytedance.com Signed-off-by: Will Deacon <will@kernel.org>
2025-03-01perf/arm-cmn: Minor event type housekeepingRobin Murphy1-2/+3
While handling RN-D nodes under the functionally-identical RN-I type works fine for perf tool users using the "rnid_" event aliases, and that is the documented and expected ABI, there's little reason not to be permissive and accept the actual RN-D type as an additional encoding for the same events as well. This may be convenient for other tooling generating event configs directly from its own topology data. In the RN-I event mood, it also seems as good a time as any to clean up a forgotten macro for CCLA_RNI events which ended up being unnecessary. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/ef46a47fc4ab909093f14b2b4289a4835836ab6c.1738851844.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-03-01perf: apple_m1: Don't disable counter in m1_pmu_enable_event()Rob Herring (Arm)1-4/+0
Currently m1_pmu_enable_event() starts by disabling the event counter it has been asked to enable. This should not be necessary as the counter (and the PMU as a whole) should not be active when m1_pmu_enable_event() is called. Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250218-arm-brbe-v19-v20-6-4e9922fc2e8e@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2025-03-01perf: arm_v7_pmu: Don't disable counter in ↵Rob Herring (Arm)1-6/+0
(armv7|krait_|scorpion_)pmu_enable_event() Currently (armv7|krait_|scorpion_)pmu_enable_event() start by disabling the event counter it has been asked to enable. This should not be necessary as the counter (and the PMU as a whole) should not be active when *_enable_event() is called. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250218-arm-brbe-v19-v20-5-4e9922fc2e8e@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2025-03-01perf: arm_v7_pmu: Drop obvious comments for enabling/disabling counters and ↵Rob Herring (Arm)1-44/+0
interrupts The function calls for enabling/disabling counters and interrupts are pretty obvious as to what they are doing, and the comments don't add any additional value. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250218-arm-brbe-v19-v20-4-4e9922fc2e8e@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2025-03-01perf: arm_pmuv3: Don't disable counter in armv8pmu_enable_event()Mark Rutland1-5/+0
Currently armv8pmu_enable_event() starts by disabling the event counter it has been asked to enable. This should not be necessary as the counter (and the PMU as a whole) should not be active when armv8pmu_enable_event() is called. Remove the redundant call to armv8pmu_disable_event_counter(). At the same time, remove the comment immediately above as everything it says is obvious from the function names below. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250218-arm-brbe-v19-v20-3-4e9922fc2e8e@kernel.org Signed-off-by: Will Deacon <will@kernel.org>