summaryrefslogtreecommitdiff
path: root/drivers/perf
AgeCommit message (Collapse)AuthorFilesLines
2024-04-19perf/thunderx2: Assign parents for event_source devicesJonathan Cameron1-0/+1
Currently all these devices appear directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parents to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Cc: Robert Richter <rric@kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-12-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19perf/xgene: Assign parents for event_source devicesJonathan Cameron1-0/+1
Currently all these devices appear directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parents to be the hardware related struct device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Cc: Khuong Dinh <khuong@os.amperecomputing.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-10-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19perf/arm_cspmu: Assign parents for event_source devicesJonathan Cameron1-0/+1
Currently all these devices appear directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parents to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-8-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19perf/amlogic: Assign parents for event_source devicesJonathan Cameron1-0/+1
Currently all these devices appear directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parents to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Jiucheng Xu <jiucheng.xu@amlogic.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-7-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19perf/hisi-hns3: Assign parents for event_source deviceJonathan Cameron1-0/+1
Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the PCI device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-6-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19perf/hisi-uncore: Assign parents for event_source devicesJonathan Cameron1-0/+1
Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the platform device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-4-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-19perf/hisi-pcie: Assign parent for event_source deviceJonathan Cameron1-0/+1
Currently the PMU device appears directly under /sys/devices/ Only root busses should appear there, so instead assign the pmu->dev parent to be the PCI device. Link: https://lore.kernel.org/linux-cxl/ZCLI9A40PJsyqAmq@kroah.com/ Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240412161057.14099-2-Jonathan.Cameron@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-10perf/arm-cmn: Set PMU device parentRobin Murphy1-0/+1
Now that perf supports giving the PMU device a parent, we can use our platform device to make the relationship between CMN instances and PMU IDs trivially discoverable, from either nominal direction: root@crazy-taxi:~# ls /sys/devices/platform/ARMHC600:00 | grep cmn arm_cmn_0 root@crazy-taxi:~# realpath /sys/bus/event_source/devices/arm_cmn_0/.. /sys/devices/platform/ARMHC600:00 Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/25d4428df1ddad966c74a3ed60171cd3ca6c8b66.1712682917.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09perf/thunderx2: Avoid placing cpumask on the stackDawei Li1-7/+3
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_and_but() to avoid the need for a temporary cpumask on the stack. Suggested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Link: https://lore.kernel.org/r/20240403155950.2068109-11-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09perf/qcom_l2: Avoid placing cpumask on the stackDawei Li1-5/+3
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_and_but() to avoid the need for a temporary cpumask on the stack. Suggested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Link: https://lore.kernel.org/r/20240403155950.2068109-10-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09perf/hisi_uncore: Avoid placing cpumask on the stackDawei Li1-4/+2
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_and_but() to avoid the need for a temporary cpumask on the stack. Suggested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240403155950.2068109-9-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09perf/hisi_pcie: Avoid placing cpumask on the stackDawei Li1-5/+4
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_and_but() to avoid the need for a temporary cpumask on the stack. Suggested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240403155950.2068109-8-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09perf/dwc_pcie: Avoid placing cpumask on the stackDawei Li1-6/+4
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_and_but() to avoid the need for a temporary cpumask on the stack. Suggested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Link: https://lore.kernel.org/r/20240403155950.2068109-7-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09perf/arm_dsu: Avoid placing cpumask on the stackDawei Li1-13/+6
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_and_but() to avoid the need for a temporary cpumask on the stack. Suggested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Link: https://lore.kernel.org/r/20240403155950.2068109-6-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09perf/arm_cspmu: Avoid placing cpumask on the stackDawei Li1-5/+3
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_and_but() to avoid the need for a temporary cpumask on the stack. Suggested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Link: https://lore.kernel.org/r/20240403155950.2068109-5-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09perf/arm-cmn: Avoid placing cpumask on the stackDawei Li1-5/+5
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_and_but() to avoid the need for a temporary cpumask on the stack. Suggested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Link: https://lore.kernel.org/r/20240403155950.2068109-4-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09perf/alibaba_uncore_drw: Avoid placing cpumask on the stackDawei Li1-7/+3
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Use cpumask_any_and_but() to avoid the need for a temporary cpumask on the stack. Suggested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Link: https://lore.kernel.org/r/20240403155950.2068109-3-dawei.li@shingroup.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09drivers/perf: thunderx2_pmu: Replace open coded acpi_match_acpi_device()Andy Shevchenko1-12/+7
Replace open coded acpi_match_acpi_device() in get_tx2_pmu_type(). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240404170016.2466898-1-andriy.shevchenko@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-09drivers: perf: Remove the now superfluous sentinel elements from ctl_table arrayJoel Granados1-1/+0
This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/) Remove sentinel from sbi_pmu_sysctl_table Signed-off-by: Joel Granados <j.granados@samsung.com> Link: https://lore.kernel.org/r/20240328-jag-sysctl_remset_misc-v1-7-47c1463b3af2@samsung.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-27drivers/perf: riscv: Disable PERF_SAMPLE_BRANCH_* while not supportedPu Lehui1-0/+4
RISC-V perf driver does not yet support branch sampling. Although the specification is in the works [0], it is best to disable such events until support is available, otherwise we will get unexpected results. Due to this reason, two riscv bpf testcases get_branch_snapshot and perf_branches/perf_branches_hw fail. Link: https://github.com/riscv/riscv-control-transfer-records [0] Fixes: f5bfa23f576f ("RISC-V: Add a perf core library for pmu drivers") Signed-off-by: Pu Lehui <pulehui@huawei.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20240312012053.1178140-1-pulehui@huaweicloud.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-22Merge tag 'riscv-for-linus-6.9-mw2' of ↵Linus Torvalds2-5/+46
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for various vector-accelerated crypto routines - Hibernation is now enabled for portable kernel builds - mmap_rnd_bits_max is larger on systems with larger VAs - Support for fast GUP - Support for membarrier-based instruction cache synchronization - Support for the Andes hart-level interrupt controller and PMU - Some cleanups around unaligned access speed probing and Kconfig settings - Support for ACPI LPI and CPPC - Various cleanus related to barriers - A handful of fixes * tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (66 commits) riscv: Fix syscall wrapper for >word-size arguments crypto: riscv - add vector crypto accelerated AES-CBC-CTS crypto: riscv - parallelize AES-CBC decryption riscv: Only flush the mm icache when setting an exec pte riscv: Use kcalloc() instead of kzalloc() riscv/barrier: Add missing space after ',' riscv/barrier: Consolidate fence definitions riscv/barrier: Define RISCV_FULL_BARRIER riscv/barrier: Define __{mb,rmb,wmb} RISC-V: defconfig: Enable CONFIG_ACPI_CPPC_CPUFREQ cpufreq: Move CPPC configs to common Kconfig and add RISC-V ACPI: RISC-V: Add CPPC driver ACPI: Enable ACPI_PROCESSOR for RISC-V ACPI: RISC-V: Add LPI driver cpuidle: RISC-V: Move few functions to arch/riscv riscv: Introduce set_compat_task() in asm/compat.h riscv: Introduce is_compat_thread() into compat.h riscv: add compile-time test into is_compat_task() riscv: Replace direct thread flag check with is_compat_task() riscv: Improve arch_get_mmap_end() macro ...
2024-03-22Merge tag 'arm64-fixes' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Re-instate the CPUMASK_OFFSTACK option for arm64 when NR_CPUS > 256. The bug that led to the initial revert was the cpufreq-dt code not using zalloc_cpumask_var(). - Make the STARFIVE_STARLINK_PMU config option depend on 64BIT to prevent compile-test failures on 32-bit architectures due to missing writeq(). * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: perf: starfive: fix 64-bit only COMPILE_TEST condition ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512
2024-03-19perf: starfive: fix 64-bit only COMPILE_TEST conditionConor Dooley1-1/+2
ARCH_STARFIVE is not restricted to 64-bit platforms, so while Will's addition of a 64-bit only condition satisfied the build robots doing COMPILE_TEST builds, Palmer ran into the same problems with writeq() being undefined during regular rv32 builds. Promote the dependency on 64-bit to its own `depends on` so that the driver can never be included in 32-bit builds. Reported-by: Palmer Dabbelt <palmer@rivosinc.com> Fixes: c2b24812f7bc ("perf: starfive: Add StarLink PMU support") Fixes: f0dbc6d0de38 ("perf: starfive: Only allow COMPILE_TEST for 64-bit architectures") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Acked-by: Ji Sheng Teoh <jisheng.teoh@starfivetech.com> Acked-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Link: https://lore.kernel.org/r/20240318-emphatic-rally-f177a4fe1bdc@spud Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-03-15Merge tag 'arm64-upstream' of ↵Linus Torvalds30-208/+885
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "The major features are support for LPA2 (52-bit VA/PA with 4K and 16K pages), the dpISA extension and Rust enabled on arm64. The changes are mostly contained within the usual arch/arm64/, drivers/perf, the arm64 Documentation and kselftests. The exception is the Rust support which touches some generic build files. Summary: - Reorganise the arm64 kernel VA space and add support for LPA2 (at stage 1, KVM stage 2 was merged earlier) - 52-bit VA/PA address range with 4KB and 16KB pages - Enable Rust on arm64 - Support for the 2023 dpISA extensions (data processing ISA), host only - arm64 perf updates: - StarFive's StarLink (integrates one or more CPU cores with a shared L3 memory system) PMU support - Enable HiSilicon Erratum 162700402 quirk for HIP09 - Several updates for the HiSilicon PCIe PMU driver - Arm CoreSight PMU support - Convert all drivers under drivers/perf/ to use .remove_new() - Miscellaneous: - Don't enable workarounds for "rare" errata by default - Clean up the DAIF flags handling for EL0 returns (in preparation for NMI support) - Kselftest update for ptrace() - Update some of the sysreg field definitions - Slight improvement in the code generation for inline asm I/O accessors to permit offset addressing - kretprobes: acquire regs via a BRK exception (previously done via a trampoline handler) - SVE/SME cleanups, comment updates - Allow CALL_OPS+CC_OPTIMIZE_FOR_SIZE with clang (previously disabled due to gcc silently ignoring -falign-functions=N)" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (134 commits) Revert "mm: add arch hook to validate mmap() prot flags" Revert "arm64: mm: add support for WXN memory translation attribute" Revert "ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512" ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512 kselftest/arm64: Add 2023 DPISA hwcap test coverage kselftest/arm64: Add basic FPMR test kselftest/arm64: Handle FPMR context in generic signal frame parser arm64/hwcap: Define hwcaps for 2023 DPISA features arm64/ptrace: Expose FPMR via ptrace arm64/signal: Add FPMR signal handling arm64/fpsimd: Support FEAT_FPMR arm64/fpsimd: Enable host kernel access to FPMR arm64/cpufeature: Hook new identification registers up to cpufeature docs: perf: Fix build warning of hisi-pcie-pmu.rst perf: starfive: Only allow COMPILE_TEST for 64-bit architectures MAINTAINERS: Add entry for StarFive StarLink PMU docs: perf: Add description for StarFive's StarLink PMU dt-bindings: perf: starfive: Add JH8100 StarLink PMU perf: starfive: Add StarLink PMU support docs: perf: Update usage for target filter of hisi-pcie-pmu ...
2024-03-12perf: RISC-V: Introduce Andes PMU to support perf event samplingYu Chien Peter Lin2-3/+46
Assign riscv_pmu_irq_num the value of (256 + 18) for the custome PMU and add SSCOUNTOVF and SIP alternatives to ALT_SBI_PMU_OVERFLOW() and ALT_SBI_PMU_OVF_CLEAR_PENDING() macros, respectively. To make use of Andes PMU extension, "xandespmu" needs to be appended to the riscv,isa-extensions for each cpu node in device-tree, and make sure CONFIG_ANDES_CUSTOM_PMU is enabled. Signed-off-by: Yu Chien Peter Lin <peterlin@andestech.com> Reviewed-by: Charles Ci-Jyun Wu <dminus@andestech.com> Reviewed-by: Leo Yu-Chi Liang <ycliang@andestech.com> Co-developed-by: Locus Wei-Han Chen <locus84@andestech.com> Signed-off-by: Locus Wei-Han Chen <locus84@andestech.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20240222083946.3977135-8-peterlin@andestech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-12perf: RISC-V: Eliminate redundant interrupt enable/disable operationsYu Chien Peter Lin1-2/+0
The interrupt enable/disable operations are already performed by the IRQ chip functions riscv_intc_irq_unmask()/riscv_intc_irq_mask() during enable_percpu_irq()/disable_percpu_irq(). It can be done only once. Signed-off-by: Yu Chien Peter Lin <peterlin@andestech.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240222083946.3977135-7-peterlin@andestech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-03-12Merge tag 'irq-msi-2024-03-10' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull MSI updates from Thomas Gleixner: "Updates for the MSI interrupt subsystem and initial RISC-V MSI support. The core changes have been adopted from previous work which converted ARM[64] to the new per device MSI domain model, which was merged to support multiple MSI domain per device. The ARM[64] changes are being worked on too, but have not been ready yet. The core and platform-MSI changes have been split out to not hold up RISC-V and to avoid that RISC-V builds on the scheduled for removal interfaces. The core support provides new interfaces to handle wire to MSI bridges in a straight forward way and introduces new platform-MSI interfaces which are built on top of the per device MSI domain model. Once ARM[64] is converted over the old platform-MSI interfaces and the related ugliness in the MSI core code will be removed. The actual MSI parts for RISC-V were finalized late and have been post-poned for the next merge window. Drivers: - Add a new driver for the Andes hart-level interrupt controller - Rework the SiFive PLIC driver to prepare for MSI suport - Expand the RISC-V INTC driver to support the new RISC-V AIA controller which provides the basis for MSI on RISC-V - A few fixup for the fallout of the core changes" * tag 'irq-msi-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits) irqchip/riscv-intc: Fix low-level interrupt handler setup for AIA x86/apic/msi: Use DOMAIN_BUS_GENERIC_MSI for HPET/IO-APIC domain search genirq/matrix: Dynamic bitmap allocation irqchip/riscv-intc: Add support for RISC-V AIA irqchip/sifive-plic: Improve locking safety by using irqsave/irqrestore irqchip/sifive-plic: Parse number of interrupts and contexts early in plic_probe() irqchip/sifive-plic: Cleanup PLIC contexts upon irqdomain creation failure irqchip/sifive-plic: Use riscv_get_intc_hwnode() to get parent fwnode irqchip/sifive-plic: Use devm_xyz() for managed allocation irqchip/sifive-plic: Use dev_xyz() in-place of pr_xyz() irqchip/sifive-plic: Convert PLIC driver into a platform driver irqchip/riscv-intc: Introduce Andes hart-level interrupt controller irqchip/riscv-intc: Allow large non-standard interrupt number genirq/irqdomain: Don't call ops->select for DOMAIN_BUS_ANY tokens irqchip/imx-intmux: Handle pure domain searches correctly genirq/msi: Provide MSI_FLAG_PARENT_PM_DEV genirq/irqdomain: Reroute device MSI create_mapping genirq/msi: Provide allocation/free functions for "wired" MSI interrupts genirq/msi: Optionally use dev->fwnode for device domain genirq/msi: Provide DOMAIN_BUS_WIRED_TO_MSI ...
2024-03-05perf: starfive: Only allow COMPILE_TEST for 64-bit architecturesWill Deacon1-1/+1
The kbuild robot exploded while wasting its time building the Starfive PMU driver for the 32-bit PA-RISC and Hexagon architectures. Adjust the Kconfig dependencies so that COMPILE_TEST is only applicable for 64-bit architectures (which implement writeq()). Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04perf: starfive: Add StarLink PMU supportJi Sheng Teoh3-0/+652
This patch adds support for StarFive's StarLink PMU (Performance Monitor Unit). StarLink PMU integrates one or more CPU cores with a shared L3 memory system. The PMU supports overflow interrupt, up to 16 programmable 64bit event counters, and an independent 64bit cycle counter. StarLink PMU is accessed via MMIO. Example Perf stat output: [root@user]# perf stat -a -e /starfive_starlink_pmu/cycles/ \ -e /starfive_starlink_pmu/read_miss/ \ -e /starfive_starlink_pmu/read_hit/ \ -e /starfive_starlink_pmu/release_request/ \ -e /starfive_starlink_pmu/write_hit/ \ -e /starfive_starlink_pmu/write_miss/ \ -e /starfive_starlink_pmu/write_request/ \ -e /starfive_starlink_pmu/writeback/ \ -e /starfive_starlink_pmu/read_request/ \ -- openssl speed rsa2048 Doing 2048 bits private rsa's for 10s: 5 2048 bits private RSA's in 2.84s Doing 2048 bits public rsa's for 10s: 169 2048 bits public RSA's in 2.42s version: 3.0.11 built on: Tue Sep 19 13:02:31 2023 UTC options: bn(64,64) CPUINFO: N/A sign verify sign/s verify/s rsa 2048 bits 0.568000s 0.014320s 1.8 69.8 ///////// Performance counter stats for 'system wide': 649991998 starfive_starlink_pmu/cycles/ 1009690 starfive_starlink_pmu/read_miss/ 1079750 starfive_starlink_pmu/read_hit/ 2089405 starfive_starlink_pmu/release_request/ 129 starfive_starlink_pmu/write_hit/ 70 starfive_starlink_pmu/write_miss/ 194 starfive_starlink_pmu/write_request/ 150080 starfive_starlink_pmu/writeback/ 2089423 starfive_starlink_pmu/read_request/ 27.062755678 seconds time elapsed Signed-off-by: Ji Sheng Teoh <jisheng.teoh@starfivetech.com> Link: https://lore.kernel.org/r/20240229072720.3987876-2-jisheng.teoh@starfivetech.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Merge find_related_event() and get_event_idx()Junhao He1-32/+19
The function xxx_find_related_event() scan all working events to find related events. During this process, we also can find the idle counters. If not found related events, return the first idle counter to simplify the code. Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-8-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Relax the check on related eventsJunhao He1-6/+2
If we use two events with the same filter and related event type (see the following example), the driver check whether they are related events and are in the same group, otherwise the function hisi_pcie_pmu_find_related_event() return -EINVAL, then the 2nd event cannot count but the 1st event is running, although the PCIe PMU has other idle counters. In this case, The perf event scheduler will make the two events to multiplex a counter, if the user use the formula (1st event_value / 2nd event_value) to calculate the bandwidth, he/she won't get the correct value, because they are not counting at the same period. This patch tries to fix this by making the related events to use different idle counters if they are not in the same event group. And finally, I'm going to say. The related events are best used in the same group [1]. There are two ways to know if they are related events. a) By event name, such as the latency events "xxx_latency, xxx_cnt" or bandwidth events "xxx_flux, xxx_time". b) By event type, such as "event=0xXXXX, event=0x1XXXX". Use group to count the related events: [1] -e "{pmu_name/xxx_latency,port=1/,pmu_name/xxx_cnt,port=1/}" example: 1st event: hisi_pcie0_core1/event=0x804,port=1 2nd event: hisi_pcie0_core1/event=0x10804,port=1 test cmd: perf stat -e hisi_pcie0_core1/event=0x804,port=1/ \ -e hisi_pcie0_core1/event=0x10804,port=1/ before patch: 25,281 hisi_pcie0_core1/event=0x804,port=1/ (49.91%) 470,598 hisi_pcie0_core1/event=0x10804,port=1/ (50.09%) after patch: 24,147 hisi_pcie0_core1/event=0x804,port=1/ 474,558 hisi_pcie0_core1/event=0x10804,port=1/ Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com> Link: https://lore.kernel.org/r/20240223103359.18669-7-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Check the target filter properlyJunhao He1-4/+4
The PMU can monitor traffic of certain target Root Port or downstream target Endpoint. User can specify the target filter by the "port" or "bdf" option respectively. The PMU can only monitor the Root Port or Endpoint on the same PCIe core so the value of "port" or "bdf" should be valid and will be checked by the driver. Currently at least and only one of "port" and "bdf" option must be set. If "port" filter is not set or is set explicitly to zero (default), driver will regard the user specifies a "bdf" option since "port" option is a bitmask of the target Root Ports and zero is not a valid value. If user not explicitly set "port" or "bdf" filter, the driver uses "bdf" default value (zero) to set target filter, but driver will skip the check of bdf=0, although it's a valid value (meaning 0000:000:00.0). Then the user just gets zero. Therefore, we need to check if both "port" and "bdf" are invalid, then return failure and report warning. Testing: before the patch: 0 hisi_pcie0_core1/rx_mrd_flux/ 0 hisi_pcie0_core1/rx_mrd_flux,port=0/ 24,124 hisi_pcie0_core1/rx_mrd_flux,port=1/ 0 hisi_pcie0_core1/rx_mrd_flux,bdf=0/ 0 hisi_pcie0_core1/rx_mrd_flux,port=0x800/ <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/ 24,132 hisi_pcie0_core1/rx_mrd_flux,bdf=0x1700/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x0/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1/ 24,138 hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1700/ 24,126 hisi_pcie0_core1/rx_mrd_flux,port=0x1,bdf=0x0/ after the patch: <not supported> hisi_pcie0_core1/rx_mrd_flux/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0/ 24,153 hisi_pcie0_core1/rx_mrd_flux,port=1/ 0 hisi_pcie0_core1/rx_mrd_flux,port=0x800/ <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=0/ <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/ 24,117 hisi_pcie0_core1/rx_mrd_flux,bdf=0x1700/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x0/ <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1/ 24,120 hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1700/ 24,123 hisi_pcie0_core1/rx_mrd_flux,port=0x1,bdf=0x0/ Signed-off-by: Junhao He <hejunhao3@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-6-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Add more events for counting TLP bandwidthYicong Yang1-0/+8
A typical PCIe transaction is consisted of various TLP packets in both direction. For counting bandwidth only memory read events are exported currently. Add memory write and completion counting events of both direction to complete the bandwidth counting. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-5-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Fix incorrect counting under metric modeYicong Yang1-1/+7
The metric counting shows incorrect results if the events in the metric group using the same event but different filter options. This is because we only judge the event code to decide whether the event in the metric group should share the same hardware counter, but ignore the settings of the filter. For example, on a platform of 2 ports 0x1 and 0x2 but only port 0x1 has a downstream PCIe NVME device. The metric counting shows both ports have the same counts because we misassign these two events to one same hardware counter: [root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}' Performance counter stats for 'system wide': 7907484924 hisi_pcie0_core1/event=0x0104,port=0x2/ 7907484924 hisi_pcie0_core1/event=0x0104,port=0x1/ 10.153863691 seconds time elapsed Fix this by using the whole config rather than the event only to judge whether two events are the same and should share the same hardware counter. With this patch, the metric counting in the above case tends to be corrected: [root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}' Performance counter stats for 'system wide': 0 hisi_pcie0_core1/event=0x0104,port=0x2/ 8123122077 hisi_pcie0_core1/event=0x0104,port=0x1/ 10.152875631 seconds time elapsed Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-4-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Introduce hisi_pcie_pmu_get_event_ctrl_val()Yicong Yang1-3/+10
Factor out retrieving of the register value for the corresponding event from hisi_pcie_config_event_ctrl() into a new function hisi_pcie_pmu_get_event_ctrl_val() allowing future reuse. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-3-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi_pcie: Rename hisi_pcie_pmu_{config,clear}_filter()Yicong Yang1-4/+4
hisi_pcie_pmu_{config,clear}_filter() are config/clear HISI_PCIE_EVENT_CTRL register which contains not only the filter but also the event code. The function names are bit misleading. Rename it to hisi_pcie_pmu_{config,clear}_event_ctrl() to reflects their functions more accurately. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240223103359.18669-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04drivers/perf: hisi: Enable HiSilicon Erratum 162700402 quirk for HIP09Junhao He1-1/+41
HiSilicon UC PMU v2 suffers the erratum 162700402 that the PMU counter cannot be set due to the lack of clock under power saving mode. This will lead to error or inaccurate counts. The clock can be enabled by the PMU global enabling control. This patch tries to fix this by set the UC PMU enable before set event period to turn on the clock, and then restore the UC PMU configuration. The counter register can hold its value without a clock. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240227125231.53127-1-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-03-01Merge tag 'riscv-for-linus-6.8-rc7' of ↵Linus Torvalds3-18/+18
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - detect ".option arch" support on not-yet-released LLVM builds - fix missing TLB flush when modifying non-leaf PTEs - fixes for T-Head custom extensions - fix for systems with the legacy PMU, that manifests as a crash on kernels built without SBI PMU support - fix for systems that clear *envcfg on suspend, which manifests as cbo.zero trapping after resume - fixes for Svnapot systems, including removing Svnapot support for huge vmalloc/vmap regions * tag 'riscv-for-linus-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Sparse-Memory/vmemmap out-of-bounds fix riscv: Fix pte_leaf_size() for NAPOT Revert "riscv: mm: support Svnapot in huge vmap" riscv: Save/restore envcfg CSR during CPU suspend riscv: Add a custom ISA extension for the [ms]envcfg CSR riscv: Fix enabling cbo.zero when running in M-mode perf: RISCV: Fix panic on pmu overflow handler MAINTAINERS: Update SiFive driver maintainers drivers: perf: ctr_get_width function for legacy is not defined drivers: perf: added capabilities for legacy PMU RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs riscv: Fix build error if !CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION riscv: mm: fix NOCACHE_THEAD does not set bit[61] correctly riscv: add CALLER_ADDRx support RISC-V: Drop invalid test from CONFIG_AS_HAS_OPTION_ARCH kbuild: Add -Wa,--fatal-warnings to as-instr invocation riscv: tlb: fix __p*d_free_tlb()
2024-02-29perf: RISCV: Fix panic on pmu overflow handlerFei Wu1-4/+4
(1 << idx) of int is not desired when setting bits in unsigned long overflowed_ctrs, use BIT() instead. This panic happens when running 'perf record -e branches' on sophgo sg2042. [ 273.311852] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000098 [ 273.320851] Oops [#1] [ 273.323179] Modules linked in: [ 273.326303] CPU: 0 PID: 1475 Comm: perf Not tainted 6.6.0-rc3+ #9 [ 273.332521] Hardware name: Sophgo Mango (DT) [ 273.336878] epc : riscv_pmu_ctr_get_width_mask+0x8/0x62 [ 273.342291] ra : pmu_sbi_ovf_handler+0x2e0/0x34e [ 273.347091] epc : ffffffff80aecd98 ra : ffffffff80aee056 sp : fffffff6e36928b0 [ 273.354454] gp : ffffffff821f82d0 tp : ffffffd90c353200 t0 : 0000002ade4f9978 [ 273.361815] t1 : 0000000000504d55 t2 : ffffffff8016cd8c s0 : fffffff6e3692a70 [ 273.369180] s1 : 0000000000000020 a0 : 0000000000000000 a1 : 00001a8e81800000 [ 273.376540] a2 : 0000003c00070198 a3 : 0000003c00db75a4 a4 : 0000000000000015 [ 273.383901] a5 : ffffffd7ff8804b0 a6 : 0000000000000015 a7 : 000000000000002a [ 273.391327] s2 : 000000000000ffff s3 : 0000000000000000 s4 : ffffffd7ff8803b0 [ 273.398773] s5 : 0000000000504d55 s6 : ffffffd905069800 s7 : ffffffff821fe210 [ 273.406139] s8 : 000000007fffffff s9 : ffffffd7ff8803b0 s10: ffffffd903f29098 [ 273.413660] s11: 0000000080000000 t3 : 0000000000000003 t4 : ffffffff8017a0ca [ 273.421022] t5 : ffffffff8023cfc2 t6 : ffffffd9040780e8 [ 273.426437] status: 0000000200000100 badaddr: 0000000000000098 cause: 000000000000000d [ 273.434512] [<ffffffff80aecd98>] riscv_pmu_ctr_get_width_mask+0x8/0x62 [ 273.441169] [<ffffffff80076bd8>] handle_percpu_devid_irq+0x98/0x1ee [ 273.447562] [<ffffffff80071158>] generic_handle_domain_irq+0x28/0x36 [ 273.454151] [<ffffffff8047a99a>] riscv_intc_irq+0x36/0x4e [ 273.459659] [<ffffffff80c944de>] handle_riscv_irq+0x4a/0x74 [ 273.465442] [<ffffffff80c94c48>] do_irq+0x62/0x92 [ 273.470360] Code: 0420 60a2 6402 5529 0141 8082 0013 0000 0013 0000 (6d5c) b783 [ 273.477921] ---[ end trace 0000000000000000 ]--- [ 273.482630] Kernel panic - not syncing: Fatal exception in interrupt Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Fei Wu <fei2.wu@intel.com> Link: https://lore.kernel.org/r/20240228115425.2613856-1-fei2.wu@intel.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-27drivers: perf: ctr_get_width function for legacy is not definedVadim Shakirov2-14/+12
With parameters CONFIG_RISCV_PMU_LEGACY=y and CONFIG_RISCV_PMU_SBI=n linux kernel crashes when you try perf record: $ perf record ls [ 46.749286] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 46.750199] Oops [#1] [ 46.750342] Modules linked in: [ 46.750608] CPU: 0 PID: 107 Comm: perf-exec Not tainted 6.6.0 #2 [ 46.750906] Hardware name: riscv-virtio,qemu (DT) [ 46.751184] epc : 0x0 [ 46.751430] ra : arch_perf_update_userpage+0x54/0x13e [ 46.751680] epc : 0000000000000000 ra : ffffffff8072ee52 sp : ff2000000022b8f0 [ 46.751958] gp : ffffffff81505988 tp : ff6000000290d400 t0 : ff2000000022b9c0 [ 46.752229] t1 : 0000000000000001 t2 : 0000000000000003 s0 : ff2000000022b930 [ 46.752451] s1 : ff600000028fb000 a0 : 0000000000000000 a1 : ff600000028fb000 [ 46.752673] a2 : 0000000ae2751268 a3 : 00000000004fb708 a4 : 0000000000000004 [ 46.752895] a5 : 0000000000000000 a6 : 000000000017ffe3 a7 : 00000000000000d2 [ 46.753117] s2 : ff600000028fb000 s3 : 0000000ae2751268 s4 : 0000000000000000 [ 46.753338] s5 : ffffffff8153e290 s6 : ff600000863b9000 s7 : ff60000002961078 [ 46.753562] s8 : ff60000002961048 s9 : ff60000002961058 s10: 0000000000000001 [ 46.753783] s11: 0000000000000018 t3 : ffffffffffffffff t4 : ffffffffffffffff [ 46.754005] t5 : ff6000000292270c t6 : ff2000000022bb30 [ 46.754179] status: 0000000200000100 badaddr: 0000000000000000 cause: 000000000000000c [ 46.754653] Code: Unable to access instruction at 0xffffffffffffffec. [ 46.754939] ---[ end trace 0000000000000000 ]--- [ 46.755131] note: perf-exec[107] exited with irqs disabled [ 46.755546] note: perf-exec[107] exited with preempt_count 4 This happens because in the legacy case the ctr_get_width function was not defined, but it is used in arch_perf_update_userpage. Also remove extra check in riscv_pmu_ctr_get_width_mask Signed-off-by: Vadim Shakirov <vadim.shakirov@syntacore.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Fixes: cc4c07c89aad ("drivers: perf: Implement perf event mmap support in the SBI backend") Link: https://lore.kernel.org/r/20240227170002.188671-3-vadim.shakirov@syntacore.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-27drivers: perf: added capabilities for legacy PMUVadim Shakirov1-0/+2
Added the PERF_PMU_CAP_NO_INTERRUPT flag because the legacy pmu driver does not provide sampling capabilities Added the PERF_PMU_CAP_NO_EXCLUDE flag because the legacy pmu driver does not provide the ability to disable counter incrementation in different privilege modes Suggested-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Vadim Shakirov <vadim.shakirov@syntacore.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Fixes: 9b3e150e310e ("RISC-V: Add a simple platform driver for RISC-V legacy perf") Link: https://lore.kernel.org/r/20240227170002.188671-2-vadim.shakirov@syntacore.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-20perf: CXL: fix CPMU filter value mask lengthHojin Nam1-5/+5
CPMU filter value is described as 4B length in CXL r3.0 8.2.7.2.2. However, it is used as 2B length in code and comments. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Hojin Nam <hj96.nam@samsung.com> Link: https://lore.kernel.org/r/20240216014522.32321-1-hj96.nam@samsung.com Signed-off-by: Will Deacon <will@kernel.org>
2024-02-15irqchip: Convert all platform MSI users to the new APIThomas Gleixner1-2/+2
Switch all the users of the platform MSI domain over to invoke the new interfaces which branch to the original platform MSI functions when the irqdomain associated to the caller device does not yet provide MSI parent functionality. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Anup Patel <apatel@ventanamicro.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240127161753.114685-7-apatel@ventanamicro.com
2024-02-09perf/arm_cspmu: Add devicetree supportRobin Murphy1-12/+55
Hook up devicetree probing support. For now let's hope that people implement PMIIDR properly and we don't need an override property or match data mechanism. Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Besar Wicaksono <bwicaksono@nvidia.com> Tested-by: Besar Wicaksono <bwicaksono@nvidia.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/836722034302ff62f2df56aaeb0036e71945a5d1.1706718007.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-02-09perf/arm_cspmu: Simplify counter resetRobin Murphy1-6/+1
arm_cspmu_reset_counters() inherently also stops them since it is writing 0 to PMCR.E, so there should be no need to do that twice. Also tidy up the reset routine itself for consistency with the start and stop routines, and to be clear at first glance that it is simply writing a constant value. Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/3105815327989f6bb7bb068994d0eb4096b4ef64.1706718007.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-02-09perf/arm_cspmu: Simplify attribute groupsRobin Murphy2-17/+10
The attribute group array itself is always the same, so there's no need to allocate it separately. Storing it directly in our instance data saves memory and gives us one less point of failure. Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/cf12b803114b0815438833fcb2495f20f2007761.1706718007.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-02-09perf/arm_cspmu: Simplify initialisationRobin Murphy2-39/+22
It's far simpler for implementations to literally override whichever default ops they want to, by initialising to the default ops first. This saves all the bother of checking what the impl_init_ops call has or hasn't touched. Make the same clear distinction for the PMIIDR override as well, in case we gain more sources for overriding that in future. Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/dd39718ee4890fd46a8e443c25303e87ae23f422.1706718007.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-02-09perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count)Ilkka Koskinen1-0/+11
AmpereOneX mesh implementation has a bug in HN-P nodes that makes them report incorrect child count. The failing crosspoints report 8 children while they only have two. When the driver tries to access the inexistent child nodes, it believes it has reached an invalid node type and probing fails. The workaround is to ignore those incorrect child nodes and continue normally. Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> [ rm: rewrote simpler generalised version ] Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/ce4b1442135fe03d0de41859b04b268c88c854a3.1707498577.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-02-09perf: CXL: fix mismatched cpmu event opcodeHojin Nam1-1/+1
S2M NDR BI-ConflictAck opcode is described as 4 in the CXL r3.0 3.3.9 Table 3.43. However, it is defined as 3 in macro definition. Fixes: 5d7107c72796 ("perf: CXL Performance Monitoring Unit driver") Signed-off-by: Hojin Nam <hj96.nam@samsung.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240208013415epcms2p2904187c8a863f4d0d2adc980fb91a2dc@epcms2p2 Signed-off-by: Will Deacon <will@kernel.org>
2024-02-09perf/arm-cmn: Improve debugfs pretty-printing for large configsRobin Murphy1-4/+5
The debugfs pretty-printer was written for the CMN-600 assumptions of a maximum 8x8 mesh, but CMN-700 now allows coordinates and ID values up to 12 and 128 respectively, which can overflow the format strings, mess up the alignment of the table and hurt overall readability. This table does prove useful for double-checking that the driver is picking up the topology of new systems correctly and for verifying user expectations, so tweak the formatting to stay nice and readable with wider values. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/1d1517eadd1bac5992fab679c9dc531b381944da.1702484646.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>