Mostafa reports that commit d232606773a0 ("arm64/sysreg: refactor
deprecated strncpy") breaks our early command-line parsing because the
original code is working on space-delimited substrings rather than
NUL-terminated strings.
Rather than simply reverting the broken conversion patch, replace the
strscpy() with a simple memcpy() with an explicit NUL-termination of the
result.
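A minimal sketch of the resulting shape (variable and buffer names hypothetical):

    char buf[64];
    size_t len = min(strcspn(cmdline, " "), sizeof(buf) - 1);

    memcpy(buf, cmdline, len);
    buf[len] = '\0';	/* explicit NUL-termination of the result */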
Reported-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Mostafa Saleh <smostafa@google.com>
Fixes: d232606773a0 ("arm64/sysreg: refactor deprecated strncpy")
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20230905-strncpy-arch-arm64-v4-1-bc4b14ddfaef@google.com
Link: https://lore.kernel.org/r/20230831162227.2307863-1-smostafa@google.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The number of Count Units field is described as 6 bits long
in the CXL 3.0 specification. However, its mask value was
only declared as 5 bits long.
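Illustratively, the fix widens the field mask by one bit (macro name approximate):

    -#define CXL_PMU_CAP_NUM_COUNTERS_MSK	GENMASK_ULL(4, 0)
    +#define CXL_PMU_CAP_NUM_COUNTERS_MSK	GENMASK_ULL(5, 0)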
Signed-off-by: Jeongtae Park <jtp.park@samsung.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20230905123309.775854-1-jtp.park@samsung.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
* for-next/selftests: (22 commits)
kselftest/arm64: Fix hwcaps selftest build
kselftest/arm64: add jscvt feature to hwcap test
kselftest/arm64: add pmull feature to hwcap test
kselftest/arm64: add AES feature check to hwcap test
kselftest/arm64: add SHA1 and related features to hwcap test
kselftest/arm64: build BTI tests in output directory
kselftest/arm64: fix a memleak in zt_regs_run()
kselftest/arm64: Size sycall-abi buffers for the actual maximum VL
kselftest/arm64: add lse and lse2 features to hwcap test
kselftest/arm64: add test item that support to capturing the SIGBUS signal
kselftest/arm64: add DEF_SIGHANDLER_FUNC() and DEF_INST_RAISE_SIG() helpers
kselftest/arm64: add crc32 feature to hwcap test
kselftest/arm64: add float-point feature to hwcap test
kselftest/arm64: Use the tools/include compiler.h rather than our own
kselftest/arm64: Use shared OPTIMZER_HIDE_VAR() definiton
kselftest/arm64: Make the tools/include headers available
tools include: Add some common function attributes
tools compiler.h: Add OPTIMIZER_HIDE_VAR()
kselftest/arm64: Exit streaming mode after collecting signal context
kselftest/arm64: add RCpc load-acquire to hwcap test
...
|
|
* for-next/perf:
drivers/perf: hisi: Update HiSilicon PMU maintainers
arm_pmu: acpi: Add a representative platform device for TRBE
arm_pmu: acpi: Refactor arm_spe_acpi_register_device()
hw_breakpoint: fix single-stepping when using bpf_overflow_handler
perf/imx_ddr: don't enable counter0 if none of 4 counters are used
perf/imx_ddr: speed up overflow frequency of cycle
drivers/perf: hisi: Schedule perf session according to locality
perf/arm-dmc620: Fix dmc620_pmu_irqs_lock/cpu_hotplug_lock circular lock dependency
perf/smmuv3: Add MODULE_ALIAS for module auto loading
perf/smmuv3: Enable HiSilicon Erratum 162001900 quirk for HIP08/09
perf: pmuv3: Remove comments from armv8pmu_[enable|disable]_event()
perf/arm-cmn: Add CMN-700 r3 support
perf/arm-cmn: Refactor HN-F event selector macros
perf/arm-cmn: Remove spurious event aliases
drivers/perf: Explicitly include correct DT includes
perf: pmuv3: Add Cortex A520, A715, A720, X3 and X4 PMUs
dt-bindings: arm: pmu: Add Cortex A520, A715, A720, X3, and X4
perf/smmuv3: Remove build dependency on ACPI
perf: xgene_pmu: Convert to devm_platform_ioremap_resource()
driver/perf: Add identifier sysfs file for Yitian 710 DDR
|
|
* for-next/mm:
arm64: fix build warning for ARM64_MEMSTART_SHIFT
arm64: Remove unsued extern declaration init_mem_pgprot()
arm64/mm: Set only the PTE_DIRTY bit while preserving the HW dirty state
arm64/mm: Add pte_rdonly() helper
arm64/mm: Directly use ID_AA64MMFR2_EL1_VARange_MASK
arm64/mm: Replace an open coding with ID_AA64MMFR1_EL1_HAFDBS_MASK
|
|
* for-next/misc:
arm64/sysreg: refactor deprecated strncpy
arm64: sysreg: Generate C compiler warnings on {read,write}_sysreg_s arguments
arm64: sdei: abort running SDEI handlers during crash
arm64: Explicitly include correct DT includes
arm64/Kconfig: Sort the RCpc feature under the ARMv8.3 features menu
arm64: vdso: remove two .altinstructions related symbols
arm64/ptrace: Clean up error handling path in sve_set_common()
|
|
* for-next/errata:
arm64: errata: Group all Cortex-A510 errata together
|
|
* for-next/entry:
arm64: syscall: unmask DAIF earlier for SVCs
|
|
* for-next/docs:
Documentation: arm64: Correct SME ZA macros name
|
|
* for-next/cpufeature:
arm64/fpsimd: Only provide the length to cpufeature for xCR registers
selftests/arm64: add HWCAP2_HBC test
arm64: add HWCAP for FEAT_HBC (hinted conditional branches)
arm64/cpufeature: Use ARM64_CPUID_FIELD() to match EVT
|
|
Since Guangbin and Shaokun have left HiSilicon and will no longer
maintain the drivers, update the maintainer information and thank them
for their work.
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20230824024135.1291459-1-shaojijie@huawei.com
[will: left the HNS3 title as-is to avoid the churn of resorting the entries]
Signed-off-by: Will Deacon <will@kernel.org>
|
|
ACPI TRBE does not have a HID for identification, which would otherwise
create and add a platform device to the platform bus. Without a platform
device, TRBE cannot be probed and bound to a platform driver.
Create a dummy platform device for TRBE after ascertaining that ACPI
provides the required interrupts uniformly across all CPUs on the system.
This device is created inside drivers/perf/arm_pmu_acpi.c to accommodate
TRBE being built as a module.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20230817055405.249630-3-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Sanity-checking all the GICC tables for the same interrupt number, thereby
ensuring a homogeneous ACPI-based machine, could be useful for other
platform devices as well. Hence, refactor arm_spe_acpi_register_device()
into a common helper, arm_acpi_register_pmu_device().
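One plausible shape for the common helper (signature approximate):

    static int arm_acpi_register_pmu_device(struct platform_device *pdev, u8 len,
					    u16 (*parse_gsi)(struct acpi_madt_generic_interrupt *));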
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Co-developed-by: Will Deacon <will@kernel.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20230817055405.249630-2-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The hwcaps selftest currently relies on the assembler being able to
assemble the crc32w instruction, but this is not in the base v8.0
architecture and so is not accepted by the standard GCC configurations
used by many distributions. Switch to manually encoding the instruction
to fix the build.
Fixes: 09d2e95a04ad ("kselftest/arm64: add crc32 feature to hwcap test")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230816-arm64-fix-crc32-build-v1-1-40165c1290f2@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Arm platforms use is_default_overflow_handler() to determine if the
hw_breakpoint code should single-step over the breakpoint trigger or
let the custom handler deal with it.
Since bpf_overflow_handler() currently isn't recognized as a default
handler, attaching a BPF program to a PERF_TYPE_BREAKPOINT event causes
it to keep firing (the instruction triggering the data abort exception
is never skipped). For example:
# bpftrace -e 'watchpoint:0x10000:4:w { print("hit") }' -c ./test
Attaching 1 probe...
hit
hit
[...]
^C
(./test performs a single 4-byte store to 0x10000)
This patch replaces the check with uses_default_overflow_handler(),
which accounts for the bpf_overflow_handler() case by also testing
if one of the perf_event_output functions gets invoked indirectly,
via orig_default_handler.
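A simplified sketch of the helper's logic (the in-tree version also covers
the !CONFIG_BPF_SYSCALL case):

    static inline bool uses_default_overflow_handler(struct perf_event *event)
    {
	    if (likely(is_default_overflow_handler(event)))
		    return true;

	    /* A BPF program saves the original handler here. */
	    return __is_default_overflow_handler(event->orig_overflow_handler);
    }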
Signed-off-by: Tomislav Novak <tnovak@meta.com>
Tested-by: Samuel Gosselin <sgosselin@google.com> # arm64
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/linux-arm-kernel/20220923203644.2731604-1-tnovak@fb.com/
Link: https://lore.kernel.org/r/20230605191923.1219974-1-tnovak@meta.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
`strncpy` is deprecated for use on NUL-terminated destination strings
[1], which seems to be the case here given the forceful setting of
`buf`'s tail to 0.
A suitable replacement is `strscpy` [2] due to the fact that it
guarantees NUL-termination on its destination buffer argument which is
_not_ the case for `strncpy`!
In this case, we can simplify the logic and also check for any silent
truncation by using `strscpy`'s return value.
This should have no functional change and yet uses a more robust and
less ambiguous interface whilst reducing code complexity.
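A minimal sketch of the truncation check that strscpy() enables (names
hypothetical):

    ssize_t ret = strscpy(buf, src, sizeof(buf));

    if (ret == -E2BIG)
	    pr_warn("value truncated\n");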
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Suggested-by: Kees Cook <keescook@chromium.org>
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20230811-strncpy-arch-arm64-v2-1-ba84eabffadb@google.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add the jscvt feature check to the set of hwcap tests.
Since assembling the jscvt instruction requires a compiler configured
for v8.3 or above, the instruction is hand-encoded here instead.
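Following the existing hwcap-test pattern, a sketch of the hand encoding:

    static void jscvt_sigill(void)
    {
	    /* FJCVTZS W0, D0 */
	    asm volatile(".inst 0x1e7e0000" : : : "x0");
    }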
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230815040915.3966955-5-zengheng4@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add the pmull feature check to the set of hwcap tests.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230815040915.3966955-4-zengheng4@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add the AES feature check to the set of hwcap tests.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230815040915.3966955-3-zengheng4@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add the SHA1 and related feature checks to the set of hwcap tests.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230815040915.3966955-2-zengheng4@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Evaluate the register before the asm section so that the C compiler
generates warnings when there is an issue with the register argument.
This will prevent possible future issues such as the one seen here [1]
where a missing bracket caused the shift and addition operators to be
evaluated in the wrong order, but no warning was emitted. The GNU
assembler has no warning for when expressions evaluate differently to C
due to different operator precedence, but the C compiler has some
warnings that may suggest something is wrong. For example in this case
the following warning would have been emitted:
error: operator '>>' has lower precedence than '+'; '+' will be evaluated first [-Werror,-Wshift-op-parentheses]
There are currently no existing warnings that need to be fixed.
[1]: https://lore.kernel.org/linux-perf-users/20230728162011.GA22050@willie-the-truck/
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20230815140639.614769-1-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The arm64 BTI selftests are currently built in the source directory,
then the generated binaries are copied to the output directory.
This leaves the object files around in a potentially otherwise pristine
source tree, tainting it for out-of-tree kernel builds.
Prepend $(OUTPUT) to every reference to an object file in the Makefile,
and remove the extra handling and copying. This puts all generated files
under the output directory.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230815145931.2522557-1-andre.przywara@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In the current driver, counter0 will be enabled after ddr_perf_pmu_enable()
is called, even when none of the 4 counters is in use. This causes counter0
to keep counting until ddr_perf_pmu_disable() is called. If the PMU is never
disabled, the PMU interrupt will be asserted from time to time as counter0
overflows and the irq handler clears it. This is not the expected behaviour.
Don't enable counter0 if none of the 4 counters is used.
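The shape of the guard, sketched with approximate field names:

    static void ddr_perf_pmu_enable(struct pmu *pmu)
    {
	    struct ddr_pmu *ddr_pmu = to_ddr_pmu(pmu);

	    /* Only (re)start the cycle counter if some event is in use. */
	    if (ddr_pmu->active_events > 0)
		    ddr_perf_counter_enable(ddr_pmu, EVENT_CYCLES_ID,
					    EVENT_CYCLES_COUNTER, true);
    }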
Fixes: 9a66d36cc7ac ("drivers/perf: imx_ddr: Add DDR performance counter support to perf")
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20230811015438.1999307-2-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
For i.MX8MP, we cannot ensure that the cycle counter overflows at least
4 times as often as the other events. Since the byte counters count for
any configured event, they overflow more often, and when a byte counter
overflows the related counters stop, because they share COUNTER_CNTL.
We can speed up the cycle counter's overflow frequency by setting the
counter parameter (CP) field of the cycle counter. This way we avoid the
byte counters stopping while no interrupt has arrived, and the byte
counters can be fetched or updated on each cycle-counter overflow
interrupt.
Because we initialise the CP field to shorten counter0's overflow time,
the cycle counter starts counting from a fixed base value each time; we
need to remove that base from the result as well, so that we get a
precise result from the cycle counter.
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20230811015438.1999307-1-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The PCIe PMUs are located on different NUMA nodes, but currently we don't
take this into account and will likely stack all the sessions on the same CPU:
[root@localhost tmp]# cat /sys/devices/hisi_pcie*/cpumask
0
0
0
0
0
0
This can be optimized a bit by using a CPU local to the PMU, as sketched below.
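For instance (field name approximate):

    pcie_pmu->on_cpu = cpumask_local_spread(0, dev_to_node(&pdev->dev));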
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20230815131010.2147-1-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
If memcmp() does not return 0, "zeros" needs to be freed to prevent a memory leak.
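Approximately (identifiers abridged from the test):

    if (memcmp(zeros, (char *)zt + ZT_SIG_REGS_OFFSET,
	       ZT_SIG_REGS_SIZE(zt->nregs)) != 0) {
	    fprintf(stderr, "ZT data invalid\n");
	    free(zeros);	/* previously leaked on this path */
	    return 1;
    }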
Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com>
Link: https://lore.kernel.org/r/20230815074915.245528-1-dingxiang@cmss.chinamobile.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The following circular locking dependency was reported when running
cpus online/offline test on an arm64 system.
[   84.195923] Chain exists of:
                 dmc620_pmu_irqs_lock --> cpu_hotplug_lock --> cpuhp_state-down

[   84.207305]  Possible unsafe locking scenario:

[   84.213212]        CPU0                    CPU1
[   84.217729]        ----                    ----
[   84.222247]   lock(cpuhp_state-down);
[   84.225899]                                lock(cpu_hotplug_lock);
[   84.232068]                                lock(cpuhp_state-down);
[   84.238237]   lock(dmc620_pmu_irqs_lock);
[   84.242236]
                *** DEADLOCK ***
The following locking order happens when dmc620_pmu_get_irq() calls
cpuhp_state_add_instance_nocalls().
lock(dmc620_pmu_irqs_lock) --> lock(cpu_hotplug_lock)
On the other hand, the calling sequence
cpuhp_thread_fun()
=> cpuhp_invoke_callback()
=> dmc620_pmu_cpu_teardown()
leads to the locking sequence
lock(cpuhp_state-down) => lock(dmc620_pmu_irqs_lock)
Here dmc620_pmu_irqs_lock protects both the dmc620_pmu_irqs and the
pmus_node lists in various dmc620_pmu instances. dmc620_pmu_get_irq()
requires protected access to dmc620_pmu_irqs whereas
dmc620_pmu_cpu_teardown() needs protection to the pmus_node lists.
Break this circular locking dependency by using two separate locks to
protect dmc620_pmu_irqs list and the pmus_node lists respectively.
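The split, sketched:

    static DEFINE_MUTEX(dmc620_pmu_irqs_lock);	/* protects dmc620_pmu_irqs */
    static DEFINE_MUTEX(dmc620_pmu_node_lock);	/* protects the pmus_node lists */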
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/r/20230812235549.494174-1-longman@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
On my ACPI-based arm64 server, if the SMMUv3 PMU is configured as a
module it won't be loaded automatically after booting, even if the
device has already been scanned and added. This is because the module
lacks a platform alias; the uevent mechanism and userspace tools like
udevd rely on this to find the target driver module for the device.
Add the missing platform alias to the module, so that the module is
loaded automatically if the device exists.
Before this patch:
[root@localhost tmp]# modinfo arm_smmuv3_pmu | grep alias
alias: of:N*T*Carm,smmu-v3-pmcgC*
alias: of:N*T*Carm,smmu-v3-pmcg
After this patch:
[root@localhost tmp]# modinfo arm_smmuv3_pmu | grep alias
alias: platform:arm-smmu-v3-pmcg
alias: of:N*T*Carm,smmu-v3-pmcgC*
alias: of:N*T*Carm,smmu-v3-pmcg
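The change itself boils down to the one-line alias addition:

    MODULE_ALIAS("platform:arm-smmu-v3-pmcg");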
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20230814131642.65263-1-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Some HiSilicon SMMU PMCGs suffer from erratum 162001900, whereby the PMU
disable control sometimes fails to disable the counters. This leads to
erroneous or inaccurate data, since the counters may still be counting
the event used in the last perf session when we next enable them.
Fix this by hardening the global disable process: before disabling the
PMU, write an invalid event type (0xffff) to forcibly stop the counters,
and correspondingly restore each event on pmu::pmu_enable().
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20230814124012.58013-1-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Our ABI opts to provide future proofing by defining a much larger
SVE_VQ_MAX than the architecture actually supports. Since we use this
define to control the size of our vector data buffers, this results in a
lot of overhead when we initialise, which can be a very noticeable
problem in emulation: we fill buffers that are orders of magnitude larger
than we will ever actually use, even on virtual platforms that provide
the full range of architecturally supported vector lengths.
Define and use the actual architecture maximum to mitigate this.
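Illustratively (second macro name hypothetical): the ABI ceiling is 512
quadwords, while the architecture currently caps vectors at 2048 bits,
i.e. a VQ of 16:

    #define SVE_VQ_MAX	512	/* ABI future-proofing (<asm/sigcontext.h>) */
    #define ARCH_SVE_VQ_MAX	16	/* 2048-bit architectural max VL / 128 */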
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230810-arm64-syscall-abi-perf-v1-1-6a0d7656359c@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add the LSE and related feature checks to the set of hwcap tests.
As stated in the Arm manual, the LSE2 feature allows atomic access
to unaligned memory. Therefore, for processors that only have the LSE
feature, we register .sigbus_fn to test their ability to perform
unaligned access.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230808134036.668954-6-zengheng4@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Some enhanced features do not result in SIGILL when only the base
feature is present: for example, if LSE2 is missing but LSE is present,
an unaligned atomic access generates a SIGBUS exception instead.
Therefore, add a test item to cover this type of feature.
Note that testing for SIGBUS only makes sense after making sure that
the instruction does not cause a SIGILL signal.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230808134036.668954-5-zengheng4@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add the DEF_SIGHANDLER_FUNC() and DEF_INST_RAISE_SIG() helper macros.
Furthermore, there is no need to modify the default SIGILL handler for
the entire testing lifecycle in the main() function; it is reasonable to
narrow its scope to the context of the sig_fn function only.
This is a pre-patch for the subsequent SIGBUS handler patch.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230808134036.668954-4-zengheng4@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add the CRC32 feature check to the set of hwcap tests.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230808134036.668954-3-zengheng4@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add the FP feature check to the set of hwcap tests.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230808134036.668954-2-zengheng4@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
For a number of historical reasons, when handling SVCs we don't unmask
DAIF in el0_svc() or el0_svc_compat(), and instead do so later in
el0_svc_common(). This is unfortunate and makes it harder to make
changes to the DAIF management in entry-common.c as we'd like to do as
cleanup and preparation for FEAT_NMI support. We can move the DAIF
unmasking to entry-common.c as long as we also hoist the
fp_user_discard() logic, as reasoned below.
We converted the syscall trace logic from assembly to C in commit:
f37099b6992a0b81 ("arm64: convert syscall trace logic to C")
... which was intended to have no functional change, and mirrored the
existing assembly logic to avoid the risk of any functional regression.
With the logic in C, it's clear that there is currently no reason to
unmask DAIF so late within el0_svc_common():
* The thread flags are read prior to unmasking DAIF, but are not
consumed until after DAIF is unmasked, and we don't perform a
read-modify-write sequence of the thread flags for which we might need
to serialize against an IPI modifying the flags. Similarly, for any
thread flags set by other threads, whether DAIF is masked or not has
no impact.
The read_thread_flags() helper performs a single-copy-atomic read of
the flags, and so this can safely be moved after unmasking DAIF.
* The pt_regs::orig_x0 and pt_regs::syscallno fields are neither
consumed nor modified by the handler for any DAIF exception (e.g.
these do not exist in the `perf_event_arm_regs` enum and are not
sampled by perf in its IRQ handler).
Thus, the manipulation of pt_regs::orig_x0 and pt_regs::syscallno can
safely be moved after unmasking DAIF.
Given the above, we can safely hoist unmasking of DAIF out of
el0_svc_common(), and into its immediate callers: do_el0_svc() and
do_el0_svc_compat(). Further:
* In do_el0_svc(), we sample the syscall number from
pt_regs::regs[8]. This is not modified by the handler for any DAIF
exception, and thus can safely be moved after unmasking DAIF.
As fp_user_discard() operates on the live FP/SVE/SME register state,
this needs to occur before we clear DAIF.IF, as interrupts could
result in preemption which would cause this state to become foreign.
As fp_user_discard() is the first function called within do_el0_svc(),
it has no dependency on other parts of do_el0_svc() and can be moved
earlier so long as it is called prior to unmasking DAIF.IF.
* In do_el0_svc_compat(), we sample the syscall number from
pt_regs::regs[7]. This is not modified by the handler for any DAIF
exception, and thus can safely be moved after unmasking DAIF.
Compat threads cannot use SVE or SME, so there's no need for
el0_svc_compat() to call fp_user_discard().
Given the above, we can safely hoist the unmasking of DAIF out of
do_el0_svc() and do_el0_svc_compat(), and into their immediate callers:
el0_svc() and el0_svc_compat(), so long as we also hoist
fp_user_discard() into el0_svc().
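A simplified sketch of the resulting entry path (erratum workarounds
elided):

    static void noinstr el0_svc(struct pt_regs *regs)
    {
	    enter_from_user_mode(regs);
	    fp_user_discard();			/* live FP state: before unmasking */
	    local_daif_restore(DAIF_PROCCTX);	/* unmask DAIF here, not in el0_svc_common() */
	    do_el0_svc(regs);
	    exit_to_user_mode(regs);
    }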
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230808101148.1064172-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
For both SVE and SME we abuse the generic register field comparison
support in the cpufeature code as part of our detection of unsupported
variations in the vector lengths available to PEs, reporting the maximum
vector lengths via ZCR_EL1.LEN and SMCR_EL1.LEN. Since these are
configuration registers rather than identification registers the
assumptions the cpufeature code makes about how unknown bitfields behave
are invalid, leading to warnings when SME features like FA64 are enabled
and we hotplug a CPU:
CPU features: SANITY CHECK: Unexpected variation in SYS_SMCR_EL1. Boot CPU: 0x0000000000000f, CPU3: 0x0000008000000f
CPU features: Unsupported CPU feature variation detected.
SVE has no controls other than the vector length so is not yet impacted
but the same issue will apply there if any are defined.
Since the only field we are interested in having the cpufeature code
handle is the length field, and we use a custom read function to obtain
the value, we can avoid these warnings by filtering out all other bits
when we return the register value. In that case we don't need to bother
reading the register at all, and can simply use the RDVL/RDSVL value we
were filling in instead.
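A sketch of the idea for SVE (helper names as used elsewhere in
fpsimd.c):

    u64 read_zcr_features(void)
    {
	    /* Report only the LEN field, derived from the probed VL. */
	    return sve_vq_from_vl(sve_get_vl()) - 1;
    }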
Fixes: 2e0f2478ea37 ("arm64/sve: Probe SVE capabilities and usable vector lengths")
Fixes: b42990d3bf77 ("arm64/sme: Identify supported SME vector lengths at boot")
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20230731-arm64-sme-fa64-hotplug-v2-1-7714c00dd902@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The BTI test program started life as a standalone program outside the
kselftest suite, so it provided its own compiler.h. Now that we have
updated the tools/include compiler.h to have all the definitions we are
using, and the arm64 selftests pull in tools/include, let's drop our
custom version.
__unreachable() is named unreachable() there, requiring an update in the
code.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230728-arm64-signal-memcpy-fix-v4-6-0c1290db5d46@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
We had open coded the definition of OPTIMIZER_HIDE_VAR() as a fix, but
now that the generic tools/include is available and carries a definition
of OPTIMIZER_HIDE_VAR(), we can switch to that define.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230728-arm64-signal-memcpy-fix-v4-5-0c1290db5d46@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Make the generic tools/include headers available to the arm64 selftests so
we can reduce some duplication.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230728-arm64-signal-memcpy-fix-v4-4-0c1290db5d46@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
We don't have definitions of __always_unused or __noreturn in the tools
version of compiler.h; add them so we can use them in kselftests.
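The additions take the usual guarded form:

    #ifndef __always_unused
    #define __always_unused	__attribute__((__unused__))
    #endif

    #ifndef __noreturn
    #define __noreturn	__attribute__((__noreturn__))
    #endif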
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230728-arm64-signal-memcpy-fix-v4-3-0c1290db5d46@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Port over the definition of OPTIMIZER_HIDE_VAR() so we can use it in
kselftests.
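The ported definition mirrors the kernel's:

    #ifndef OPTIMIZER_HIDE_VAR
    /* Make the optimizer believe the variable can be manipulated arbitrarily. */
    #define OPTIMIZER_HIDE_VAR(var)				\
	    __asm__ ("" : "=r" (var) : "0" (var))
    #endif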
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230728-arm64-signal-memcpy-fix-v4-2-0c1290db5d46@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
When we collect a signal context with one of the SME modes enabled, we
will have enabled that mode behind the compiler and libc's back, so they
may issue some instructions not valid in streaming mode, causing spurious
failures.
For the code prior to issuing the BRK to trigger signal handling, we need
to stay in streaming mode if we were already there, since that's part of
the signal context the caller is trying to collect. Unfortunately this
code includes a memset(), which is likely to be heavily optimised and to
use FP instructions incompatible with streaming mode. We can avoid this
by open coding the memset(), inserting a volatile assembly statement so
the compiler does not recognise the pattern and optimise it back into FP
code (see the sketch below). This code is not performance critical, so
the inefficiency should not be an issue.
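A sketch of the open-coded memset (function name hypothetical):

    static void sme_safe_memset(char *p, char v, size_t n)
    {
	    size_t i;

	    for (i = 0; i < n; i++) {
		    p[i] = v;
		    /* Defeat the compiler's memset() pattern-matching. */
		    __asm__ volatile("" : : "r"(p) : "memory");
	    }
    }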
After collecting the context we can simply exit streaming mode, avoiding
these issues. Use a full SMSTOP for safety to prevent any issues appearing
with ZA.
Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230728-arm64-signal-memcpy-fix-v4-1-0c1290db5d46@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Interrupts are blocked in SDEI context, per the SDEI spec: "The client
interrupts cannot preempt the event handler." If we crashed in the SDEI
handler-running context (as with ACPI's AGDI) then we need to clean up the
SDEI state before proceeding to the crash kernel so that the crash kernel
can have working interrupts.
Track the active SDEI handler per-cpu so that we can COMPLETE_AND_RESUME
the handler, discarding the interrupted context.
Fixes: f5df26961853 ("arm64: kernel: Add arch-specific SDEI entry code and CPU masking")
Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
Cc: stable@vger.kernel.org
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: Mihai Carabas <mihai.carabas@oracle.com>
Link: https://lore.kernel.org/r/20230627002939.2758-1-scott@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add the RCpc and related feature checks to the set of hwcap tests.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230803133905.971697-1-zengheng4@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add a test for the newly added HWCAP2_HBC.
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230804143746.3900803-3-joey.gouly@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add a HWCAP for FEAT_HBC, so that userspace can make a decision on using
this feature.
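A userspace probe would look roughly like this (HWCAP2_HBC bit value as
defined by this series):

    #include <stdio.h>
    #include <sys/auxv.h>

    #ifndef HWCAP2_HBC
    #define HWCAP2_HBC	(1UL << 44)
    #endif

    int main(void)
    {
	    if (getauxval(AT_HWCAP2) & HWCAP2_HBC)
		    printf("FEAT_HBC: hinted conditional branches supported\n");
	    return 0;
    }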
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230804143746.3900803-2-joey.gouly@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The comments in armv8pmu_[enable|disable]_event() are blindingly obvious
and do not make things any better. Let's drop them.
No functional change intended.
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20230802090853.1190391-1-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
When building with W=1, the following warning occurs.
arch/arm64/include/asm/kernel-pgtable.h:129:41: error: "PUD_SHIFT" is not defined, evaluates to 0 [-Werror=undef]
129 | #define ARM64_MEMSTART_SHIFT PUD_SHIFT
| ^~~~~~~~~
arch/arm64/include/asm/kernel-pgtable.h:142:5: note: in expansion of macro ‘ARM64_MEMSTART_SHIFT’
142 | #if ARM64_MEMSTART_SHIFT < SECTION_SIZE_BITS
| ^~~~~~~~~~~~~~~~~~~~
The generic PUD_SHIFT is defined in include/asm-generic/pgtable-nopud.h,
but the #ifndef __ASSEMBLY__ guard in that header makes it unavailable to
assembly files; whenever a .S file includes <asm/kernel-pgtable.h>, the
build warning occurs. Move the ARM64_MEMSTART_SHIFT and
ARM64_MEMSTART_ALIGN macros to arch/arm64/mm/init.c, the only place they
are used, to avoid this issue.
Signed-off-by: Zhang Jianhua <chris.zjh@huawei.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20230804075615.3334756-1-chris.zjh@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Remove unused 'of*.h' header inclusions from the arm64 arch code to
allow for the eventual untangling of 'of_device.h and 'of_platform.h',
which currently include each other.
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230714174021.4039807-1-robh@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|