summaryrefslogtreecommitdiff
path: root/arch/arm64/kernel
AgeCommit message (Collapse)AuthorFilesLines
2018-12-10arm64: mm: Offset TTBR1 to allow 52-bit PTRS_PER_PGDSteve Capper2-0/+2
Enabling 52-bit VAs on arm64 requires that the PGD table expands from 64 entries (for the 48-bit case) to 1024 entries. This quantity, PTRS_PER_PGD is used as follows to compute which PGD entry corresponds to a given virtual address, addr: pgd_index(addr) -> (addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1) Userspace addresses are prefixed by 0's, so for a 48-bit userspace address, uva, the following is true: (uva >> PGDIR_SHIFT) & (1024 - 1) == (uva >> PGDIR_SHIFT) & (64 - 1) In other words, a 48-bit userspace address will have the same pgd_index when using PTRS_PER_PGD = 64 and 1024. Kernel addresses are prefixed by 1's so, given a 48-bit kernel address, kva, we have the following inequality: (kva >> PGDIR_SHIFT) & (1024 - 1) != (kva >> PGDIR_SHIFT) & (64 - 1) In other words a 48-bit kernel virtual address will have a different pgd_index when using PTRS_PER_PGD = 64 and 1024. If, however, we note that: kva = 0xFFFF << 48 + lower (where lower[63:48] == 0b) and, PGDIR_SHIFT = 42 (as we are dealing with 64KB PAGE_SIZE) We can consider: (kva >> PGDIR_SHIFT) & (1024 - 1) - (kva >> PGDIR_SHIFT) & (64 - 1) = (0xFFFF << 6) & 0x3FF - (0xFFFF << 6) & 0x3F // "lower" cancels out = 0x3C0 In other words, one can switch PTRS_PER_PGD to the 52-bit value globally provided that they increment ttbr1_el1 by 0x3C0 * 8 = 0x1E00 bytes when running with 48-bit kernel VAs (TCR_EL1.T1SZ = 16). For kernel configuration where 52-bit userspace VAs are possible, this patch offsets ttbr1_el1 and sets PTRS_PER_PGD corresponding to the 52-bit value. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Steve Capper <steve.capper@arm.com> [will: added comment to TTBR1_BADDR_4852_OFFSET calculation] Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10arm64: ftrace: Set FTRACE_MAY_SLEEP before ftrace_modify_all_code()Steven Rostedt (VMware)1-0/+1
It has been reported that ftrace_replace_code() which is called by ftrace_modify_all_code() can cause a soft lockup warning for an allmodconfig kernel. This is because all the debug options enabled causes the loop in ftrace_replace_code() (which loops over all the functions being enabled where there can be 10s of thousands), is too slow, and never schedules out. To solve this, setting FTRACE_MAY_SLEEP to the command passed into ftrace_replace_code() will make it call cond_resched() in the loop, which prevents the soft lockup warning from triggering. Link: http://lkml.kernel.org/r/20181204192903.8193-1-anders.roxell@linaro.org Link: http://lkml.kernel.org/r/20181205183304.000714627@goodmis.org Acked-by: Will Deacon <will.deacon@arm.com> Reported-by: Anders Roxell <anders.roxell@linaro.org> Tested-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-12-10arm64: KVM: Force VHE for systems affected by erratum 1165522Marc Zyngier1-0/+8
In order to easily mitigate ARM erratum 1165522, we need to force affected CPUs to run in VHE mode if using KVM. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10arm64: remove arm64ksyms.cMark Rutland2-31/+1
Now that arm64ksyms.c has been reduced to a stub, let's remove it entirely. New exports should be associated with their function definition. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10arm64: frace: use asm EXPORT_SYMBOL()Mark Rutland2-5/+4
For a while now it's been possible to use EXPORT_SYMBOL() in assembly files, which allows us to place exports immediately after assembly functions, as we do for C functions. As a step towards removing arm64ksyms.c, let's move the ftrace exports to the assembly files the functions are defined in. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10arm64: string: use asm EXPORT_SYMBOL()Mark Rutland1-20/+0
For a while now it's been possible to use EXPORT_SYMBOL() in assembly files, which allows us to place exports immediately after assembly functions, as we do for C functions. As a step towards removing arm64ksyms.c, let's move the string routine exports to the assembly files the functions are defined in. Routines which should only be exported for !KASAN builds are exported using the EXPORT_SYMBOL_NOKASAN() helper. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10arm64: uaccess: use asm EXPORT_SYMBOL()Mark Rutland1-7/+0
For a while now it's been possible to use EXPORT_SYMBOL() in assembly files, which allows us to place exports immediately after assembly functions, as we do for C functions. As a step towards removing arm64ksyms.c, let's move the uaccess exports to the assembly files the functions are defined in. As we have to include <asm/assembler.h>, the existing includes are fixed to follow the usual ordering conventions. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10arm64: page: use asm EXPORT_SYMBOL()Mark Rutland1-3/+0
For a while now it's been possible to use EXPORT_SYMBOL() in assembly files, which allows us to place exports immediately after assembly functions, as we do for C functions. As a step towards removing arm64ksyms.c, let's move the copy_page and clear_page exports to the assembly files the functions are defined in. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10arm64: smccc: use asm EXPORT_SYMBOL()Mark Rutland2-5/+4
For a while now it's been possible to use EXPORT_SYMBOL() in assembly files, which allows us to place exports immediately after assembly functions, as we do for C functions. As a step towards removing arm64ksyms.c, let's move the SMCCC exports to the assembly file the functions are defined in. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10arm64: tishift: use asm EXPORT_SYMBOL()Mark Rutland1-7/+0
For a while now it's been possible to use EXPORT_SYMBOL() in assembly files, which allows us to place exports immediately after assembly functions, as we do for C functions. As a step towards removing arm64ksyms.c, let's move the tishift exports to the assembly file the functions are defined in. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10arm64: move memstart_addr export inlineMark Rutland1-3/+0
Since we define memstart_addr in a C file, we can have the export immediately after the definition of the symbol, as we do elsewhere. As a step towards removing arm64ksyms.c, move the export of memstart_addr to init.c, where the symbol is defined. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10arm64: remove bitop exportsMark Rutland1-8/+0
Now that the arm64 bitops are inlines built atop of the regular atomics, we don't need to export anything. Remove the redundant exports. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-09arm64: function_graph: Remove use of FTRACE_NOTRACE_DEPTHSteven Rostedt (VMware)1-3/+0
Functions in the set_graph_notrace no longer subtract FTRACE_NOTRACE_DEPTH from curr_ret_stack, as that is now implemented via the trace_recursion flags. Access to curr_ret_stack no longer needs to worry about checking for this. curr_ret_stack is still initialized to -1, when there's not a shadow stack allocated. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: linux-arm-kernel@lists.infradead.org Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-12-07arm64: hibernate: Avoid sending cross-calling with interrupts disabledWill Deacon1-1/+1
Since commit 3b8c9f1cdfc50 ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings"), a call to flush_icache_range() will use an IPI to cross-call other online CPUs so that any stale instructions are flushed from their pipelines. This triggers a WARN during the hibernation resume path, where flush_icache_range() is called with interrupts disabled and is therefore prone to deadlock: | Disabling non-boot CPUs ... | CPU1: shutdown | psci: CPU1 killed. | CPU2: shutdown | psci: CPU2 killed. | CPU3: shutdown | psci: CPU3 killed. | WARNING: CPU: 0 PID: 1 at ../kernel/smp.c:416 smp_call_function_many+0xd4/0x350 | Modules linked in: | CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc4 #1 Since all secondary CPUs have been taken offline prior to invalidating the I-cache, there's actually no need for an IPI and we can simply call __flush_icache_range() instead. Cc: <stable@vger.kernel.org> Fixes: 3b8c9f1cdfc50 ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings") Reported-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Tested-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Tested-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-12-07arm64: kexec_file: forbid kdump via kexec_file_load()James Morse1-0/+4
Now that kexec_walk_memblock() can do the crash-kernel placement itself architectures that don't support kdump via kexe_file_load() need to explicitly forbid it. We don't support this on arm64 until the kernel can add the elfcorehdr and usable-memory-range fields to the DT. Without these the crash-kernel overwrites the previous kernel's memory during startup. Add a check to refuse crash image loading. Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: entry: Remove confusing commentWill Deacon1-4/+0
The comment about SYS_MEMBARRIER_SYNC_CORE relying on ERET being context-synchronizing is confusing and misplaced with kpti. Given that this is already documented under Documentation/ (see arch-support.txt for membarrier), remove the comment altogether. Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: entry: Place an SB sequence following an ERET instructionWill Deacon1-0/+2
Some CPUs can speculate past an ERET instruction and potentially perform speculative accesses to memory before processing the exception return. Since the register state is often controlled by a lower privilege level at the point of an ERET, this could potentially be used as part of a side-channel attack. This patch emits an SB sequence after each ERET so that speculation is held up on exception return. Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: Add support for SB barrier and patch in over DSB; ISB sequencesWill Deacon2-0/+13
We currently use a DSB; ISB sequence to inhibit speculation in set_fs(). Whilst this works for current CPUs, future CPUs may implement a new SB barrier instruction which acts as an architected speculation barrier. On CPUs that support it, patch in an SB; NOP sequence over the DSB; ISB sequence and advertise the presence of the new instruction to userspace. Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: kexec_file: Refactor setup_dtb() to consolidate error checkingWill Deacon1-30/+34
setup_dtb() is a little difficult to read. This is largely because it duplicates the FDT -> Linux errno conversion for every intermediate return value, but also because of silly cosmetic things like naming and formatting. Given that this is all brand new, refactor the function to get us off on the right foot. Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: kexec_file: add kaslr supportAKASHI Takahiro1-1/+18
Adding "kaslr-seed" to dtb enables triggering kaslr, or kernel virtual address randomization, at secondary kernel boot. We always do this as it will have no harm on kaslr-incapable kernel. We don't have any "switch" to turn off this feature directly, but still can suppress it by passing "nokaslr" as a kernel boot argument. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> [will: Use rng_is_initialized()] Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: kexec_file: add kernel signature verification supportAKASHI Takahiro1-5/+18
With this patch, kernel verification can be done without IMA security subsystem enabled. Turn on CONFIG_KEXEC_VERIFY_SIG instead. On x86, a signature is embedded into a PE file (Microsoft's format) header of binary. Since arm64's "Image" can also be seen as a PE file as far as CONFIG_EFI is enabled, we adopt this format for kernel signing. You can create a signed kernel image with: $ sbsign --key ${KEY} --cert ${CERT} Image Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Reviewed-by: James Morse <james.morse@arm.com> [will: removed useless pr_debug()] Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: capabilities: Batch cpu_enable callbacksSuzuki K Poulose1-26/+44
We use a stop_machine call for each available capability to enable it on all the CPUs available at boot time. Instead we could batch the cpu_enable callbacks to a single stop_machine() call to save us some time. Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: capabilities: Use linear array for detection and verificationSuzuki K Poulose1-25/+17
Use the sorted list of capability entries for the detection and verification. Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: capabilities: Optimize this_cpu_has_capSuzuki K Poulose1-22/+9
Make use of the sorted capability list to access the capability entry in this_cpu_has_cap() to avoid iterating over the two tables. Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: capabilities: Speed up capability lookupSuzuki K Poulose1-2/+30
We maintain two separate tables of capabilities, errata and features, which decide the system capabilities. We iterate over each of these tables for various operations (e.g, detection, verification etc.). We do not have a way to map a system "capability" to its entry, (i.e, cap -> struct arm64_cpu_capabilities) which is needed for this_cpu_has_cap(). So we iterate over the table one by one to find the entry and then do the operation. Also, this prevents us from optimizing the way we "enable" the capabilities on the CPUs, where we now issue a stop_machine() for each available capability. One solution is to merge the two tables into a single table, sorted by the capability. But this is has the following disadvantages: - We loose the "classification" of an errata vs. feature - It is quite easy to make a mistake when adding an entry, unless we sort the table at runtime. So we maintain a list of pointers to the capability entry, sorted by the "cap number" in a separate array, initialized at boot time. The only restriction is that we can have one "entry" per capability. While at it, remove the duplicate declaration of arm64_errata table. Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: kexec_file: invoke the kernel without purgatoryAKASHI Takahiro3-7/+16
On arm64, purgatory would do almost nothing. So just invoke secondary kernel directly by jumping into its entry code. While, in this case, cpu_soft_restart() must be called with dtb address in the fifth argument, the behavior still stays compatible with kexec_load case as long as the argument is null. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: James Morse <james.morse@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: kexec_file: allow for loading Image-format kernelAKASHI Takahiro3-1/+115
This patch provides kexec_file_ops for "Image"-format kernel. In this implementation, a binary is always loaded with a fixed offset identified in text_offset field of its header. Regarding signature verification for trusted boot, this patch doesn't contains CONFIG_KEXEC_VERIFY_SIG support, which is to be added later in this series, but file-attribute-based verification is still a viable option by enabling IMA security subsystem. You can sign(label) a to-be-kexec'ed kernel image on target file system with: $ evmctl ima_sign --key /path/to/private_key.pem Image On live system, you must have IMA enforced with, at least, the following security policy: "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" See more details about IMA here: https://sourceforge.net/p/linux-ima/wiki/Home/ Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: kexec_file: load initrd and device-treeAKASHI Takahiro1-0/+185
load_other_segments() is expected to allocate and place all the necessary memory segments other than kernel, including initrd and device-tree blob (and elf core header for crash). While most of the code was borrowed from kexec-tools' counterpart, users may not be allowed to specify dtb explicitly, instead, the dtb presented by the original boot loader is reused. arch_kimage_kernel_post_load_cleanup() is responsible for freeing arm64- specific data allocated in load_other_segments(). Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: enable KEXEC_FILE configAKASHI Takahiro2-1/+18
Modify arm64/Kconfig to enable kexec_file_load support. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Acked-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: add image head flag definitionsAKASHI Takahiro2-9/+15
Those image head's flags will be used later by kexec_file loader. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Acked-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: capabilities: Merge duplicate entries for Qualcomm erratum 1003Suzuki K Poulose1-9/+16
Remove duplicate entries for Qualcomm erratum 1003. Since the entries are not purely based on generic MIDR checks, use the multi_cap_entry type to merge the entries. Cc: Christopher Covington <cov@codeaurora.org> Cc: Will Deacon <will.deacon@arm.com> Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: capabilities: Merge duplicate Cavium erratum entriesSuzuki K Poulose1-26/+24
Merge duplicate entries for a single capability using the midr range list for Cavium errata 30115 and 27456. Cc: Andrew Pinski <apinski@cavium.com> Cc: David Daney <david.daney@cavium.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06arm64: capabilities: Merge entries for ARM64_WORKAROUND_CLEAN_CACHESuzuki K Poulose1-12/+16
We have two entries for ARM64_WORKAROUND_CLEAN_CACHE capability : 1) ARM Errata 826319, 827319, 824069, 819472 on A53 r0p[012] 2) ARM Errata 819472 on A53 r0p[01] Both have the same work around. Merge these entries to avoid duplicate entries for a single capability. Add a new Kconfig entry to control the "capability" entry to make it easier to handle combinations of the CONFIGs. Cc: Will Deacon <will.deacon@arm.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-04arm64: relocatable: fix inconsistencies in linker script and optionsArd Biesheuvel1-4/+5
readelf complains about the section layout of vmlinux when building with CONFIG_RELOCATABLE=y (for KASLR): readelf: Warning: [21]: Link field (0) should index a symtab section. readelf: Warning: [21]: Info field (0) should index a relocatable section. Also, it seems that our use of '-pie -shared' is contradictory, and thus ambiguous. In general, the way KASLR is wired up at the moment is highly tailored to how ld.bfd happens to implement (and conflate) PIE executables and shared libraries, so given the current effort to support other toolchains, let's fix some of these issues as well. - Drop the -pie linker argument and just leave -shared. In ld.bfd, the differences between them are unclear (except for the ELF type of the produced image [0]) but lld chokes on seeing both at the same time. - Rename the .rela output section to .rela.dyn, as is customary for shared libraries and PIE executables, so that it is not misidentified by readelf as a static relocation section (producing the warnings above). - Pass the -z notext and -z norelro options to explicitly instruct the linker to permit text relocations, and to omit the RELRO program header (which requires a certain section layout that we don't adhere to in the kernel). These are the defaults for current versions of ld.bfd. - Discard .eh_frame and .gnu.hash sections to avoid them from being emitted between .head.text and .text, screwing up the section layout. These changes only affect the ELF image, and produce the same binary image. [0] b9dce7f1ba01 ("arm64: kernel: force ET_DYN ELF type for ...") Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Smith <peter.smith@linaro.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-01Merge tag 'arm64-fixes' of ↵Linus Torvalds1-3/+17
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Cortex-A76 erratum workaround - ftrace fix to enable syscall events on arm64 - Fix uninitialised pointer in iort_get_platform_device_domain() * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: ACPI/IORT: Fix iort_get_platform_device_domain() uninitialized pointer value arm64: ftrace: Fix to enable syscall events on arm64 arm64: Add workaround for Cortex-A76 erratum 1286807
2018-11-30Merge tag 'trace-v4.20-rc3' of ↵Linus Torvalds1-14/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "While rewriting the function graph tracer, I discovered a design flaw that was introduced by a patch that tried to fix one bug, but by doing so created another bug. As both bugs corrupt the output (but they do not crash the kernel), I decided to fix the design such that it could have both bugs fixed. The original fix, fixed time reporting of the function graph tracer when doing a max_depth of one. This was code that can test how much the kernel interferes with userspace. But in doing so, it could corrupt the time keeping of the function profiler. The issue is that the curr_ret_stack variable was being used for two different meanings. One was to keep track of the stack pointer on the ret_stack (shadow stack used by the function graph tracer), and the other use case was the graph call depth. Although, the two may be closely related, where they got updated was the issue that lead to the two different bugs that required the two use cases to be updated differently. The big issue with this fix is that it requires changing each architecture. The good news is, I was able to remove a lot of code that was duplicated within the architectures and place it into a single location. Then I could make the fix in one place. I pushed this code into linux-next to let it settle over a week, and before doing so, I cross compiled all the affected architectures to make sure that they built fine. In the mean time, I also pulled in a patch that fixes the sched_switch previous tasks state output, that was not actually correct" * tag 'trace-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: sched, trace: Fix prev_state output in sched_switch tracepoint function_graph: Have profiler use curr_ret_stack and not depth function_graph: Reverse the order of pushing the ret_stack and the callback function_graph: Move return callback before update of curr_ret_stack function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack function_graph: Make ftrace_push_return_trace() static sparc/function_graph: Simplify with function_graph_enter() sh/function_graph: Simplify with function_graph_enter() s390/function_graph: Simplify with function_graph_enter() riscv/function_graph: Simplify with function_graph_enter() powerpc/function_graph: Simplify with function_graph_enter() parisc: function_graph: Simplify with function_graph_enter() nds32: function_graph: Simplify with function_graph_enter() MIPS: function_graph: Simplify with function_graph_enter() microblaze: function_graph: Simplify with function_graph_enter() arm64: function_graph: Simplify with function_graph_enter() ARM: function_graph: Simplify with function_graph_enter() x86/function_graph: Simplify with function_graph_enter() function_graph: Create function_graph_enter() to consolidate architecture code
2018-11-30arm64: ftrace: always pass instrumented pc in x0Mark Rutland2-4/+4
The core ftrace hooks take the instrumented PC in x0, but for some reason arm64's prepare_ftrace_return() takes this in x1. For consistency, let's flip the argument order and always pass the instrumented PC in x0. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Torsten Duwe <duwe@suse.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-30arm64: ftrace: remove return_regs macrosMark Rutland1-20/+15
The save_return_regs and restore_return_regs macros are only used by return_to_handler, and having them defined out-of-line only serves to obscure the logic. Before we complicate, let's clean this up and fold the logic directly into return_to_handler, saving a few lines of macro boilerplate in the process. At the same time, a missing trailing space is added to the comments, fixing a code style violation. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Torsten Duwe <duwe@suse.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-30arm64: ftrace: don't adjust the LR valueMark Rutland1-1/+0
The core ftrace code requires that when it is handed the PC of an instrumented function, this PC is the address of the instrumented instruction. This is necessary so that the core ftrace code can identify the specific instrumentation site. Since the instrumented function will be a BL, the address of the instrumented function is LR - 4 at entry to the ftrace code. This fixup is applied in the mcount_get_pc and mcount_get_pc0 helpers, which acquire the PC of the instrumented function. The mcount_get_lr helper is used to acquire the LR of the instrumented function, whose value does not require this adjustment, and cannot be adjusted to anything meaningful. No adjustment of this value is made on other architectures, including arm. However, arm64 adjusts this value by 4. This patch brings arm64 in line with other architectures and removes the adjustment of the LR value. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Torsten Duwe <duwe@suse.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-30arm64: ftrace: enable graph FP testMark Rutland1-2/+1
The core frace code has an optional sanity check on the frame pointer passed by ftrace_graph_caller and return_to_handler. This is cheap, useful, and enabled unconditionally on x86, sparc, and riscv. Let's do the same on arm64, so that we can catch any problems early. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Torsten Duwe <duwe@suse.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-30arm64: ftrace: use GLOBAL()Mark Rutland1-4/+2
The global exports of ftrace_call and ftrace_graph_call are somewhat painful to read. Let's use the generic GLOBAL() macro to ameliorate matters. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Torsten Duwe <duwe@suse.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-30arm64: drop linker script hack to hide __efistub_ symbolsArd Biesheuvel1-28/+18
Commit 1212f7a16af4 ("scripts/kallsyms: filter arm64's __efistub_ symbols") updated the kallsyms code to filter out symbols with the __efistub_ prefix explicitly, so we no longer require the hack in our linker script to emit them as absolute symbols. Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-29arm64: Add workaround for Cortex-A76 erratum 1286807Catalin Marinas1-3/+17
On the affected Cortex-A76 cores (r0p0 to r3p0), if a virtual address for a cacheable mapping of a location is being accessed by a core while another core is remapping the virtual address to a new physical page using the recommended break-before-make sequence, then under very rare circumstances TLBI+DSB completes before a read using the translation being invalidated has been observed by other observers. The workaround repeats the TLBI+DSB operation and is shared with the Qualcomm Falkor erratum 1009 Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-11-28arm64: function_graph: Simplify with function_graph_enter()Steven Rostedt (VMware)1-14/+1
The function_graph_enter() function does the work of calling the function graph hook function and the management of the shadow stack, simplifying the work done in the architecture dependent prepare_ftrace_return(). Have arm64 use the new code, and remove the shadow stack management as well as having to set up the trace structure. This is needed to prepare for a fix of a design bug on how the curr_ret_stack is used. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: stable@kernel.org Fixes: 03274a3ffb449 ("tracing/fgraph: Adjust fgraph depth before calling trace return callback") Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-11-27arm64/module: switch to ADRP/ADD sequences for PLT entriesArd Biesheuvel3-25/+80
Now that we have switched to the small code model entirely, and reduced the extended KASLR range to 4 GB, we can be sure that the targets of relative branches that are out of range are in range for a ADRP/ADD pair, which is one instruction shorter than our current MOVN/MOVK/MOVK sequence, and is more idiomatic and so it is more likely to be implemented efficiently by micro-architectures. So switch over the ordinary PLT code and the special handling of the Cortex-A53 ADRP errata, as well as the ftrace trampline handling. Reviewed-by: Torsten Duwe <duwe@lst.de> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: Added a couple of comments in the plt equality check] Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-27arm64/insn: add support for emitting ADR/ADRP instructionsArd Biesheuvel1-0/+29
Add support for emitting ADR and ADRP instructions so we can switch over our PLT generation code in a subsequent patch. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-27arm64: Use a raw spinlock in __install_bp_hardening_cb()James Morse1-3/+3
__install_bp_hardening_cb() is called via stop_machine() as part of the cpu_enable callback. To force each CPU to take its turn when allocating slots, they take a spinlock. With the RT patches applied, the spinlock becomes a mutex, and we get warnings about sleeping while in stop_machine(): | [ 0.319176] CPU features: detected: RAS Extension Support | [ 0.319950] BUG: scheduling while atomic: migration/3/36/0x00000002 | [ 0.319955] Modules linked in: | [ 0.319958] Preemption disabled at: | [ 0.319969] [<ffff000008181ae4>] cpu_stopper_thread+0x7c/0x108 | [ 0.319973] CPU: 3 PID: 36 Comm: migration/3 Not tainted 4.19.1-rt3-00250-g330fc2c2a880 #2 | [ 0.319975] Hardware name: linux,dummy-virt (DT) | [ 0.319976] Call trace: | [ 0.319981] dump_backtrace+0x0/0x148 | [ 0.319983] show_stack+0x14/0x20 | [ 0.319987] dump_stack+0x80/0xa4 | [ 0.319989] __schedule_bug+0x94/0xb0 | [ 0.319991] __schedule+0x510/0x560 | [ 0.319992] schedule+0x38/0xe8 | [ 0.319994] rt_spin_lock_slowlock_locked+0xf0/0x278 | [ 0.319996] rt_spin_lock_slowlock+0x5c/0x90 | [ 0.319998] rt_spin_lock+0x54/0x58 | [ 0.320000] enable_smccc_arch_workaround_1+0xdc/0x260 | [ 0.320001] __enable_cpu_capability+0x10/0x20 | [ 0.320003] multi_cpu_stop+0x84/0x108 | [ 0.320004] cpu_stopper_thread+0x84/0x108 | [ 0.320008] smpboot_thread_fn+0x1e8/0x2b0 | [ 0.320009] kthread+0x124/0x128 | [ 0.320010] ret_from_fork+0x10/0x18 Switch this to a raw spinlock, as we know this is only called with IRQs masked. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-23arm64: cpufeature: Fix mismerge of CONFIG_ARM64_SSBD blockWill Deacon1-1/+1
When merging support for SSBD and the CRC32 instructions, the conflict resolution for the new capability entries in arm64_features[] inadvertedly predicated the availability of the CRC32 instructions on CONFIG_ARM64_SSBD, despite the functionality being entirely unrelated. Move the #ifdef CONFIG_ARM64_SSBD down so that it only covers the SSBD capability. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-11-21arm64: perf: set suppress_bind_attrs flag to trueAnders Roxell1-0/+1
The armv8_pmuv3 driver doesn't have a remove function, and when the test 'CONFIG_DEBUG_TEST_DRIVER_REMOVE=y' is enabled, the following Call trace can be seen. [ 1.424287] Failed to register pmu: armv8_pmuv3, reason -17 [ 1.424870] WARNING: CPU: 0 PID: 1 at ../kernel/events/core.c:11771 perf_event_sysfs_init+0x98/0xdc [ 1.425220] Modules linked in: [ 1.425531] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.19.0-rc7-next-20181012-00003-ge7a97b1ad77b-dirty #35 [ 1.425951] Hardware name: linux,dummy-virt (DT) [ 1.426212] pstate: 80000005 (Nzcv daif -PAN -UAO) [ 1.426458] pc : perf_event_sysfs_init+0x98/0xdc [ 1.426720] lr : perf_event_sysfs_init+0x98/0xdc [ 1.426908] sp : ffff00000804bd50 [ 1.427077] x29: ffff00000804bd50 x28: ffff00000934e078 [ 1.427429] x27: ffff000009546000 x26: 0000000000000007 [ 1.427757] x25: ffff000009280710 x24: 00000000ffffffef [ 1.428086] x23: ffff000009408000 x22: 0000000000000000 [ 1.428415] x21: ffff000009136008 x20: ffff000009408730 [ 1.428744] x19: ffff80007b20b400 x18: 000000000000000a [ 1.429075] x17: 0000000000000000 x16: 0000000000000000 [ 1.429418] x15: 0000000000000400 x14: 2e79726f74636572 [ 1.429748] x13: 696420656d617320 x12: 656874206e692065 [ 1.430060] x11: 6d616e20656d6173 x10: 2065687420687469 [ 1.430335] x9 : ffff00000804bd50 x8 : 206e6f7361657220 [ 1.430610] x7 : 2c3376756d705f38 x6 : ffff00000954d7ce [ 1.430880] x5 : 0000000000000000 x4 : 0000000000000000 [ 1.431226] x3 : 0000000000000000 x2 : ffffffffffffffff [ 1.431554] x1 : 4d151327adc50b00 x0 : 0000000000000000 [ 1.431868] Call trace: [ 1.432102] perf_event_sysfs_init+0x98/0xdc [ 1.432382] do_one_initcall+0x6c/0x1a8 [ 1.432637] kernel_init_freeable+0x1bc/0x280 [ 1.432905] kernel_init+0x18/0x160 [ 1.433115] ret_from_fork+0x10/0x18 [ 1.433297] ---[ end trace 27fd415390eb9883 ]--- Rework to set suppress_bind_attrs flag to avoid removing the device when CONFIG_DEBUG_TEST_DRIVER_REMOVE=y, since there's no real reason to remove the armv8_pmuv3 driver. Cc: Arnd Bergmann <arnd@arndb.de> Co-developed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-21arm64: perf: Fix typos in commentShaokun Zhang1-1/+1
Fix up one typos: Onl -> Only Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Will Deacon <will.deacon@arm.com>