summaryrefslogtreecommitdiff
path: root/arch/arm64/include/asm/pgtable.h
AgeCommit message (Collapse)AuthorFilesLines
2020-02-04arm64: mm: add p?d_leaf() definitionsSteven Price1-0/+2
walk_page_range() is going to be allowed to walk page tables other than those of user space. For this it needs to know when it has reached a 'leaf' entry in the page tables. This information will be provided by the p?d_leaf() functions/macros. For arm64, we already have p?d_sect() macros which we can reuse for p?d_leaf(). pud_sect() is defined as a dummy function when CONFIG_PGTABLE_LEVELS < 3 or CONFIG_ARM64_64K_PAGES is defined. However when the kernel is configured this way then architecturally it isn't allowed to have a large page at this level, and any code using these page walking macros is implicitly relying on the page size/number of levels being the same as the kernel. So it is safe to reuse this for p?d_leaf() as it is an architectural restriction. Link: http://lkml.kernel.org/r/20191218162402.45610-5-steven.price@arm.com Signed-off-by: Steven Price <steven.price@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Hogan <jhogan@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: "Liang, Kan" <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Burton <paul.burton@mips.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Zong Li <zong.li@sifive.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-06arm64: Revert support for execute-only user mappingsCatalin Marinas1-7/+3
The ARMv8 64-bit architecture supports execute-only user permissions by clearing the PTE_USER and PTE_UXN bits, practically making it a mostly privileged mapping but from which user running at EL0 can still execute. The downside, however, is that the kernel at EL1 inadvertently reading such mapping would not trip over the PAN (privileged access never) protection. Revert the relevant bits from commit cab15ce604e5 ("arm64: Introduce execute-only page access permissions") so that PROT_EXEC implies PROT_READ (and therefore PTE_USER) until the architecture gains proper support for execute-only user mappings. Fixes: cab15ce604e5 ("arm64: Introduce execute-only page access permissions") Cc: <stable@vger.kernel.org> # 4.9.x- Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-26Merge tag 'arm64-upstream' of ↵Linus Torvalds1-1/+15
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "Apart from the arm64-specific bits (core arch and perf, new arm64 selftests), it touches the generic cow_user_page() (reviewed by Kirill) together with a macro for x86 to preserve the existing behaviour on this architecture. Summary: - On ARMv8 CPUs without hardware updates of the access flag, avoid failing cow_user_page() on PFN mappings if the pte is old. The patches introduce an arch_faults_on_old_pte() macro, defined as false on x86. When true, cow_user_page() makes the pte young before attempting __copy_from_user_inatomic(). - Covert the synchronous exception handling paths in arch/arm64/kernel/entry.S to C. - FTRACE_WITH_REGS support for arm64. - ZONE_DMA re-introduced on arm64 to support Raspberry Pi 4 - Several kselftest cases specific to arm64, together with a MAINTAINERS update for these files (moved to the ARM64 PORT entry). - Workaround for a Neoverse-N1 erratum where the CPU may fetch stale instructions under certain conditions. - Workaround for Cortex-A57 and A72 errata where the CPU may speculatively execute an AT instruction and associate a VMID with the wrong guest page tables (corrupting the TLB). - Perf updates for arm64: additional PMU topologies on HiSilicon platforms, support for CCN-512 interconnect, AXI ID filtering in the IMX8 DDR PMU, support for the CCPI2 uncore PMU in ThunderX2. - GICv3 optimisation to avoid a heavy barrier when accessing the ICC_PMR_EL1 register. - ELF HWCAP documentation updates and clean-up. - SMC calling convention conduit code clean-up. - KASLR diagnostics printed during boot - NVIDIA Carmel CPU added to the KPTI whitelist - Some arm64 mm clean-ups: use generic free_initrd_mem(), remove stale macro, simplify calculation in __create_pgd_mapping(), typos. - Kconfig clean-ups: CMDLINE_FORCE to depend on CMDLINE, choice for endinanness to help with allmodconfig" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (93 commits) arm64: Kconfig: add a choice for endianness kselftest: arm64: fix spelling mistake "contiguos" -> "contiguous" arm64: Kconfig: make CMDLINE_FORCE depend on CMDLINE MAINTAINERS: Add arm64 selftests to the ARM64 PORT entry arm64: kaslr: Check command line before looking for a seed arm64: kaslr: Announce KASLR status on boot kselftest: arm64: fake_sigreturn_misaligned_sp kselftest: arm64: fake_sigreturn_bad_size kselftest: arm64: fake_sigreturn_duplicated_fpsimd kselftest: arm64: fake_sigreturn_missing_fpsimd kselftest: arm64: fake_sigreturn_bad_size_for_magic0 kselftest: arm64: fake_sigreturn_bad_magic kselftest: arm64: add helper get_current_context kselftest: arm64: extend test_init functionalities kselftest: arm64: mangle_pstate_invalid_mode_el[123][ht] kselftest: arm64: mangle_pstate_invalid_daif_bits kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils kselftest: arm64: extend toplevel skeleton Makefile drivers/perf: hisi: update the sccl_id/ccl_id for certain HiSilicon platform arm64: mm: reserve CMA and crashkernel in ZONE_DMA32 ...
2019-11-08Merge branches 'for-next/elf-hwcap-docs', 'for-next/smccc-conduit-cleanup', ↵Catalin Marinas1-1/+15
'for-next/zone-dma', 'for-next/relax-icc_pmr_el1-sync', 'for-next/double-page-fault', 'for-next/misc', 'for-next/kselftest-arm64-signal' and 'for-next/kaslr-diagnostics' into for-next/core * for-next/elf-hwcap-docs: : Update the arm64 ELF HWCAP documentation docs/arm64: cpu-feature-registers: Rewrite bitfields that don't follow [e, s] docs/arm64: cpu-feature-registers: Documents missing visible fields docs/arm64: elf_hwcaps: Document HWCAP_SB docs/arm64: elf_hwcaps: sort the HWCAP{, 2} documentation by ascending value * for-next/smccc-conduit-cleanup: : SMC calling convention conduit clean-up firmware: arm_sdei: use common SMCCC_CONDUIT_* firmware/psci: use common SMCCC_CONDUIT_* arm: spectre-v2: use arm_smccc_1_1_get_conduit() arm64: errata: use arm_smccc_1_1_get_conduit() arm/arm64: smccc/psci: add arm_smccc_1_1_get_conduit() * for-next/zone-dma: : Reintroduction of ZONE_DMA for Raspberry Pi 4 support arm64: mm: reserve CMA and crashkernel in ZONE_DMA32 dma/direct: turn ARCH_ZONE_DMA_BITS into a variable arm64: Make arm64_dma32_phys_limit static arm64: mm: Fix unused variable warning in zone_sizes_init mm: refresh ZONE_DMA and ZONE_DMA32 comments in 'enum zone_type' arm64: use both ZONE_DMA and ZONE_DMA32 arm64: rename variables used to calculate ZONE_DMA32's size arm64: mm: use arm64_dma_phys_limit instead of calling max_zone_dma_phys() * for-next/relax-icc_pmr_el1-sync: : Relax ICC_PMR_EL1 (GICv3) accesses when ICC_CTLR_EL1.PMHE is clear arm64: Document ICC_CTLR_EL3.PMHE setting requirements arm64: Relax ICC_PMR_EL1 accesses when ICC_CTLR_EL1.PMHE is clear * for-next/double-page-fault: : Avoid a double page fault in __copy_from_user_inatomic() if hw does not support auto Access Flag mm: fix double page fault on arm64 if PTE_AF is cleared x86/mm: implement arch_faults_on_old_pte() stub on x86 arm64: mm: implement arch_faults_on_old_pte() on arm64 arm64: cpufeature: introduce helper cpu_has_hw_af() * for-next/misc: : Various fixes and clean-ups arm64: kpti: Add NVIDIA's Carmel core to the KPTI whitelist arm64: mm: Remove MAX_USER_VA_BITS definition arm64: mm: simplify the page end calculation in __create_pgd_mapping() arm64: print additional fault message when executing non-exec memory arm64: psci: Reduce the waiting time for cpu_psci_cpu_kill() arm64: pgtable: Correct typo in comment arm64: docs: cpu-feature-registers: Document ID_AA64PFR1_EL1 arm64: cpufeature: Fix typos in comment arm64/mm: Poison initmem while freeing with free_reserved_area() arm64: use generic free_initrd_mem() arm64: simplify syscall wrapper ifdeffery * for-next/kselftest-arm64-signal: : arm64-specific kselftest support with signal-related test-cases kselftest: arm64: fake_sigreturn_misaligned_sp kselftest: arm64: fake_sigreturn_bad_size kselftest: arm64: fake_sigreturn_duplicated_fpsimd kselftest: arm64: fake_sigreturn_missing_fpsimd kselftest: arm64: fake_sigreturn_bad_size_for_magic0 kselftest: arm64: fake_sigreturn_bad_magic kselftest: arm64: add helper get_current_context kselftest: arm64: extend test_init functionalities kselftest: arm64: mangle_pstate_invalid_mode_el[123][ht] kselftest: arm64: mangle_pstate_invalid_daif_bits kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils kselftest: arm64: extend toplevel skeleton Makefile * for-next/kaslr-diagnostics: : Provide diagnostics on boot for KASLR arm64: kaslr: Check command line before looking for a seed arm64: kaslr: Announce KASLR status on boot
2019-11-06arm64: Do not mask out PTE_RDONLY in pte_same()Catalin Marinas1-17/+0
Following commit 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()"), the PTE_RDONLY bit is no longer managed by set_pte_at() but built into the PAGE_* attribute definitions. Consequently, pte_same() must include this bit when checking two PTEs for equality. Remove the arm64-specific pte_same() function, practically reverting commit 747a70e60b72 ("arm64: Fix copy-on-write referencing in HugeTLB") Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()") Cc: <stable@vger.kernel.org> # 4.14.x- Cc: Will Deacon <will@kernel.org> Cc: Steve Capper <steve.capper@arm.com> Reported-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-10-25arm64: pgtable: Correct typo in commentMark Brown1-1/+1
vmmemmap -> vmemmap Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-18arm64: mm: implement arch_faults_on_old_pte() on arm64Jia He1-0/+14
On arm64 without hardware Access Flag, copying from user will fail because the pte is old and cannot be marked young. So we always end up with zeroed page after fork() + CoW for pfn mappings. We don't always have a hardware-managed Access Flag on arm64. Hence implement arch_faults_on_old_pte on arm64 to indicate that it might cause page fault when accessing old pte. Signed-off-by: Jia He <justin.he@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-11arm64: Fix kcore macros after 52-bit virtual addressing falloutChris von Recklinghausen1-3/+0
We export the entire kernel address space (i.e. the whole of the TTBR1 address range) via /proc/kcore. The kc_vaddr_to_offset() and kc_offset_to_vaddr() macros are intended to convert between a kernel virtual address and its offset relative to the start of the TTBR1 address space. Prior to commit: 14c127c957c1c607 ("arm64: mm: Flip kernel VA space") ... the offset was calculated relative to VA_START, which at the time was the start of the TTBR1 address space. At this time, PAGE_OFFSET pointed to the high half of the TTBR1 address space where arm64's linear map lived. That commit swapped the position of VA_START and PAGE_OFFSET, but failed to update kc_vaddr_to_offset() or kc_offset_to_vaddr(), so since then the two macros behave incorrectly. Note that VA_START was subsequently renamed to PAGE_END in commit: 77ad4ce69321abbe ("arm64: memory: rename VA_START to PAGE_END") As the generic implementations of the two macros calculate the offset relative to PAGE_OFFSET (which is now the start of the TTBR1 address space), we can delete the arm64 implementation and use those. Fixes: 14c127c957c1c607 ("arm64: mm: Flip kernel VA space") Reviewed-by: James Morse <james.morse@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-09-25mm: consolidate pgtable_cache_init() and pgd_cache_init()Mike Rapoport1-2/+0
Both pgtable_cache_init() and pgd_cache_init() are used to initialize kmem cache for page table allocations on several architectures that do not use PAGE_SIZE tables for one or more levels of the page table hierarchy. Most architectures do not implement these functions and use __weak default NOP implementation of pgd_cache_init(). Since there is no such default for pgtable_cache_init(), its empty stub is duplicated among most architectures. Rename the definitions of pgd_cache_init() to pgtable_cache_init() and drop empty stubs of pgtable_cache_init(). Link: http://lkml.kernel.org/r/1566457046-22637-1-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Will Deacon <will@kernel.org> [arm64] Acked-by: Thomas Gleixner <tglx@linutronix.de> [x86] Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-19Merge tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds1-0/+12
Pull dma-mapping updates from Christoph Hellwig: - add dma-mapping and block layer helpers to take care of IOMMU merging for mmc plus subsequent fixups (Yoshihiro Shimoda) - rework handling of the pgprot bits for remapping (me) - take care of the dma direct infrastructure for swiotlb-xen (me) - improve the dma noncoherent remapping infrastructure (me) - better defaults for ->mmap, ->get_sgtable and ->get_required_mask (me) - cleanup mmaping of coherent DMA allocations (me) - various misc cleanups (Andy Shevchenko, me) * tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping: (41 commits) mmc: renesas_sdhi_internal_dmac: Add MMC_CAP2_MERGE_CAPABLE mmc: queue: Fix bigger segments usage arm64: use asm-generic/dma-mapping.h swiotlb-xen: merge xen_unmap_single into xen_swiotlb_unmap_page swiotlb-xen: simplify cache maintainance swiotlb-xen: use the same foreign page check everywhere swiotlb-xen: remove xen_swiotlb_dma_mmap and xen_swiotlb_dma_get_sgtable xen: remove the exports for xen_{create,destroy}_contiguous_region xen/arm: remove xen_dma_ops xen/arm: simplify dma_cache_maint xen/arm: use dev_is_dma_coherent xen/arm: consolidate page-coherent.h xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance arm: remove wrappers for the generic dma remap helpers dma-mapping: introduce a dma_common_find_pages helper dma-mapping: always use VM_DMA_COHERENT for generic DMA remap vmalloc: lift the arm flag for coherent mappings to common code dma-mapping: provide a better default ->get_required_mask dma-mapping: remove the dma_declare_coherent_memory export remoteproc: don't allow modular build ...
2019-09-17Merge tag 'arm64-upstream' of ↵Linus Torvalds1-8/+15
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "Although there isn't tonnes of code in terms of line count, there are a fair few headline features which I've noted both in the tag and also in the merge commits when I pulled everything together. The part I'm most pleased with is that we had 35 contributors this time around, which feels like a big jump from the usual small group of core arm64 arch developers. Hopefully they all enjoyed it so much that they'll continue to contribute, but we'll see. It's probably worth highlighting that we've pulled in a branch from the risc-v folks which moves our CPU topology code out to where it can be shared with others. Summary: - 52-bit virtual addressing in the kernel - New ABI to allow tagged user pointers to be dereferenced by syscalls - Early RNG seeding by the bootloader - Improve robustness of SMP boot - Fix TLB invalidation in light of recent architectural clarifications - Support for i.MX8 DDR PMU - Remove direct LSE instruction patching in favour of static keys - Function error injection using kprobes - Support for the PPTT "thread" flag introduced by ACPI 6.3 - Move PSCI idle code into proper cpuidle driver - Relaxation of implicit I/O memory barriers - Build with RELR relocations when toolchain supports them - Numerous cleanups and non-critical fixes" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (114 commits) arm64: remove __iounmap arm64: atomics: Use K constraint when toolchain appears to support it arm64: atomics: Undefine internal macros after use arm64: lse: Make ARM64_LSE_ATOMICS depend on JUMP_LABEL arm64: asm: Kill 'asm/atomic_arch.h' arm64: lse: Remove unused 'alt_lse' assembly macro arm64: atomics: Remove atomic_ll_sc compilation unit arm64: avoid using hard-coded registers for LSE atomics arm64: atomics: avoid out-of-line ll/sc atomics arm64: Use correct ll/sc atomic constraints jump_label: Don't warn on __exit jump entries docs/perf: Add documentation for the i.MX8 DDR PMU perf/imx_ddr: Add support for AXI ID filtering arm64: kpti: ensure patched kernel text is fetched from PoU arm64: fix fixmap copy for 16K pages and 48-bit VA perf/smmuv3: Validate groups for global filtering perf/smmuv3: Validate group size arm64: Relax Documentation/arm64/tagged-pointers.rst arm64: kvm: Replace hardcoded '1' with SYS_PAR_EL1_F arm64: mm: Ignore spurious translation faults taken from the kernel ...
2019-08-30Merge branches 'for-next/52-bit-kva', 'for-next/cpu-topology', ↵Will Deacon1-8/+15
'for-next/error-injection', 'for-next/perf', 'for-next/psci-cpuidle', 'for-next/rng', 'for-next/smpboot', 'for-next/tbi' and 'for-next/tlbi' into for-next/core * for-next/52-bit-kva: (25 commits) Support for 52-bit virtual addressing in kernel space * for-next/cpu-topology: (9 commits) Move CPU topology parsing into core code and add support for ACPI 6.3 * for-next/error-injection: (2 commits) Support for function error injection via kprobes * for-next/perf: (8 commits) Support for i.MX8 DDR PMU and proper SMMUv3 group validation * for-next/psci-cpuidle: (7 commits) Move PSCI idle code into a new CPUidle driver * for-next/rng: (4 commits) Support for 'rng-seed' property being passed in the devicetree * for-next/smpboot: (3 commits) Reduce fragility of secondary CPU bringup in debug configurations * for-next/tbi: (10 commits) Introduce new syscall ABI with relaxed requirements for pointer tags * for-next/tlbi: (6 commits) Handle spurious page faults arising from kernel space
2019-08-29arm64: document the choice of page attributes for pgprot_dmacoherentChristoph Hellwig1-0/+8
Based on an email from Will Deacon. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com>
2019-08-29dma-mapping: remove arch_dma_mmap_pgprotChristoph Hellwig1-0/+4
arch_dma_mmap_pgprot is used for two things: 1) to override the "normal" uncached page attributes for mapping memory coherent to devices that can't snoop the CPU caches 2) to provide the special DMA_ATTR_WRITE_COMBINE semantics on older arm systems and some mips platforms Replace one with the pgprot_dmacoherent macro that is already provided by arm and much simpler to use, and lift the DMA_ATTR_WRITE_COMBINE handling to common code with an explicit arch opt-in. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k Acked-by: Paul Burton <paul.burton@mips.com> # mips
2019-08-27arm64: mm: Add ISB instruction to set_pgd()Will Deacon1-0/+1
Commit 6a4cbd63c25a ("Revert "arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}"") reintroduced ISB instructions to some of our page table setter functions in light of a recent clarification to the Armv8 architecture. Although 'set_pgd()' isn't currently used to update a live page table, add the ISB instruction there too for consistency with the other macros and to provide some future-proofing if we use it on live tables in the future. Reported-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-27Revert "arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}"Will Deacon1-3/+9
This reverts commit 24fe1b0efad4fcdd32ce46cffeab297f22581707. Commit 24fe1b0efad4fcdd ("arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}") removed ISB instructions immediately following updates to the page table, on the grounds that they are not required by the architecture and a DSB alone is sufficient to ensure that subsequent data accesses use the new translation: DDI0487E_a, B2-128: | ... no instruction that appears in program order after the DSB | instruction can alter any state of the system or perform any part of | its functionality until the DSB completes other than: | | * Being fetched from memory and decoded | * Reading the general-purpose, SIMD and floating-point, | Special-purpose, or System registers that are directly or indirectly | read without causing side-effects. However, the same document also states the following: DDI0487E_a, B2-125: | DMB and DSB instructions affect reads and writes to the memory system | generated by Load/Store instructions and data or unified cache | maintenance instructions being executed by the PE. Instruction fetches | or accesses caused by a hardware translation table access are not | explicit accesses. which appears to claim that the DSB alone is insufficient. Unfortunately, some CPU designers have followed the second clause above, whereas in Linux we've been relying on the first. This means that our mapping sequence: MOV X0, <valid pte> STR X0, [Xptep] // Store new PTE to page table DSB ISHST LDR X1, [X2] // Translates using the new PTE can actually raise a translation fault on the load instruction because the translation can be performed speculatively before the page table update and then marked as "faulting" by the CPU. For user PTEs, this is ok because we can handle the spurious fault, but for kernel PTEs and intermediate table entries this results in a panic(). Revert the offending commit to reintroduce the missing barriers. Cc: <stable@vger.kernel.org> Fixes: 24fe1b0efad4fcdd ("arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}") Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-14arm64: memory: rename VA_START to PAGE_ENDMark Rutland1-2/+2
Prior to commit: 14c127c957c1c607 ("arm64: mm: Flip kernel VA space") ... VA_START described the start of the TTBR1 address space for a given VA size described by VA_BITS, where all kernel mappings began. Since that commit, VA_START described a portion midway through the address space, where the linear map ends and other kernel mappings begin. To avoid confusion, let's rename VA_START to PAGE_END, making it clear that it's not the start of the TTBR1 address space and implying that it's related to PAGE_OFFSET. Comments and other mnemonics are updated accordingly, along with a typo fix in the decription of VMEMMAP_SIZE. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09arm64: mm: Separate out vmemmapSteve Capper1-2/+2
vmemmap is a preprocessor definition that depends on a variable, memstart_addr. In a later patch we will need to expand the size of the VMEMMAP region and optionally modify vmemmap depending upon whether or not hardware support is available for 52-bit virtual addresses. This patch changes vmemmap to be a variable. As the old definition depended on a variable load, this should not affect performance noticeably. Signed-off-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09arm64: mm: Flip kernel VA spaceSteve Capper1-1/+1
In order to allow for a KASAN shadow that changes size at boot time, one must fix the KASAN_SHADOW_END for both 48 & 52-bit VAs and "grow" the start address. Also, it is highly desirable to maintain the same function addresses in the kernel .text between VA sizes. Both of these requirements necessitate us to flip the kernel address space halves s.t. the direct linear map occupies the lower addresses. This patch puts the direct linear map in the lower addresses of the kernel VA range and everything else in the higher ranges. We need to adjust: *) KASAN shadow region placement logic, *) KASAN_SHADOW_OFFSET computation logic, *) virt_to_phys, phys_to_virt checks, *) page table dumper. These are all small changes, that need to take place atomically, so they are bundled into this commit. As part of the re-arrangement, a guard region of 2MB (to preserve alignment for fixed map) is added after the vmemmap. Otherwise the vmemmap could intersect with IS_ERR pointers. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-08arm64: mm: add missing PTE_SPECIAL in pte_mkdevmap on arm64Jia He1-2/+5
Without this patch, the MAP_SYNC test case will cause a print_bad_pte warning on arm64 as follows: [ 25.542693] BUG: Bad page map in process mapdax333 pte:2e8000448800f53 pmd:41ff5f003 [ 25.546360] page:ffff7e0010220000 refcount:1 mapcount:-1 mapping:ffff8003e29c7440 index:0x0 [ 25.550281] ext4_dax_aops [ 25.550282] name:"__aaabbbcccddd__" [ 25.551553] flags: 0x3ffff0000001002(referenced|reserved) [ 25.555802] raw: 03ffff0000001002 ffff8003dfffa908 0000000000000000 ffff8003e29c7440 [ 25.559446] raw: 0000000000000000 0000000000000000 00000001fffffffe 0000000000000000 [ 25.563075] page dumped because: bad pte [ 25.564938] addr:0000ffffbe05b000 vm_flags:208000fb anon_vma:0000000000000000 mapping:ffff8003e29c7440 index:0 [ 25.574272] file:__aaabbbcccddd__ fault:ext4_dax_fault mmmmap:ext4_file_mmap readpage:0x0 [ 25.578799] CPU: 1 PID: 1180 Comm: mapdax333 Not tainted 5.2.0+ #21 [ 25.581702] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 [ 25.585624] Call trace: [ 25.587008] dump_backtrace+0x0/0x178 [ 25.588799] show_stack+0x24/0x30 [ 25.590328] dump_stack+0xa8/0xcc [ 25.591901] print_bad_pte+0x18c/0x218 [ 25.593628] unmap_page_range+0x778/0xc00 [ 25.595506] unmap_single_vma+0x94/0xe8 [ 25.597304] unmap_vmas+0x90/0x108 [ 25.598901] unmap_region+0xc0/0x128 [ 25.600566] __do_munmap+0x284/0x3f0 [ 25.602245] __vm_munmap+0x78/0xe0 [ 25.603820] __arm64_sys_munmap+0x34/0x48 [ 25.605709] el0_svc_common.constprop.0+0x78/0x168 [ 25.607956] el0_svc_handler+0x34/0x90 [ 25.609698] el0_svc+0x8/0xc [...] The root cause is in _vm_normal_page, without the PTE_SPECIAL bit, the return value will be incorrectly set to pfn_to_page(pfn) instead of NULL. Besides, this patch also rewrite the pmd_mkdevmap to avoid setting PTE_SPECIAL for pmd The MAP_SYNC test case is as follows(Provided by Yibo Cai) $#include <stdio.h> $#include <string.h> $#include <unistd.h> $#include <sys/file.h> $#include <sys/mman.h> $#ifndef MAP_SYNC $#define MAP_SYNC 0x80000 $#endif /* mount -o dax /dev/pmem0 /mnt */ $#define F "/mnt/__aaabbbcccddd__" int main(void) { int fd; char buf[4096]; void *addr; if ((fd = open(F, O_CREAT|O_TRUNC|O_RDWR, 0644)) < 0) { perror("open1"); return 1; } if (write(fd, buf, 4096) != 4096) { perror("lseek"); return 1; } addr = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_SYNC, fd, 0); if (addr == MAP_FAILED) { perror("mmap"); printf("did you mount with '-o dax'?\n"); return 1; } memset(addr, 0x55, 4096); if (munmap(addr, 4096) == -1) { perror("munmap"); return 1; } close(fd); return 0; } Fixes: 73b20c84d42d ("arm64: mm: implement pte_devmap support") Reported-by: Yibo Cai <Yibo.Cai@arm.com> Acked-by: Will Deacon <will@kernel.org> Acked-by: Robin Murphy <Robin.Murphy@arm.com> Signed-off-by: Jia He <justin.he@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-08-01arm64/mm: fix variable 'pud' set but not usedQian Cai1-2/+2
GCC throws a warning, arch/arm64/mm/mmu.c: In function 'pud_free_pmd_page': arch/arm64/mm/mmu.c:1033:8: warning: variable 'pud' set but not used [-Wunused-but-set-variable] pud_t pud; ^~~ because pud_table() is a macro and compiled away. Fix it by making it a static inline function and for pud_sect() as well. Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Will Deacon <will@kernel.org>
2019-07-22arm64: mm: Drop pte_huge()Anshuman Khandual1-1/+0
This helper is required from generic huge_pte_alloc() which is available when arch subscribes ARCH_WANT_GENERAL_HUGETLB. arm64 implements it's own huge_pte_alloc() and does not depend on the generic definition. Drop this helper which is redundant on arm64. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Steve Capper <Steve.Capper@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-07-17arm64: mm: implement pte_devmap supportRobin Murphy1-0/+21
In order for things like get_user_pages() to work on ZONE_DEVICE memory, we need a software PTE bit to identify device-backed PFNs. Hook this up along with the relevant helpers to join in with ARCH_HAS_PTE_DEVMAP. [robin.murphy@arm.com: build fixes] Link: http://lkml.kernel.org/r/13026c4e64abc17133bbfa07d7731ec6691c0bcd.1559050949.git.robin.murphy@arm.com Link: http://lkml.kernel.org/r/817d92886fc3b33bcbf6e105ee83a74babb3a5aa.1558547956.git.robin.murphy@arm.com Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-08Merge tag 'arm64-upstream' of ↵Linus Torvalds1-19/+37
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP} - Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to manage the permissions of executable vmalloc regions more strictly - Slight performance improvement by keeping softirqs enabled while touching the FPSIMD/SVE state (kernel_neon_begin/end) - Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new XAFLAG and AXFLAG instructions for floating point comparison flags manipulation) and FRINT (rounding floating point numbers to integers) - Re-instate ARM64_PSEUDO_NMI support which was previously marked as BROKEN due to some bugs (now fixed) - Improve parking of stopped CPUs and implement an arm64-specific panic_smp_self_stop() to avoid warning on not being able to stop secondary CPUs during panic - perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI platforms - perf: DDR performance monitor support for iMX8QXP - cache_line_size() can now be set from DT or ACPI/PPTT if provided to cope with a system cache info not exposed via the CPUID registers - Avoid warning on hardware cache line size greater than ARCH_DMA_MINALIGN if the system is fully coherent - arm64 do_page_fault() and hugetlb cleanups - Refactor set_pte_at() to avoid redundant READ_ONCE(*ptep) - Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the 'arm_boot_flags' introduced in 5.1) - CONFIG_RANDOMIZE_BASE now enabled in defconfig - Allow the selection of ARM64_MODULE_PLTS, currently only done via RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill over into the vmalloc area - Make ZONE_DMA32 configurable * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits) perf: arm_spe: Enable ACPI/Platform automatic module loading arm_pmu: acpi: spe: Add initial MADT/SPE probing ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens ACPI/PPTT: Modify node flag detection to find last IDENTICAL x86/entry: Simplify _TIF_SYSCALL_EMU handling arm64: rename dump_instr as dump_kernel_instr arm64/mm: Drop [PTE|PMD]_TYPE_FAULT arm64: Implement panic_smp_self_stop() arm64: Improve parking of stopped CPUs arm64: Expose FRINT capabilities to userspace arm64: Expose ARMv8.5 CondM capability to userspace arm64: defconfig: enable CONFIG_RANDOMIZE_BASE arm64: ARM64_MODULES_PLTS must depend on MODULES arm64: bpf: do not allocate executable memory arm64/kprobes: set VM_FLUSH_RESET_PERMS on kprobe instruction pages arm64/mm: wire up CONFIG_ARCH_HAS_SET_DIRECT_MAP arm64: module: create module allocations without exec permissions arm64: Allow user selection of ARM64_MODULE_PLTS acpi/arm64: ignore 5.1 FADTs that are reported as 5.0 arm64: Allow selecting Pseudo-NMI again ...
2019-06-21Merge tag 'spdx-5.2-rc6' of ↵Linus Torvalds1-12/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx Pull still more SPDX updates from Greg KH: "Another round of SPDX updates for 5.2-rc6 Here is what I am guessing is going to be the last "big" SPDX update for 5.2. It contains all of the remaining GPLv2 and GPLv2+ updates that were "easy" to determine by pattern matching. The ones after this are going to be a bit more difficult and the people on the spdx list will be discussing them on a case-by-case basis now. Another 5000+ files are fixed up, so our overall totals are: Files checked: 64545 Files with SPDX: 45529 Compared to the 5.1 kernel which was: Files checked: 63848 Files with SPDX: 22576 This is a huge improvement. Also, we deleted another 20000 lines of boilerplate license crud, always nice to see in a diffstat" * tag 'spdx-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (65 commits) treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 507 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 506 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 505 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 504 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 503 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 502 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 501 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 498 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 497 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 496 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 495 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 491 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 490 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 489 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 488 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 487 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 486 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 485 ...
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234Thomas Gleixner1-12/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 503 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-18arm64/mm: don't initialize pgd_cache twiceMike Rapoport1-2/+1
When PGD_SIZE != PAGE_SIZE, arm64 uses kmem_cache for allocation of PGD memory. That cache was initialized twice: first through pgtable_cache_init() alias and then as an override for weak pgd_cache_init(). Remove the alias from pgtable_cache_init() and keep the only pgd_cache initialization in pgd_cache_init(). Fixes: caa841360134 ("x86/mm: Initialize PGD cache during mm initialization") Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-06-10arm64: mm: avoid redundant READ_ONCE(*ptep)Mark Rutland1-17/+30
In set_pte_at(), we read the old pte value so that it can be passed into checks for racy hw updates. These checks are only performed for CONFIG_DEBUG_VM, and the value is not used otherwise. Since we read the pte value with READ_ONCE(), the compiler cannot elide the redundant read for !CONFIG_DEBUG_VM kernels. Let's ameliorate matters by moving the read and the checks into a helper, __check_racy_pte_update(), which only performs the read when the value will be used. This also allows us to reformat the conditions for clarity. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-04arm64/mm: Simplify protection flag creation for kernel huge mappingsAnshuman Khandual1-2/+7
Even though they have got the same value, PMD_TYPE_SECT and PUD_TYPE_SECT get used for kernel huge mappings. But before that first the table bit gets cleared using leaf level PTE_TABLE_BIT. Though functionally they are same, we should use page table level specific macros to be consistent as per the MMU specifications. Create page table level specific wrappers for kernel huge mapping entries and just drop mk_sect_prot() which does not have any other user. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-04-30arm64: mm: Remove pte_unmap_nested()Qian Cai1-2/+0
As of commit ece0e2b6406a ("mm: remove pte_*map_nested()"), pte_unmap_nested() is no longer used and can be removed from the arm64 code. Signed-off-by: Qian Cai <cai@lca.pw> [will: also remove pte_offset_map_nested()] Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-30arm64: Fix compiler warning from pte_unmap() with -Wunused-but-set-variableQian Cai1-1/+2
When building with -Wunused-but-set-variable, the compiler shouts about a number of pte_unmap() users, since this expands to an empty macro on arm64: | mm/gup.c: In function 'gup_pte_range': | mm/gup.c:1727:16: warning: variable 'ptem' set but not used | [-Wunused-but-set-variable] | mm/gup.c: At top level: | mm/memory.c: In function 'copy_pte_range': | mm/memory.c:821:24: warning: variable 'orig_dst_pte' set but not used | [-Wunused-but-set-variable] | mm/memory.c:821:9: warning: variable 'orig_src_pte' set but not used | [-Wunused-but-set-variable] | mm/swap_state.c: In function 'swap_ra_info': | mm/swap_state.c:641:15: warning: variable 'orig_pte' set but not used | [-Wunused-but-set-variable] | mm/madvise.c: In function 'madvise_free_pte_range': | mm/madvise.c:318:9: warning: variable 'orig_pte' set but not used | [-Wunused-but-set-variable] Rewrite pte_unmap() as a static inline function, which silences the warnings. Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-26Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+9
Pull KVM updates from Paolo Bonzini: "ARM: - selftests improvements - large PUD support for HugeTLB - single-stepping fixes - improved tracing - various timer and vGIC fixes x86: - Processor Tracing virtualization - STIBP support - some correctness fixes - refactorings and splitting of vmx.c - use the Hyper-V range TLB flush hypercall - reduce order of vcpu struct - WBNOINVD support - do not use -ftrace for __noclone functions - nested guest support for PAUSE filtering on AMD - more Hyper-V enlightenments (direct mode for synthetic timers) PPC: - nested VFIO s390: - bugfixes only this time" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits) KVM: x86: Add CPUID support for new instruction WBNOINVD kvm: selftests: ucall: fix exit mmio address guessing Revert "compiler-gcc: disable -ftracer for __noclone functions" KVM: VMX: Move VM-Enter + VM-Exit handling to non-inline sub-routines KVM: VMX: Explicitly reference RCX as the vmx_vcpu pointer in asm blobs KVM: x86: Use jmp to invoke kvm_spurious_fault() from .fixup MAINTAINERS: Add arch/x86/kvm sub-directories to existing KVM/x86 entry KVM/x86: Use SVM assembly instruction mnemonics instead of .byte streams KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range() KVM/MMU: Flush tlb directly in kvm_set_pte_rmapp() KVM/MMU: Move tlb flush in kvm_set_pte_rmapp() to kvm_mmu_notifier_change_pte() KVM: Make kvm_set_spte_hva() return int KVM: Replace old tlb flush function with new one to flush a specified range. KVM/MMU: Add tlb flush with range helper function KVM/VMX: Add hv tlb range flush support x86/hyper-v: Add HvFlushGuestAddressList hypercall support KVM: Add tlb_remote_flush_with_range callback in kvm_x86_ops KVM: x86: Disable Intel PT when VMXON in L1 guest KVM: x86: Set intercept for Intel PT MSRs read/write KVM: x86: Implement Intel PT MSRs read/write emulation ...
2018-12-18KVM: arm64: Add support for creating PUD hugepages at stage 2Punit Agrawal1-0/+2
KVM only supports PMD hugepages at stage 2. Now that the various page handling routines are updated, extend the stage 2 fault handling to map in PUD hugepages. Addition of PUD hugepage support enables additional page sizes (e.g., 1G with 4K granule) which can be useful on cores that support mapping larger block sizes in the TLB entries. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> [ Replace BUG() => WARN_ON(1) for arm32 PUD helpers ] Signed-off-by: Suzuki Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18KVM: arm64: Update age handlers to support PUD hugepagesPunit Agrawal1-0/+1
In preparation for creating larger hugepages at Stage 2, add support to the age handling notifiers for PUD hugepages when encountered. Provide trivial helpers for arm32 to allow sharing code. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> [ Replaced BUG() => WARN_ON(1) for arm32 PUD helpers ] Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-18KVM: arm64: Support handling access faults for PUD hugepagesPunit Agrawal1-0/+6
In preparation for creating larger hugepages at Stage 2, extend the access fault handling at Stage 2 to support PUD hugepages when encountered. Provide trivial helpers for arm32 to allow sharing of code. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> [ Replaced BUG() => WARN_ON(1) in PUD helpers ] Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-11-26arm64: mm: Don't wait for completion of TLB invalidation when page agingAlex Van Brunt1-0/+22
When transitioning a PTE from young to old as part of page aging, we can avoid waiting for the TLB invalidation to complete and therefore drop the subsequent DSB instruction. Whilst this opens up a race with page reclaim, where a PTE in active use via a stale, young TLB entry does not update the underlying descriptor, the worst thing that happens is that the page is reclaimed and then immediately faulted back in. Given that we have a DSB in our context-switch path, the window for a spurious reclaim is fairly limited and eliding the barrier claims to boost NVMe/SSD accesses by over 10% on some platforms. A similar optimisation was made for x86 in commit b13b1d2d8692 ("x86/mm: In the PTE swapout page reclaim case clear the accessed bit instead of flushing the TLB"). Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com> Signed-off-by: Ashish Mhetre <amhetre@nvidia.com> [will: rewrote patch] Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-10-05arm64: mm: Use #ifdef for the __PAGETABLE_P?D_FOLDED definesJames Morse1-2/+6
__is_defined(__PAGETABLE_P?D_FOLDED) doesn't quite work as intended as these symbols are internal to asm-generic and aren't defined in the way kconfig expects. This makes them always evaluate to false. Switch to #ifdef. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-25arm64/mm: use fixmap to modify swapper_pg_dirJun Yao1-6/+29
Once swapper_pg_dir is in the rodata section, it will not be possible to modify it directly, but we will need to modify it in some cases. To enable this, we can use the fixmap when deliberately modifying swapper_pg_dir. As the pgd is only transiently mapped, this provides some resilience against illicit modification of the pgd, e.g. for Kernel Space Mirror Attack (KSMA). Signed-off-by: Jun Yao <yaojun8558363@gmail.com> Reviewed-by: James Morse <james.morse@arm.com> [Mark: simplify ifdeffery, commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-25arm64/mm: Separate boot-time page tables from swapper_pg_dirJun Yao1-1/+2
Since the address of swapper_pg_dir is fixed for a given kernel image, it is an attractive target for manipulation via an arbitrary write. To mitigate this we'd like to make it read-only by moving it into the rodata section. We require that swapper_pg_dir is at a fixed offset from tramp_pg_dir and reserved_ttbr0, so these will also need to move into rodata. However, swapper_pg_dir is allocated along with some transient page tables used for boot which we do not want to move into rodata. As a step towards this, this patch separates the boot-time page tables into a new init_pg_dir, and reduces swapper_pg_dir to the single page it needs to be. This allows us to retain the relationship between swapper_pg_dir, tramp_pg_dir, and swapper_pg_dir, while cleanly separating these from the boot-time page tables. The init_pg_dir holds all of the pgd/pud/pmd/pte levels needed during boot, and all of these levels will be freed when we switch to the swapper_pg_dir, which is initialized by the existing code in paging_init(). Since we start off on the init_pg_dir, we no longer need to allocate a transient page table in paging_init() in order to ensure that swapper_pg_dir isn't live while we initialize it. There should be no functional change as a result of this patch. Signed-off-by: Jun Yao <yaojun8558363@gmail.com> Reviewed-by: James Morse <james.morse@arm.com> [Mark: place init_pg_dir after BSS, fold mm changes, commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-11arm64: pgtable: Implement p[mu]d_valid() and check in set_p[mu]d()Will Deacon1-2/+8
Now that our walk-cache invalidation routines imply a DSB before the invalidation, we no longer need one when we are clearing an entry during unmap. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-06-27arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}Will Deacon1-5/+1
Commit 7f0b1bf04511 ("arm64: Fix barriers used for page table modifications") fixed a reported issue with fixmap page-table entries not being visible to the walker due to a missing DSB instruction. At the same time, it added ISB instructions to the arm64 set_{pte,pmd,pud} functions, which are not required by the architecture and make little sense in isolation. Remove the redundant ISBs. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-06-08mm: introduce ARCH_HAS_PTE_SPECIALLaurent Dufour1-2/+0
Currently the PTE special supports is turned on in per architecture header files. Most of the time, it is defined in arch/*/include/asm/pgtable.h depending or not on some other per architecture static definition. This patch introduce a new configuration variable to manage this directly in the Kconfig files. It would later replace __HAVE_ARCH_PTE_SPECIAL. Here notes for some architecture where the definition of __HAVE_ARCH_PTE_SPECIAL is not obvious: arm __HAVE_ARCH_PTE_SPECIAL which is currently defined in arch/arm/include/asm/pgtable-3level.h which is included by arch/arm/include/asm/pgtable.h when CONFIG_ARM_LPAE is set. So select ARCH_HAS_PTE_SPECIAL if ARM_LPAE. powerpc __HAVE_ARCH_PTE_SPECIAL is defined in 2 files: - arch/powerpc/include/asm/book3s/64/pgtable.h - arch/powerpc/include/asm/pte-common.h The first one is included if (PPC_BOOK3S & PPC64) while the second is included in all the other cases. So select ARCH_HAS_PTE_SPECIAL all the time. sparc: __HAVE_ARCH_PTE_SPECIAL is defined if defined(__sparc__) && defined(__arch64__) which are defined through the compiler in sparc/Makefile if !SPARC32 which I assume to be if SPARC64. So select ARCH_HAS_PTE_SPECIAL if SPARC64 There is no functional change introduced by this patch. Link: http://lkml.kernel.org/r/1523433816-14460-2-git-send-email-ldufour@linux.vnet.ibm.com Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Suggested-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Jerome Glisse <jglisse@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: David S. Miller <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Albert Ou <albert@sifive.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Christophe LEROY <christophe.leroy@c-s.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-24arm64: mm: drop addr parameter from sync icache and dcacheShaokun Zhang1-2/+2
The addr parameter isn't used for anything. Let's simplify and get rid of it, like arm. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-16arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tablesWill Deacon1-10/+13
In many cases, page tables can be accessed concurrently by either another CPU (due to things like fast gup) or by the hardware page table walker itself, which may set access/dirty bits. In such cases, it is important to use READ_ONCE/WRITE_ONCE when accessing page table entries so that entries cannot be torn, merged or subject to apparent loss of coherence due to compiler transformations. Whilst there are some scenarios where this cannot happen (e.g. pinned kernel mappings for the linear region), the overhead of using READ_ONCE /WRITE_ONCE everywhere is minimal and makes the code an awful lot easier to reason about. This patch consistently uses these macros in the arch code, as well as explicitly namespacing pointers to page table entries from the entries themselves by using adopting a 'p' suffix for the former (as is sometimes used elsewhere in the kernel source). Tested-by: Yury Norov <ynorov@caviumnetworks.com> Tested-by: Richard Ruigrok <rruigrok@codeaurora.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-02-01arm64: provide pmdp_establish() helperCatalin Marinas1-0/+7
We need an atomic way to setup pmd page table entry, avoiding races with CPU setting dirty/accessed bits. This is required to implement pmdp_invalidate() that doesn't lose these bits. Link: http://lkml.kernel.org/r/20171213105756.69879-5-kirill.shutemov@linux.intel.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-31Merge tag 'arm64-upstream' of ↵Linus Torvalds1-11/+46
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "The main theme of this pull request is security covering variants 2 and 3 for arm64. I expect to send additional patches next week covering an improved firmware interface (requires firmware changes) for variant 2 and way for KPTI to be disabled on unaffected CPUs (Cavium's ThunderX doesn't work properly with KPTI enabled because of a hardware erratum). Summary: - Security mitigations: - variant 2: invalidate the branch predictor with a call to secure firmware - variant 3: implement KPTI for arm64 - 52-bit physical address support for arm64 (ARMv8.2) - arm64 support for RAS (firmware first only) and SDEI (software delegated exception interface; allows firmware to inject a RAS error into the OS) - perf support for the ARM DynamIQ Shared Unit PMU - CPUID and HWCAP bits updated for new floating point multiplication instructions in ARMv8.4 - remove some virtual memory layout printks during boot - fix initial page table creation to cope with larger than 32M kernel images when 16K pages are enabled" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (104 commits) arm64: Fix TTBR + PAN + 52-bit PA logic in cpu_do_switch_mm arm64: Turn on KPTI only on CPUs that need it arm64: Branch predictor hardening for Cavium ThunderX2 arm64: Run enable method for errata work arounds on late CPUs arm64: Move BP hardening to check_and_switch_context arm64: mm: ignore memory above supported physical address size arm64: kpti: Fix the interaction between ASID switching and software PAN KVM: arm64: Emulate RAS error registers and set HCR_EL2's TERR & TEA KVM: arm64: Handle RAS SErrors from EL2 on guest exit KVM: arm64: Handle RAS SErrors from EL1 on guest exit KVM: arm64: Save ESR_EL2 on guest SError KVM: arm64: Save/Restore guest DISR_EL1 KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2. KVM: arm/arm64: mask/unmask daif around VHE guests arm64: kernel: Prepare for a DISR user arm64: Unconditionally enable IESB on exception entry/return for firmware-first arm64: kernel: Survive corrected RAS errors notified by SError arm64: cpufeature: Detect CPU RAS Extentions arm64: sysreg: Move to use definitions for all the SCTLR bits arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early ...
2018-01-14arm64: Extend early page table code to allow for larger kernelsSteve Capper1-0/+1
Currently the early assembler page table code assumes that precisely 1xpgd, 1xpud, 1xpmd are sufficient to represent the early kernel text mappings. Unfortunately this is rarely the case when running with a 16KB granule, and we also run into limits with 4KB granule when building much larger kernels. This patch re-writes the early page table logic to compute indices of mappings for each level of page table, and if multiple indices are required, the next-level page table is scaled up accordingly. Also the required size of the swapper_pg_dir is computed at link time to cover the mapping [KIMAGE_ADDR + VOFFSET, _end]. When KASLR is enabled, an extra page is set aside for each level that may require extra entries at runtime. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-12-22Merge branch 'for-next/52-bit-pa' into for-next/coreCatalin Marinas1-11/+44
* for-next/52-bit-pa: arm64: enable 52-bit physical address support arm64: allow ID map to be extended to 52 bits arm64: handle 52-bit physical addresses in page table entries arm64: don't open code page table entry creation arm64: head.S: handle 52-bit PAs in PTEs in early page table setup arm64: handle 52-bit addresses in TTBR arm64: limit PA size to supported range arm64: add kconfig symbol to configure physical address size
2017-12-22arm64: handle 52-bit physical addresses in page table entriesKristina Martsenko1-12/+38
The top 4 bits of a 52-bit physical address are positioned at bits 12..15 of a page table entry. Introduce macros to convert between a physical address and its placement in a table entry, and change all macros/functions that access PTEs to use them. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Tested-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> [catalin.marinas@arm.com: some long lines wrapped] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-12-22arm64: don't open code page table entry creationKristina Martsenko1-0/+1
Instead of open coding the generation of page table entries, use the macros/functions that exist for this - pfn_p*d and p*d_populate. Most code in the kernel already uses these macros, this patch tries to fix up the few places that don't. This is useful for the next patch in this series, which needs to change the page table entry logic, and it's better to have that logic in one place. The KVM extended ID map is special, since we're creating a level above CONFIG_PGTABLE_LEVELS and the required function isn't available. Leave it as is and add a comment to explain it. (The normal kernel ID map code doesn't need this change because its page tables are created in assembly (__create_page_tables)). Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>