summaryrefslogtreecommitdiff
path: root/arch/arm64/kvm/hyp
AgeCommit message (Collapse)AuthorFilesLines
2021-05-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds30-212/+2012
Pull kvm updates from Paolo Bonzini: "This is a large update by KVM standards, including AMD PSP (Platform Security Processor, aka "AMD Secure Technology") and ARM CoreSight (debug and trace) changes. ARM: - CoreSight: Add support for ETE and TRBE - Stage-2 isolation for the host kernel when running in protected mode - Guest SVE support when running in nVHE mode - Force W^X hypervisor mappings in nVHE mode - ITS save/restore for guests using direct injection with GICv4.1 - nVHE panics now produce readable backtraces - Guest support for PTP using the ptp_kvm driver - Performance improvements in the S2 fault handler x86: - AMD PSP driver changes - Optimizations and cleanup of nested SVM code - AMD: Support for virtual SPEC_CTRL - Optimizations of the new MMU code: fast invalidation, zap under read lock, enable/disably dirty page logging under read lock - /dev/kvm API for AMD SEV live migration (guest API coming soon) - support SEV virtual machines sharing the same encryption context - support SGX in virtual machines - add a few more statistics - improved directed yield heuristics - Lots and lots of cleanups Generic: - Rework of MMU notifier interface, simplifying and optimizing the architecture-specific code - a handful of "Get rid of oprofile leftovers" patches - Some selftests improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (379 commits) KVM: selftests: Speed up set_memory_region_test selftests: kvm: Fix the check of return value KVM: x86: Take advantage of kvm_arch_dy_has_pending_interrupt() KVM: SVM: Skip SEV cache flush if no ASIDs have been used KVM: SVM: Remove an unnecessary prototype declaration of sev_flush_asids() KVM: SVM: Drop redundant svm_sev_enabled() helper KVM: SVM: Move SEV VMCB tracking allocation to sev.c KVM: SVM: Explicitly check max SEV ASID during sev_hardware_setup() KVM: SVM: Unconditionally invoke sev_hardware_teardown() KVM: SVM: Enable SEV/SEV-ES functionality by default (when supported) KVM: SVM: Condition sev_enabled and sev_es_enabled on CONFIG_KVM_AMD_SEV=y KVM: SVM: Append "_enabled" to module-scoped SEV/SEV-ES control variables KVM: SEV: Mask CPUID[0x8000001F].eax according to supported features KVM: SVM: Move SEV module params/variables to sev.c KVM: SVM: Disable SEV/SEV-ES if NPT is disabled KVM: SVM: Free sev_asid_bitmap during init if SEV setup fails KVM: SVM: Zero out the VMCB array used to track SEV ASID association x86/sev: Drop redundant and potentially misleading 'sev_enabled' KVM: x86: Move reverse CPUID helpers to separate header file KVM: x86: Rename GPR accessors to make mode-aware variants the defaults ...
2021-04-23Merge tag 'kvmarm-5.13' of ↵Paolo Bonzini30-212/+2012
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 5.13 New features: - Stage-2 isolation for the host kernel when running in protected mode - Guest SVE support when running in nVHE mode - Force W^X hypervisor mappings in nVHE mode - ITS save/restore for guests using direct injection with GICv4.1 - nVHE panics now produce readable backtraces - Guest support for PTP using the ptp_kvm driver - Performance improvements in the S2 fault handler - Alexandru is now a reviewer (not really a new feature...) Fixes: - Proper emulation of the GICR_TYPER register - Handle the complete set of relocation in the nVHE EL2 object - Get rid of the oprofile dependency in the PMU code (and of the oprofile body parts at the same time) - Debug and SPE fixes - Fix vcpu reset
2021-04-13Merge branch 'kvm-arm64/nvhe-wxn' into kvmarm-master/nextMarc Zyngier1-10/+2
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-13Merge branch 'kvm-arm64/nvhe-panic-info' into kvmarm-master/nextMarc Zyngier8-28/+15
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-13Merge branch 'kvm-arm64/misc-5.13' into kvmarm-master/nextMarc Zyngier1-0/+18
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-13Merge branch 'kvm-arm64/host-stage2' into kvmarm-master/nextMarc Zyngier25-168/+1940
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-09KVM: arm64: Disable CFI for nVHESami Tolvanen1-3/+3
Disable CFI for the nVHE code to avoid address space confusion. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-18-samitolvanen@google.com
2021-04-07arm64: KVM: Enable access to TRBE support for hostSuzuki K Poulose2-0/+35
For a nvhe host, the EL2 must allow the EL1&0 translation regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must be saved/restored over a trip to the guest. Also, before entering the guest, we must flush any trace data if the TRBE was enabled. And we must prohibit the generation of trace while we are in EL1 by clearing the TRFCR_EL1. For vhe, the EL2 must prevent the EL1 access to the Trace Buffer. The MDCR_EL2 bit definitions for TRBE are available here : https://developer.arm.com/documentation/ddi0601/2020-12/AArch64-Registers/ Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210405164307.1720226-8-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2021-04-07KVM: arm64: Move SPE availability check to VCPU loadSuzuki K Poulose1-13/+9
At the moment, we check the availability of SPE on the given CPU (i.e, SPE is implemented and is allowed at the host) during every guest entry. This can be optimized a bit by moving the check to vcpu_load time and recording the availability of the feature on the current CPU via a new flag. This will also be useful for adding the TRBE support. Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Alexandru Elisei <Alexandru.Elisei@arm.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210405164307.1720226-7-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
2021-04-01KVM: arm64: Log source when panicking from nVHE hypAndrew Scull4-18/+8
To aid with debugging, add details of the source of a panic from nVHE hyp. This is done by having nVHE hyp exit to nvhe_hyp_panic_handler() rather than directly to panic(). The handler will then add the extra details for debugging before panicking the kernel. If the panic was due to a BUG(), look up the metadata to log the file and line, if available, otherwise log an address that can be looked up in vmlinux. The hyp offset is also logged to allow other hyp VAs to be converted, similar to how the kernel offset is logged during a panic. __hyp_panic_string is now inlined since it no longer needs to be referenced as a symbol and the message is free to diverge between VHE and nVHE. The following is an example of the logs generated by a BUG in nVHE hyp. [ 46.754840] kvm [307]: nVHE hyp BUG at: arch/arm64/kvm/hyp/nvhe/switch.c:242! [ 46.755357] kvm [307]: Hyp Offset: 0xfffea6c58e1e0000 [ 46.755824] Kernel panic - not syncing: HYP panic: [ 46.755824] PS:400003c9 PC:0000d93a82c705ac ESR:f2000800 [ 46.755824] FAR:0000000080080000 HPFAR:0000000000800800 PAR:0000000000000000 [ 46.755824] VCPU:0000d93a880d0000 [ 46.756960] CPU: 3 PID: 307 Comm: kvm-vcpu-0 Not tainted 5.12.0-rc3-00005-gc572b99cf65b-dirty #133 [ 46.757459] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 [ 46.758366] Call trace: [ 46.758601] dump_backtrace+0x0/0x1b0 [ 46.758856] show_stack+0x18/0x70 [ 46.759057] dump_stack+0xd0/0x12c [ 46.759236] panic+0x16c/0x334 [ 46.759426] arm64_kernel_unmapped_at_el0+0x0/0x30 [ 46.759661] kvm_arch_vcpu_ioctl_run+0x134/0x750 [ 46.759936] kvm_vcpu_ioctl+0x2f0/0x970 [ 46.760156] __arm64_sys_ioctl+0xa8/0xec [ 46.760379] el0_svc_common.constprop.0+0x60/0x120 [ 46.760627] do_el0_svc+0x24/0x90 [ 46.760766] el0_svc+0x2c/0x54 [ 46.760915] el0_sync_handler+0x1a4/0x1b0 [ 46.761146] el0_sync+0x170/0x180 [ 46.761889] SMP: stopping secondary CPUs [ 46.762786] Kernel Offset: 0x3e1cd2820000 from 0xffff800010000000 [ 46.763142] PHYS_OFFSET: 0xffffa9f680000000 [ 46.763359] CPU features: 0x00240022,61806008 [ 46.763651] Memory Limit: none [ 46.813867] ---[ end Kernel panic - not syncing: HYP panic: [ 46.813867] PS:400003c9 PC:0000d93a82c705ac ESR:f2000800 [ 46.813867] FAR:0000000080080000 HPFAR:0000000000800800 PAR:0000000000000000 [ 46.813867] VCPU:0000d93a880d0000 ]--- Signed-off-by: Andrew Scull <ascull@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210318143311.839894-6-ascull@google.com
2021-04-01KVM: arm64: Use BUG and BUG_ON in nVHE hypAndrew Scull2-5/+3
hyp_panic() reports the address of the panic by using ELR_EL2, but this isn't a useful address when hyp_panic() is called directly. Replace such direct calls with BUG() and BUG_ON() which use BRK to trigger an exception that then goes to hyp_panic() with the correct address. Also remove the hyp_panic() declaration from the header file to avoid accidental misuse. Signed-off-by: Andrew Scull <ascull@google.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210318143311.839894-5-ascull@google.com
2021-03-31KVM: arm64: Support PREL/PLT relocs in EL2 codeDavid Brazdil1-0/+18
gen-hyprel tool parses object files of the EL2 portion of KVM and generates runtime relocation data. While only filtering for R_AARCH64_ABS64 relocations in the input object files, it has an allow-list of relocation types that are used for relative addressing. Other, unexpected, relocation types are rejected and cause the build to fail. This allow-list did not include the position-relative relocation types R_AARCH64_PREL64/32/16 and the recently introduced _PLT32. While not seen used by toolchains in the wild, add them to the allow-list for completeness. Fixes: 8c49b5d43d4c ("KVM: arm64: Generate hyp relocation data") Cc: <stable@vger.kernel.org> Reported-by: Will Deacon <will@kernel.org> Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210331133048.63311-1-dbrazdil@google.com
2021-03-25KVM: arm64: Drop the CPU_FTR_REG_HYP_COPY infrastructureMarc Zyngier2-15/+9
Now that the read_ctr macro has been specialised for nVHE, the whole CPU_FTR_REG_HYP_COPY infrastrcture looks completely overengineered. Simplify it by populating the two u64 quantities (MMFR0 and 1) that the hypervisor need. Reviewed-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-03-24KVM: arm64: Fix CPU interface MMIO compatibility detectionMarc Zyngier1-0/+9
In order to detect whether a GICv3 CPU interface is MMIO capable, we switch ICC_SRE_EL1.SRE to 0 and check whether it sticks. However, this is only possible if *ALL* of the HCR_EL2 interrupt overrides are set, and the CPU is perfectly allowed to ignore the write to ICC_SRE_EL1 otherwise. This leads KVM to pretend that a whole bunch of ARMv8.0 CPUs aren't MMIO-capable, and breaks VMs that should work correctly otherwise. Fix this by setting IMO/FMO/IMO before touching ICC_SRE_EL1, and clear them afterwards. This allows us to reliably detect the CPU interface capabilities. Tested-by: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> Fixes: 9739f6ef053f ("KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility") Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-03-19KVM: arm64: Protect the .hyp sections from the hostQuentin Perret3-0/+44
When KVM runs in nVHE protected mode, use the host stage 2 to unmap the hypervisor sections by marking them as owned by the hypervisor itself. The long-term goal is to ensure the EL2 code can remain robust regardless of the host's state, so this starts by making sure the host cannot e.g. write to the .hyp sections directly. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-39-qperret@google.com
2021-03-19KVM: arm64: Wrap the host with a stage 2Quentin Perret8-7/+302
When KVM runs in protected nVHE mode, make use of a stage 2 page-table to give the hypervisor some control over the host memory accesses. The host stage 2 is created lazily using large block mappings if possible, and will default to page mappings in absence of a better solution. >From this point on, memory accesses from the host to protected memory regions (e.g. not 'owned' by the host) are fatal and lead to hyp_panic(). Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-36-qperret@google.com
2021-03-19KVM: arm64: Provide sanitized mmfr* registers at EL2Quentin Perret1-0/+2
We will need to read sanitized values of mmfr{0,1}_el1 at EL2 soon, so add them to the list of copied variables. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-35-qperret@google.com
2021-03-19KVM: arm64: Introduce KVM_PGTABLE_S2_IDMAP stage 2 flagQuentin Perret1-0/+3
Introduce a new stage 2 configuration flag to specify that all mappings in a given page-table will be identity-mapped, as will be the case for the host. This allows to introduce sanity checks in the map path and to avoid programming errors. Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-34-qperret@google.com
2021-03-19KVM: arm64: Introduce KVM_PGTABLE_S2_NOFWB stage 2 flagQuentin Perret1-25/+31
In order to further configure stage 2 page-tables, pass flags to the init function using a new enum. The first of these flags allows to disable FWB even if the hardware supports it as we will need to do so for the host stage 2. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-33-qperret@google.com
2021-03-19KVM: arm64: Add kvm_pgtable_stage2_find_range()Quentin Perret1-4/+85
Since the host stage 2 will be identity mapped, and since it will own most of memory, it would preferable for performance to try and use large block mappings whenever that is possible. To ease this, introduce a new helper in the KVM page-table code which allows to search for large ranges of available IPA space. This will be used in the host memory abort path to greedily idmap large portion of the PA space. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-32-qperret@google.com
2021-03-19KVM: arm64: Refactor the *_map_set_prot_attr() helpersQuentin Perret1-8/+8
In order to ease their re-use in other code paths, refactor the *_map_set_prot_attr() helpers to not depend on a map_data struct. No functional change intended. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-31-qperret@google.com
2021-03-19KVM: arm64: Use page-table to track page ownershipQuentin Perret1-24/+102
As the host stage 2 will be identity mapped, all the .hyp memory regions and/or memory pages donated to protected guestis will have to marked invalid in the host stage 2 page-table. At the same time, the hypervisor will need a way to track the ownership of each physical page to ensure memory sharing or donation between entities (host, guests, hypervisor) is legal. In order to enable this tracking at EL2, let's use the host stage 2 page-table itself. The idea is to use the top bits of invalid mappings to store the unique identifier of the page owner. The page-table owner (the host) gets identifier 0 such that, at boot time, it owns the entire IPA space as the pgd starts zeroed. Provide kvm_pgtable_stage2_set_owner() which allows to modify the ownership of pages in the host stage 2. It re-uses most of the map() logic, but ends up creating invalid mappings instead. This impacts how we do refcount as we now need to count invalid mappings when they are used for ownership tracking. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-30-qperret@google.com
2021-03-19KVM: arm64: Always zero invalid PTEsQuentin Perret1-10/+16
kvm_set_invalid_pte() currently only clears bit 0 from a PTE because stage2_map_walk_table_post() needs to be able to follow the anchor. In preparation for re-using bits 63-01 from invalid PTEs, make sure to zero it entirely by ensuring to cache the anchor's child upfront. Acked-by: Will Deacon <will@kernel.org> Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-29-qperret@google.com
2021-03-19KVM: arm64: Sort the hypervisor memblocksQuentin Perret1-0/+19
We will soon need to check if a Physical Address belongs to a memblock at EL2, so make sure to sort them so this can be done efficiently. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-28-qperret@google.com
2021-03-19KVM: arm64: Reserve memory for host stage 2Quentin Perret3-1/+40
Extend the memory pool allocated for the hypervisor to include enough pages to map all of memory at page granularity for the host stage 2. While at it, also reserve some memory for device mappings. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-27-qperret@google.com
2021-03-19KVM: arm64: Make memcache anonymous in pgtable allocatorQuentin Perret1-2/+2
The current stage2 page-table allocator uses a memcache to get pre-allocated pages when it needs any. To allow re-using this code at EL2 which uses a concept of memory pools, make the memcache argument of kvm_pgtable_stage2_map() anonymous, and let the mm_ops zalloc_page() callbacks use it the way they need to. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-26-qperret@google.com
2021-03-19KVM: arm64: Refactor __populate_fault_info()Quentin Perret1-11/+17
Refactor __populate_fault_info() to introduce __get_fault_info() which will be used once the host is wrapped in a stage 2. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-25-qperret@google.com
2021-03-19KVM: arm64: Refactor kvm_arm_setup_stage2()Quentin Perret1-0/+32
In order to re-use some of the stage 2 setup code at EL2, factor parts of kvm_arm_setup_stage2() out into separate functions. No functional change intended. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-23-qperret@google.com
2021-03-19KVM: arm64: Set host stage 2 using kvm_nvhe_init_paramsQuentin Perret2-9/+10
Move the registers relevant to host stage 2 enablement to kvm_nvhe_init_params to prepare the ground for enabling it in later patches. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-22-qperret@google.com
2021-03-19KVM: arm64: Use kvm_arch for stage 2 pgtableQuentin Perret1-3/+3
In order to make use of the stage 2 pgtable code for the host stage 2, use struct kvm_arch in lieu of struct kvm as the host will have the former but not the latter. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-20-qperret@google.com
2021-03-19KVM: arm64: Prepare the creation of s1 mappings at EL2Quentin Perret9-5/+612
When memory protection is enabled, the EL2 code needs the ability to create and manage its own page-table. To do so, introduce a new set of hypercalls to bootstrap a memory management system at EL2. This leads to the following boot flow in nVHE Protected mode: 1. the host allocates memory for the hypervisor very early on, using the memblock API; 2. the host creates a set of stage 1 page-table for EL2, installs the EL2 vectors, and issues the __pkvm_init hypercall; 3. during __pkvm_init, the hypervisor re-creates its stage 1 page-table and stores it in the memory pool provided by the host; 4. the hypervisor then extends its stage 1 mappings to include a vmemmap in the EL2 VA space, hence allowing to use the buddy allocator introduced in a previous patch; 5. the hypervisor jumps back in the idmap page, switches from the host-provided page-table to the new one, and wraps up its initialization by enabling the new allocator, before returning to the host. 6. the host can free the now unused page-table created for EL2, and will now need to issue hypercalls to make changes to the EL2 stage 1 mappings instead of modifying them directly. Note that for the sake of simplifying the review, this patch focuses on the hypervisor side of things. In other words, this only implements the new hypercalls, but does not make use of them from the host yet. The host-side changes will follow in a subsequent patch. Credits to Will for __pkvm_init_switch_pgd. Acked-by: Will Deacon <will@kernel.org> Co-authored-by: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-18-qperret@google.com
2021-03-19KVM: arm64: Provide __flush_dcache_area at EL2Quentin Perret3-1/+21
We will need to do cache maintenance at EL2 soon, so compile a copy of __flush_dcache_area at EL2, and provide a copy of arm64_ftr_reg_ctrel0 as it is needed by the read_ctr macro. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-15-qperret@google.com
2021-03-19KVM: arm64: Introduce a Hyp buddy page allocatorQuentin Perret4-1/+292
When memory protection is enabled, the hyp code will require a basic form of memory management in order to allocate and free memory pages at EL2. This is needed for various use-cases, including the creation of hyp mappings or the allocation of stage 2 page tables. To address these use-case, introduce a simple memory allocator in the hyp code. The allocator is designed as a conventional 'buddy allocator', working with a page granularity. It allows to allocate and free physically contiguous pages from memory 'pools', with a guaranteed order alignment in the PA space. Each page in a memory pool is associated with a struct hyp_page which holds the page's metadata, including its refcount, as well as its current order, hence mimicking the kernel's buddy system in the GFP infrastructure. The hyp_page metadata are made accessible through a hyp_vmemmap, following the concept of SPARSE_VMEMMAP in the kernel. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-13-qperret@google.com
2021-03-19KVM: arm64: Stub CONFIG_DEBUG_LIST at HypQuentin Perret2-1/+23
In order to use the kernel list library at EL2, introduce stubs for the CONFIG_DEBUG_LIST out-of-lines calls. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-12-qperret@google.com
2021-03-19KVM: arm64: Introduce an early Hyp page allocatorQuentin Perret5-4/+94
With nVHE, the host currently creates all stage 1 hypervisor mappings at EL1 during boot, installs them at EL2, and extends them as required (e.g. when creating a new VM). But in a world where the host is no longer trusted, it cannot have full control over the code mapped in the hypervisor. In preparation for enabling the hypervisor to create its own stage 1 mappings during boot, introduce an early page allocator, with minimal functionality. This allocator is designed to be used only during early bootstrap of the hyp code when memory protection is enabled, which will then switch to using a full-fledged page allocator after init. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-11-qperret@google.com
2021-03-19KVM: arm64: Introduce a BSS section for use at HypQuentin Perret1-0/+1
Currently, the hyp code cannot make full use of a bss, as the kernel section is mapped read-only. While this mapping could simply be changed to read-write, it would intermingle even more the hyp and kernel state than they currently are. Instead, introduce a __hyp_bss section, that uses reserved pages, and create the appropriate RW hyp mappings during KVM init. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-8-qperret@google.com
2021-03-19KVM: arm64: Factor memory allocation out of pgtable.cQuentin Perret1-38/+60
In preparation for enabling the creation of page-tables at EL2, factor all memory allocation out of the page-table code, hence making it re-usable with any compatible memory allocator. No functional changes intended. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-7-qperret@google.com
2021-03-19KVM: arm64: Avoid free_page() in page-table allocatorQuentin Perret1-5/+5
Currently, the KVM page-table allocator uses a mix of put_page() and free_page() calls depending on the context even though page-allocation is always achieved using variants of __get_free_page(). Make the code consistent by using put_page() throughout, and reduce the memory management API surface used by the page-table code. This will ease factoring out page-allocation from pgtable.c, which is a pre-requisite to creating page-tables at EL2. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-6-qperret@google.com
2021-03-19arm64: kvm: Add standalone ticket spinlock implementation for use at hypWill Deacon1-0/+92
We will soon need to synchronise multiple CPUs in the hyp text at EL2. The qspinlock-based locking used by the host is overkill for this purpose and relies on the kernel's "percpu" implementation for the MCS nodes. Implement a simple ticket locking scheme based heavily on the code removed by commit c11090474d70 ("arm64: locking: Replace ticket lock implementation with qspinlock"). Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-4-qperret@google.com
2021-03-19KVM: arm64: Link position-independent string routines into .hyp.textWill Deacon1-0/+4
Pull clear_page(), copy_page(), memcpy() and memset() into the nVHE hyp code and ensure that we always execute the '__pi_' entry point on the offchance that it changes in future. [ qperret: Commit title nits and added linker script alias ] Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-3-qperret@google.com
2021-03-19Merge tag 'v5.12-rc3' into kvm-arm64/host-stage2Marc Zyngier10-28/+89
Linux 5.12-rc3 Signed-off-by: Marc Zyngier <maz@kernel.org> # gpg: Signature made Sun 14 Mar 2021 21:41:02 GMT # gpg: using RSA key ABAF11C65A2970B130ABE3C479BE3E4300411886 # gpg: issuer "torvalds@linux-foundation.org" # gpg: Can't check signature: No public key
2021-03-18KVM: arm64: Fix host's ZCR_EL2 restore on nVHEMarc Zyngier1-1/+2
We re-enter the EL1 host with CPTR_EL2.TZ set in order to be able to lazily restore ZCR_EL2 when required. However, the same CPTR_EL2 configuration also leads to trapping when ZCR_EL2 is accessed from EL2. Duh! Clear CPTR_EL2.TZ *before* writing to ZCR_EL2. Fixes: beed09067b42 ("KVM: arm64: Trap host SVE accesses when the FPSIMD state is dirty") Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-03-18KVM: arm64: Turn SCTLR_ELx_FLAGS into INIT_SCTLR_EL2_MMU_ONMarc Zyngier1-7/+1
Only the nVHE EL2 code is using this define, so let's make it plain that it is EL2 only, and refactor it to contain all the bits we need when configuring the EL2 MMU, and only those. Acked-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-03-18KVM: arm64: Use INIT_SCTLR_EL2_MMU_OFF to disable the MMU on KVM teardownMarc Zyngier1-3/+1
Instead of doing a RMW on SCTLR_EL2 to disable the MMU, use the existing define that loads the right set of bits. Acked-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-03-18KVM: arm64: Save/restore SVE state for nVHEMarc Zyngier2-26/+15
Implement the SVE save/restore for nVHE, following a similar logic to that of the VHE implementation: - the SVE state is switched on trap from EL1 to EL2 - no further changes to ZCR_EL2 occur as long as the guest isn't preempted or exit to userspace - ZCR_EL2 is reset to its default value on the first SVE access from the host EL1, and ZCR_EL1 restored to the default guest value in vcpu_put() Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-03-18KVM: arm64: Trap host SVE accesses when the FPSIMD state is dirtyMarc Zyngier2-2/+11
ZCR_EL2 controls the upper bound for ZCR_EL1, and is set to a potentially lower limit when the guest uses SVE. In order to restore the SVE state on the EL1 host, we must first reset ZCR_EL2 to its original value. To make it as lazy as possible on the EL1 host side, set the SVE trapping in place when exiting from the guest. On the first EL1 access to SVE, ZCR_EL2 will be restored to its full glory. Suggested-by: Andrew Scull <ascull@google.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-03-18KVM: arm64: Rework SVE host-save/guest-restoreMarc Zyngier2-17/+25
In order to keep the code readable, move the host-save/guest-restore sequences in their own functions, with the following changes: - the hypervisor ZCR is now set from C code - ZCR_EL2 is always used as the EL2 accessor This results in some minor assembler macro rework. No functional change intended. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-03-18KVM: arm64: Introduce vcpu_sve_vq() helperMarc Zyngier1-1/+1
The KVM code contains a number of "sve_vq_from_vl(vcpu->arch.sve_max_vl)" instances, and we are about to add more. Introduce vcpu_sve_vq() as a shorthand for this expression. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-03-18KVM: arm64: Use {read,write}_sysreg_el1 to access ZCR_EL1Marc Zyngier1-1/+1
Switch to the unified EL1 accessors for ZCR_EL1, which will make things easier for nVHE support. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-03-18KVM: arm64: Provide KVM's own save/restore SVE primitivesMarc Zyngier2-5/+15
as we are about to change the way KVM deals with SVE, provide KVM with its own save/restore SVE primitives. No functional change intended. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>