summaryrefslogtreecommitdiff
path: root/arch/x86/kvm/svm
AgeCommit message (Collapse)AuthorFilesLines
2021-05-29Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds3-8/+4
Pull KVM fixes from Paolo Bonzini: "ARM fixes: - Another state update on exit to userspace fix - Prevent the creation of mixed 32/64 VMs - Fix regression with irqbypass not restarting the guest on failed connect - Fix regression with debug register decoding resulting in overlapping access - Commit exception state on exit to usrspace - Fix the MMU notifier return values - Add missing 'static' qualifiers in the new host stage-2 code x86 fixes: - fix guest missed wakeup with assigned devices - fix WARN reported by syzkaller - do not use BIT() in UAPI headers - make the kvm_amd.avic parameter bool PPC fixes: - make halt polling heuristics consistent with other architectures selftests: - various fixes - new performance selftest memslot_perf_test - test UFFD minor faults in demand_paging_test" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (44 commits) selftests: kvm: fix overlapping addresses in memslot_perf_test KVM: X86: Kill off ctxt->ud KVM: X86: Fix warning caused by stale emulation context KVM: X86: Use kvm_get_linear_rip() in single-step and #DB/#BP interception KVM: x86/mmu: Fix comment mentioning skip_4k KVM: VMX: update vcpu posted-interrupt descriptor when assigning device KVM: rename KVM_REQ_PENDING_TIMER to KVM_REQ_UNBLOCK KVM: x86: add start_assignment hook to kvm_x86_ops KVM: LAPIC: Narrow the timer latency between wait_lapic_expire and world switch selftests: kvm: do only 1 memslot_perf_test run by default KVM: X86: Use _BITUL() macro in UAPI headers KVM: selftests: add shared hugetlbfs backing source type KVM: selftests: allow using UFFD minor faults for demand paging KVM: selftests: create alias mappings when using shared memory KVM: selftests: add shmem backing source type KVM: selftests: refactor vm_mem_backing_src_type flags KVM: selftests: allow different backing source types KVM: selftests: compute correct demand paging size KVM: selftests: simplify setup_demand_paging error handling KVM: selftests: Print a message if /dev/kvm is missing ...
2021-05-24KVM: SVM: make the avic parameter a boolPaolo Bonzini2-3/+3
Make it consistent with kvm_intel.enable_apicv. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-24KVM: SVM: Drop unneeded CONFIG_X86_LOCAL_APIC checkVitaly Kuznetsov2-5/+1
AVIC dependency on CONFIG_X86_LOCAL_APIC is dead code since commit e42eef4ba388 ("KVM: add X86_LOCAL_APIC dependency"). Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210518144339.1987982-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com>
2021-05-17Merge tag 'kvmarm-fixes-5.13-1' of ↵Paolo Bonzini1-37/+2
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 5.13, take #1 - Fix regression with irqbypass not restarting the guest on failed connect - Fix regression with debug register decoding resulting in overlapping access - Commit exception state on exit to usrspace - Fix the MMU notifier return values - Add missing 'static' qualifiers in the new host stage-2 code
2021-05-16Merge tag 'x86_urgent_for_v5.13_rc2' of ↵Linus Torvalds2-36/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: "The three SEV commits are not really urgent material. But we figured since getting them in now will avoid a huge amount of conflicts between future SEV changes touching tip, the kvm and probably other trees, sending them to you now would be best. The idea is that the tip, kvm etc branches for 5.14 will all base ontop of -rc2 and thus everything will be peachy. What is more, those changes are purely mechanical and defines movement so they should be fine to go now (famous last words). Summary: - Enable -Wundef for the compressed kernel build stage - Reorganize SEV code to streamline and simplify future development" * tag 'x86_urgent_for_v5.13_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot/compressed: Enable -Wundef x86/msr: Rename MSR_K8_SYSCFG to MSR_AMD64_SYSCFG x86/sev: Move GHCB MSR protocol and NAE definitions in a common header x86/sev-es: Rename sev-es.{ch} to sev.{ch}
2021-05-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds4-56/+62
Pull kvm fixes from Paolo Bonzini: - Lots of bug fixes. - Fix virtualization of RDPID - Virtualization of DR6_BUS_LOCK, which on bare metal is new to this release - More nested virtualization migration fixes (nSVM and eVMCS) - Fix for KVM guest hibernation - Fix for warning in SEV-ES SRCU usage - Block KVM from loading on AMD machines with 5-level page tables, due to the APM not mentioning how host CR4.LA57 exactly impacts the guest. * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (48 commits) KVM: SVM: Move GHCB unmapping to fix RCU warning KVM: SVM: Invert user pointer casting in SEV {en,de}crypt helpers kvm: Cap halt polling at kvm->max_halt_poll_ns tools/kvm_stat: Fix documentation typo KVM: x86: Prevent deadlock against tk_core.seq KVM: x86: Cancel pvclock_gtod_work on module removal KVM: x86: Prevent KVM SVM from loading on kernels with 5-level paging KVM: X86: Expose bus lock debug exception to guest KVM: X86: Add support for the emulation of DR6_BUS_LOCK bit KVM: PPC: Book3S HV: Fix conversion to gfn-based MMU notifier callbacks KVM: x86: Hide RDTSCP and RDPID if MSR_TSC_AUX probing failed KVM: x86: Tie Intel and AMD behavior for MSR_TSC_AUX to guest CPU model KVM: x86: Move uret MSR slot management to common x86 KVM: x86: Export the number of uret MSRs to vendor modules KVM: VMX: Disable loading of TSX_CTRL MSR the more conventional way KVM: VMX: Use common x86's uret MSR list as the one true list KVM: VMX: Use flag to indicate "active" uret MSRs instead of sorting list KVM: VMX: Configure list of user return MSRs at module init KVM: x86: Add support for RDPID without RDTSCP KVM: SVM: Probe and load MSR_TSC_AUX regardless of RDTSCP support in host ...
2021-05-10x86/msr: Rename MSR_K8_SYSCFG to MSR_AMD64_SYSCFGBrijesh Singh1-2/+2
The SYSCFG MSR continued being updated beyond the K8 family; drop the K8 name from it. Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/20210427111636.1207-4-brijesh.singh@amd.com
2021-05-10x86/sev: Move GHCB MSR protocol and NAE definitions in a common headerBrijesh Singh1-34/+4
The guest and the hypervisor contain separate macros to get and set the GHCB MSR protocol and NAE event fields. Consolidate the GHCB protocol definitions and helper macros in one place. Leave the supported protocol version define in separate files to keep the guest and hypervisor flexibility to support different GHCB version in the same release. There is no functional change intended. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/20210427111636.1207-3-brijesh.singh@amd.com
2021-05-07KVM: SVM: Move GHCB unmapping to fix RCU warningTom Lendacky3-4/+5
When an SEV-ES guest is running, the GHCB is unmapped as part of the vCPU run support. However, kvm_vcpu_unmap() triggers an RCU dereference warning with CONFIG_PROVE_LOCKING=y because the SRCU lock is released before invoking the vCPU run support. Move the GHCB unmapping into the prepare_guest_switch callback, which is invoked while still holding the SRCU lock, eliminating the RCU dereference warning. Fixes: 291bd20d5d88 ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT") Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <b2f9b79d15166f2c3e4375c0d9bc3268b7696455.1620332081.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-07KVM: SVM: Invert user pointer casting in SEV {en,de}crypt helpersSean Christopherson1-13/+11
Invert the user pointer params for SEV's helpers for encrypting and decrypting guest memory so that they take a pointer and cast to an unsigned long as necessary, as opposed to doing the opposite. Tagging a non-pointer as __user is confusing and weird since a cast of some form needs to occur to actually access the user data. This also fixes Sparse warnings triggered by directly consuming the unsigned longs, which are "noderef" due to the __user tag. Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Ashish Kalra <ashish.kalra@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210506231542.2331138-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-07KVM: x86: Prevent KVM SVM from loading on kernels with 5-level pagingSean Christopherson1-0/+5
Disallow loading KVM SVM if 5-level paging is supported. In theory, NPT for L1 should simply work, but there unknowns with respect to how the guest's MAXPHYADDR will be handled by hardware. Nested NPT is more problematic, as running an L1 VMM that is using 2-level page tables requires stacking single-entry PDP and PML4 tables in KVM's NPT for L2, as there are no equivalent entries in L1's NPT to shadow. Barring hardware magic, for 5-level paging, KVM would need stack another layer to handle PML5. Opportunistically rename the lm_root pointer, which is used for the aforementioned stacking when shadowing 2-level L1 NPT, to pml4_root to call out that it's specifically for PML4. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210505204221.1934471-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-07KVM: x86: Tie Intel and AMD behavior for MSR_TSC_AUX to guest CPU modelSean Christopherson1-24/+0
Squish the Intel and AMD emulation of MSR_TSC_AUX together and tie it to the guest CPU model instead of the host CPU behavior. While not strictly necessary to avoid guest breakage, emulating cross-vendor "architecture" will provide consistent behavior for the guest, e.g. WRMSR fault behavior won't change if the vCPU is migrated to a host with divergent behavior. Note, the "new" kvm_is_supported_user_return_msr() checks do not add new functionality on either SVM or VMX. On SVM, the equivalent was "tsc_aux_uret_slot < 0", and on VMX the check was buried in the vmx_find_uret_msr() call at the find_uret_msr label. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-15-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-07KVM: x86: Move uret MSR slot management to common x86Sean Christopherson1-4/+1
Now that SVM and VMX both probe MSRs before "defining" user return slots for them, consolidate the code for probe+define into common x86 and eliminate the odd behavior of having the vendor code define the slot for a given MSR. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-14-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-07KVM: x86: Add support for RDPID without RDTSCPSean Christopherson1-2/+4
Allow userspace to enable RDPID for a guest without also enabling RDTSCP. Aside from checking for RDPID support in the obvious flows, VMX also needs to set ENABLE_RDTSCP=1 when RDPID is exposed. For the record, there is no known scenario where enabling RDPID without RDTSCP is desirable. But, both AMD and Intel architectures allow for the condition, i.e. this is purely to make KVM more architecturally accurate. Fixes: 41cd02c6f7f6 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID") Cc: stable@vger.kernel.org Reported-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-07KVM: SVM: Probe and load MSR_TSC_AUX regardless of RDTSCP support in hostSean Christopherson1-8/+10
Probe MSR_TSC_AUX whether or not RDTSCP is supported in the host, and if probing succeeds, load the guest's MSR_TSC_AUX into hardware prior to VMRUN. Because SVM doesn't support interception of RDPID, RDPID cannot be disallowed in the guest (without resorting to binary translation). Leaving the host's MSR_TSC_AUX in hardware would leak the host's value to the guest if RDTSCP is not supported. Note, there is also a kernel bug that prevents leaking the host's value. The host kernel initializes MSR_TSC_AUX if and only if RDTSCP is supported, even though the vDSO usage consumes MSR_TSC_AUX via RDPID. I.e. if RDTSCP is not supported, there is no host value to leak. But, if/when the host kernel bug is fixed, KVM would start leaking MSR_TSC_AUX in the case where hardware supports RDPID but RDTSCP is unavailable for whatever reason. Probing MSR_TSC_AUX will also allow consolidating the probe and define logic in common x86, and will make it simpler to condition the existence of MSR_TSX_AUX (from the guest's perspective) on RDTSCP *or* RDPID. Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-07KVM: SVM: Inject #UD on RDTSCP when it should be disabled in the guestSean Christopherson1-4/+13
Intercept RDTSCP to inject #UD if RDTSC is disabled in the guest. Note, SVM does not support intercepting RDPID. Unlike VMX's ENABLE_RDTSCP control, RDTSCP interception does not apply to RDPID. This is a benign virtualization hole as the host kernel (incorrectly) sets MSR_TSC_AUX if RDTSCP is supported, and KVM loads the guest's MSR_TSC_AUX into hardware if RDTSCP is supported in the host, i.e. KVM will not leak the host's MSR_TSC_AUX to the guest. But, when the kernel bug is fixed, KVM will start leaking the host's MSR_TSC_AUX if RDPID is supported in hardware, but RDTSCP isn't available for whatever reason. This leak will be remedied in a future commit. Fixes: 46896c73c1a4 ("KVM: svm: add support for RDTSCP") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210504171734.1434054-4-seanjc@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-07KVM: nSVM: remove a warning about vmcb01 VM exit reasonMaxim Levitsky1-1/+0
While in most cases, when returning to use the VMCB01, the exit reason stored in it will be SVM_EXIT_VMRUN, on first VM exit after a nested migration this field can contain anything since the VM entry did happen before the migration. Remove this warning to avoid the false positive. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210504143936.1644378-3-mlevitsk@redhat.com> Fixes: 9a7de6ecc3ed ("KVM: nSVM: If VMRUN is single-stepped, queue the #DB intercept in nested_svm_vmexit()") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-07KVM: nSVM: always restore the L1's GIF on migrationMaxim Levitsky1-0/+2
While usually the L1's GIF is set while L2 runs, and usually migration nested state is loaded after a vCPU reset which also sets L1's GIF to true, this is not guaranteed. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210504143936.1644378-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-05KVM: x86: Consolidate guest enter/exit logic to common helpersSean Christopherson1-37/+2
Move the enter/exit logic in {svm,vmx}_vcpu_enter_exit() to common helpers. Opportunistically update the somewhat stale comment about the updates needing to occur immediately after VM-Exit. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210505002735.1684165-9-seanjc@google.com
2021-05-05KVM: x86: Defer vtime accounting 'til after IRQ handlingWanpeng Li1-3/+3
Defer the call to account guest time until after servicing any IRQ(s) that happened in the guest or immediately after VM-Exit. Tick-based accounting of vCPU time relies on PF_VCPU being set when the tick IRQ handler runs, and IRQs are blocked throughout the main sequence of vcpu_enter_guest(), including the call into vendor code to actually enter and exit the guest. This fixes a bug where reported guest time remains '0', even when running an infinite loop in the guest: https://bugzilla.kernel.org/show_bug.cgi?id=209831 Fixes: 87fa7f3e98a131 ("x86/kvm: Move context tracking where it belongs") Suggested-by: Thomas Gleixner <tglx@linutronix.de> Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210505002735.1684165-4-seanjc@google.com
2021-05-03KVM: x86: Fix potential fput on a null source_kvm_fileColin Ian King1-1/+2
The fget can potentially return null, so the fput on the error return path can cause a null pointer dereference. Fix this by checking for a null source_kvm_file before doing a fput. Addresses-Coverity: ("Dereference null return") Fixes: 54526d1fd593 ("KVM: x86: Support KVM VMs sharing SEV context") Signed-off-by: Colin Ian King <colin.king@canonical.com> Message-Id: <20210430170303.131924-1-colin.king@canonical.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-03KVM: nSVM: leave the guest mode prior to loading a nested stateMaxim Levitsky1-2/+5
This allows the KVM to load the nested state more than once without warnings. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210503125446.1353307-4-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-03KVM: nSVM: fix few bugs in the vmcb02 caching logicMaxim Levitsky2-2/+13
* Define and use an invalid GPA (all ones) for init value of last and current nested vmcb physical addresses. * Reset the current vmcb12 gpa to the invalid value when leaving the nested mode, similar to what is done on nested vmexit. * Reset the last seen vmcb12 address when disabling the nested SVM, as it relies on vmcb02 fields which are freed at that point. Fixes: 4995a3685f1b ("KVM: SVM: Use a separate vmcb for the nested L2 guest") Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210503125446.1353307-3-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-03KVM: nSVM: fix a typo in svm_leave_nestedMaxim Levitsky1-1/+1
When forcibly leaving the nested mode, we should switch to vmcb01 Fixes: 4995a3685f1b ("KVM: SVM: Use a separate vmcb for the nested L2 guest") Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210503125446.1353307-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds6-1092/+1672
Pull kvm updates from Paolo Bonzini: "This is a large update by KVM standards, including AMD PSP (Platform Security Processor, aka "AMD Secure Technology") and ARM CoreSight (debug and trace) changes. ARM: - CoreSight: Add support for ETE and TRBE - Stage-2 isolation for the host kernel when running in protected mode - Guest SVE support when running in nVHE mode - Force W^X hypervisor mappings in nVHE mode - ITS save/restore for guests using direct injection with GICv4.1 - nVHE panics now produce readable backtraces - Guest support for PTP using the ptp_kvm driver - Performance improvements in the S2 fault handler x86: - AMD PSP driver changes - Optimizations and cleanup of nested SVM code - AMD: Support for virtual SPEC_CTRL - Optimizations of the new MMU code: fast invalidation, zap under read lock, enable/disably dirty page logging under read lock - /dev/kvm API for AMD SEV live migration (guest API coming soon) - support SEV virtual machines sharing the same encryption context - support SGX in virtual machines - add a few more statistics - improved directed yield heuristics - Lots and lots of cleanups Generic: - Rework of MMU notifier interface, simplifying and optimizing the architecture-specific code - a handful of "Get rid of oprofile leftovers" patches - Some selftests improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (379 commits) KVM: selftests: Speed up set_memory_region_test selftests: kvm: Fix the check of return value KVM: x86: Take advantage of kvm_arch_dy_has_pending_interrupt() KVM: SVM: Skip SEV cache flush if no ASIDs have been used KVM: SVM: Remove an unnecessary prototype declaration of sev_flush_asids() KVM: SVM: Drop redundant svm_sev_enabled() helper KVM: SVM: Move SEV VMCB tracking allocation to sev.c KVM: SVM: Explicitly check max SEV ASID during sev_hardware_setup() KVM: SVM: Unconditionally invoke sev_hardware_teardown() KVM: SVM: Enable SEV/SEV-ES functionality by default (when supported) KVM: SVM: Condition sev_enabled and sev_es_enabled on CONFIG_KVM_AMD_SEV=y KVM: SVM: Append "_enabled" to module-scoped SEV/SEV-ES control variables KVM: SEV: Mask CPUID[0x8000001F].eax according to supported features KVM: SVM: Move SEV module params/variables to sev.c KVM: SVM: Disable SEV/SEV-ES if NPT is disabled KVM: SVM: Free sev_asid_bitmap during init if SEV setup fails KVM: SVM: Zero out the VMCB array used to track SEV ASID association x86/sev: Drop redundant and potentially misleading 'sev_enabled' KVM: x86: Move reverse CPUID helpers to separate header file KVM: x86: Rename GPR accessors to make mode-aware variants the defaults ...
2021-04-28Merge branch 'for-5.13' of ↵Linus Torvalds2-10/+61
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup changes from Tejun Heo: "The only notable change is Vipin's new misc cgroup controller. This implements generic support for resources which can be controlled by simply counting and limiting the number of resource instances - ie there's X number of these on the system and this cgroup subtree can have upto Y of those. The first user is the address space IDs used for virtual machine memory encryption and expected future usages are similar - niche hardware features with concrete resource limits and simple usage models" * 'for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: use tsk->in_iowait instead of delayacct_is_task_waiting_on_io() cgroup/cpuset: fix typos in comments cgroup: misc: mark dummy misc_cg_res_total_usage() static inline svm/sev: Register SEV and SEV-ES ASIDs to the misc controller cgroup: Miscellaneous cgroup documentation. cgroup: Add misc cgroup controller
2021-04-26Merge tag 'x86_cleanups_for_v5.13' of ↵Linus Torvalds3-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 cleanups from Borislav Petkov: "Trivial cleanups and fixes all over the place" * tag 'x86_cleanups_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Remove me from IDE/ATAPI section x86/pat: Do not compile stubbed functions when X86_PAT is off x86/asm: Ensure asm/proto.h can be included stand-alone x86/platform/intel/quark: Fix incorrect kernel-doc comment syntax in files x86/msr: Make locally used functions static x86/cacheinfo: Remove unneeded dead-store initialization x86/process/64: Move cpu_current_top_of_stack out of TSS tools/turbostat: Unmark non-kernel-doc comment x86/syscalls: Fix -Wmissing-prototypes warnings from COND_SYSCALL() x86/fpu/math-emu: Fix function cast warning x86/msr: Fix wr/rdmsr_safe_regs_on_cpu() prototypes x86: Fix various typos in comments, take #2 x86: Remove unusual Unicode characters from comments x86/kaslr: Return boolean values from a function returning bool x86: Fix various typos in comments x86/setup: Remove unused RESERVE_BRK_ARRAY() stacktrace: Move documentation for arch_stack_walk_reliable() to header x86: Remove duplicate TSC DEADLINE MSR definitions
2021-04-26KVM: SVM: Skip SEV cache flush if no ASIDs have been usedSean Christopherson1-12/+13
Skip SEV's expensive WBINVD and DF_FLUSH if there are no SEV ASIDs waiting to be reclaimed, e.g. if SEV was never used. This "fixes" an issue where the DF_FLUSH fails during hardware teardown if the original SEV_INIT failed. Ideally, SEV wouldn't be marked as enabled in KVM if SEV_INIT fails, but that's a problem for another day. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422021125.3417167-16-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Remove an unnecessary prototype declaration of sev_flush_asids()Sean Christopherson1-1/+0
Remove the forward declaration of sev_flush_asids(), which is only a few lines above the function itself. No functional change intended. Reviewed by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422021125.3417167-15-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Drop redundant svm_sev_enabled() helperSean Christopherson2-8/+3
Replace calls to svm_sev_enabled() with direct checks on sev_enabled, or in the case of svm_mem_enc_op, simply drop the call to svm_sev_enabled(). This effectively replaces checks against a valid max_sev_asid with checks against sev_enabled. sev_enabled is forced off by sev_hardware_setup() if max_sev_asid is invalid, all call sites are guaranteed to run after sev_hardware_setup(), and all of the checks care about SEV being fully enabled (as opposed to intentionally handling the scenario where max_sev_asid is valid but SEV enabling fails due to OOM). Reviewed by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422021125.3417167-14-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Move SEV VMCB tracking allocation to sev.cSean Christopherson3-8/+20
Move the allocation of the SEV VMCB array to sev.c to help pave the way toward encapsulating SEV enabling wholly within sev.c. No functional change intended. Reviewed by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422021125.3417167-13-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Explicitly check max SEV ASID during sev_hardware_setup()Sean Christopherson1-2/+1
Query max_sev_asid directly after setting it instead of bouncing through its wrapper, svm_sev_enabled(). Using the wrapper is unnecessary obfuscation. No functional change intended. Reviewed by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422021125.3417167-12-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Unconditionally invoke sev_hardware_teardown()Sean Christopherson1-2/+1
Remove the redundant svm_sev_enabled() check when calling sev_hardware_teardown(), the teardown helper itself does the check. Removing the check from svm.c will eventually allow dropping svm_sev_enabled() entirely. No functional change intended. Reviewed by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422021125.3417167-11-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Enable SEV/SEV-ES functionality by default (when supported)Sean Christopherson1-2/+2
Enable the 'sev' and 'sev_es' module params by default instead of having them conditioned on CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT. The extra Kconfig is pointless as KVM SEV/SEV-ES support is already controlled via CONFIG_KVM_AMD_SEV, and CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT has the unfortunate side effect of enabling all the SEV-ES _guest_ code due to it being dependent on CONFIG_AMD_MEM_ENCRYPT=y. Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422021125.3417167-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Condition sev_enabled and sev_es_enabled on CONFIG_KVM_AMD_SEV=ySean Christopherson1-1/+8
Define sev_enabled and sev_es_enabled as 'false' and explicitly #ifdef out all of sev_hardware_setup() if CONFIG_KVM_AMD_SEV=n. This kills three birds at once: - Makes sev_enabled and sev_es_enabled off by default if CONFIG_KVM_AMD_SEV=n. Previously, they could be on by default if CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y, regardless of KVM SEV support. - Hides the sev and sev_es modules params when CONFIG_KVM_AMD_SEV=n. - Resolves a false positive -Wnonnull in __sev_recycle_asids() that is currently masked by the equivalent IS_ENABLED(CONFIG_KVM_AMD_SEV) check in svm_sev_enabled(), which will be dropped in a future patch. Reviewed by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422021125.3417167-9-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Append "_enabled" to module-scoped SEV/SEV-ES control variablesSean Christopherson1-12/+12
Rename sev and sev_es to sev_enabled and sev_es_enabled respectively to better align with other KVM terminology, and to avoid pseudo-shadowing when the variables are moved to sev.c in a future patch ('sev' is often used for local struct kvm_sev_info pointers. No functional change intended. Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422021125.3417167-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SEV: Mask CPUID[0x8000001F].eax according to supported featuresPaolo Bonzini3-0/+12
Add a reverse-CPUID entry for the memory encryption word, 0x8000001F.EAX, and use it to override the supported CPUID flags reported to userspace. Masking the reported CPUID flags avoids over-reporting KVM support, e.g. without the mask a SEV-SNP capable CPU may incorrectly advertise SNP support to userspace. Clear SEV/SEV-ES if their corresponding module parameters are disabled, and clear the memory encryption leaf completely if SEV is not fully supported in KVM. Advertise SME_COHERENT in addition to SEV and SEV-ES, as the guest can use SME_COHERENT to avoid CLFLUSH operations. Explicitly omit SME and VM_PAGE_FLUSH from the reporting. These features are used by KVM, but are not exposed to the guest, e.g. guest access to related MSRs will fault. Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422021125.3417167-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Move SEV module params/variables to sev.cSean Christopherson3-16/+13
Unconditionally invoke sev_hardware_setup() when configuring SVM and handle clearing the module params/variable 'sev' and 'sev_es' in sev_hardware_setup(). This allows making said variables static within sev.c and reduces the odds of a collision with guest code, e.g. the guest side of things has already laid claim to 'sev_enabled'. Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422021125.3417167-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Disable SEV/SEV-ES if NPT is disabledSean Christopherson1-15/+15
Disable SEV and SEV-ES if NPT is disabled. While the APM doesn't clearly state that NPT is mandatory, it's alluded to by: The guest page tables, managed by the guest, may mark data memory pages as either private or shared, thus allowing selected pages to be shared outside the guest. And practically speaking, shadow paging can't work since KVM can't read the guest's page tables. Fixes: e9df09428996 ("KVM: SVM: Add sev module_param") Cc: Brijesh Singh <brijesh.singh@amd.com Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422021125.3417167-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Free sev_asid_bitmap during init if SEV setup failsSean Christopherson1-1/+4
Free sev_asid_bitmap if the reclaim bitmap allocation fails, othwerise KVM will unnecessarily keep the bitmap when SEV is not fully enabled. Freeing the page is also necessary to avoid introducing a bug when a future patch eliminates svm_sev_enabled() in favor of using the global 'sev' flag directly. While sev_hardware_enabled() checks max_sev_asid, which is true even if KVM setup fails, 'sev' will be true if and only if KVM setup fully succeeds. Fixes: 33af3a7ef9e6 ("KVM: SVM: Reduce WBINVD/DF_FLUSH invocations") Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422021125.3417167-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Zero out the VMCB array used to track SEV ASID associationSean Christopherson1-3/+2
Zero out the array of VMCB pointers so that pre_sev_run() won't see garbage when querying the array to detect when an SEV ASID is being associated with a new VMCB. In practice, reading random values is all but guaranteed to be benign as a false negative (which is extremely unlikely on its own) can only happen on CPU0 on the first VMRUN and would only cause KVM to skip the ASID flush. For anything bad to happen, a previous instance of KVM would have to exit without flushing the ASID, _and_ KVM would have to not flush the ASID at any time while building the new SEV guest. Cc: Borislav Petkov <bp@suse.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Fixes: 70cd94e60c73 ("KVM: SVM: VMRUN should use associated ASID when SEV is enabled") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422021125.3417167-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: x86: Rename GPR accessors to make mode-aware variants the defaultsSean Christopherson1-4/+4
Append raw to the direct variants of kvm_register_read/write(), and drop the "l" from the mode-aware variants. I.e. make the mode-aware variants the default, and make the direct variants scary sounding so as to discourage use. Accessing the full 64-bit values irrespective of mode is rarely the desired behavior. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422022128.3464144-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Use default rAX size for INVLPGA emulationSean Christopherson1-3/+9
Drop bits 63:32 of RAX when grabbing the address for INVLPGA emulation outside of 64-bit mode to make KVM's emulation slightly less wrong. The address for INVLPGA is determined by the effective address size, i.e. it's not hardcoded to 64/32 bits for a given mode. Add a FIXME to call out that the emulation is wrong. Opportunistically tweak the ASID handling to make it clear that it's defined by ECX, not rCX. Per the APM: The portion of rAX used to form the address is determined by the effective address size (current execution mode and optional address size prefix). The ASID is taken from ECX. Fixes: ff092385e828 ("KVM: SVM: Implement INVLPGA") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422022128.3464144-9-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Truncate GPR value for DR and CR accesses in !64-bit modeSean Christopherson1-4/+4
Drop bits 63:32 on loads/stores to/from DRs and CRs when the vCPU is not in 64-bit mode. The APM states bits 63:32 are dropped for both DRs and CRs: In 64-bit mode, the operand size is fixed at 64 bits without the need for a REX prefix. In non-64-bit mode, the operand size is fixed at 32 bits and the upper 32 bits of the destination are forced to 0. Fixes: 7ff76d58a9dc ("KVM: SVM: enhance MOV CR intercept handler") Fixes: cae3797a4639 ("KVM: SVM: enhance mov DR intercept handler") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210422022128.3464144-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Delay restoration of host MSR_TSC_AUX until return to userspaceSean Christopherson2-36/+24
Use KVM's "user return MSRs" framework to defer restoring the host's MSR_TSC_AUX until the CPU returns to userspace. Add/improve comments to clarify why MSR_TSC_AUX is intercepted on both RDMSR and WRMSR, and why it's safe for KVM to keep the guest's value loaded even if KVM is scheduled out. Cc: Reiji Watanabe <reijiw@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210423223404.3860547-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Clear MSR_TSC_AUX[63:32] on writeSean Christopherson1-1/+11
Force clear bits 63:32 of MSR_TSC_AUX on write to emulate current AMD CPUs, which completely ignore the upper 32 bits, including dropping them on write. Emulating AMD hardware will also allow migrating a vCPU from AMD hardware to Intel hardware without requiring userspace to manually clear the upper bits, which are reserved on Intel hardware. Presumably, MSR_TSC_AUX[63:32] are intended to be reserved on AMD, but sadly the APM doesn't say _anything_ about those bits in the context of MSR access. The RDTSCP entry simply states that RCX contains bits 31:0 of the MSR, zero extended. And even worse is that the RDPID description implies that it can consume all 64 bits of the MSR: RDPID reads the value of TSC_AUX MSR used by the RDTSCP instruction into the specified destination register. Normal operand size prefixes do not apply and the update is either 32 bit or 64 bit based on the current mode. Emulate current hardware behavior to give KVM the best odds of playing nice with whatever the behavior of future AMD CPUs happens to be. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210423223404.3860547-3-seanjc@google.com> [Fix broken patch. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-26KVM: SVM: Inject #GP on guest MSR_TSC_AUX accesses if RDTSCP unsupportedSean Christopherson1-0/+7
Inject #GP on guest accesses to MSR_TSC_AUX if RDTSCP is unsupported in the guest's CPUID model. Fixes: 46896c73c1a4 ("KVM: svm: add support for RDTSCP") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210423223404.3860547-2-seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-22Merge branch 'kvm-sev-cgroup' into HEADPaolo Bonzini4-16/+77
2021-04-21KVM: SVM: Allocate SEV command structures on local stackSean Christopherson1-305/+173
Use the local stack to "allocate" the structures used to communicate with the PSP. The largest struct used by KVM, sev_data_launch_secret, clocks in at 52 bytes, well within the realm of reasonable stack usage. The smallest structs are a mere 4 bytes, i.e. the pointer for the allocation is larger than the allocation itself. Now that the PSP driver plays nice with vmalloc pointers, putting the data on a virtually mapped stack (CONFIG_VMAP_STACK=y) will not cause explosions. Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210406224952.4177376-9-seanjc@google.com> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> [Apply same treatment to PSP migration commands. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-21KVM: SVM: Add KVM_SEV_RECEIVE_FINISH commandBrijesh Singh1-0/+23
The command finalize the guest receiving process and make the SEV guest ready for the execution. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Steve Rutherford <srutherford@google.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Message-Id: <d08914dc259644de94e29b51c3b68a13286fc5a3.1618498113.git.ashish.kalra@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>