summaryrefslogtreecommitdiff
path: root/arch/x86/kvm
AgeCommit message (Collapse)AuthorFilesLines
2023-02-25Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds56-1397/+2207
Pull kvm updates from Paolo Bonzini: "ARM: - Provide a virtual cache topology to the guest to avoid inconsistencies with migration on heterogenous systems. Non secure software has no practical need to traverse the caches by set/way in the first place - Add support for taking stage-2 access faults in parallel. This was an accidental omission in the original parallel faults implementation, but should provide a marginal improvement to machines w/o FEAT_HAFDBS (such as hardware from the fruit company) - A preamble to adding support for nested virtualization to KVM, including vEL2 register state, rudimentary nested exception handling and masking unsupported features for nested guests - Fixes to the PSCI relay that avoid an unexpected host SVE trap when resuming a CPU when running pKVM - VGIC maintenance interrupt support for the AIC - Improvements to the arch timer emulation, primarily aimed at reducing the trap overhead of running nested - Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the interest of CI systems - Avoid VM-wide stop-the-world operations when a vCPU accesses its own redistributor - Serialize when toggling CPACR_EL1.SMEN to avoid unexpected exceptions in the host - Aesthetic and comment/kerneldoc fixes - Drop the vestiges of the old Columbia mailing list and add [Oliver] as co-maintainer RISC-V: - Fix wrong usage of PGDIR_SIZE instead of PUD_SIZE - Correctly place the guest in S-mode after redirecting a trap to the guest - Redirect illegal instruction traps to guest - SBI PMU support for guest s390: - Sort out confusion between virtual and physical addresses, which currently are the same on s390 - A new ioctl that performs cmpxchg on guest memory - A few fixes x86: - Change tdp_mmu to a read-only parameter - Separate TDP and shadow MMU page fault paths - Enable Hyper-V invariant TSC control - Fix a variety of APICv and AVIC bugs, some of them real-world, some of them affecting architecurally legal but unlikely to happen in practice - Mark APIC timer as expired if its in one-shot mode and the count underflows while the vCPU task was being migrated - Advertise support for Intel's new fast REP string features - Fix a double-shootdown issue in the emergency reboot code - Ensure GIF=1 and disable SVM during an emergency reboot, i.e. give SVM similar treatment to VMX - Update Xen's TSC info CPUID sub-leaves as appropriate - Add support for Hyper-V's extended hypercalls, where "support" at this point is just forwarding the hypercalls to userspace - Clean up the kvm->lock vs. kvm->srcu sequences when updating the PMU and MSR filters - One-off fixes and cleanups - Fix and cleanup the range-based TLB flushing code, used when KVM is running on Hyper-V - Add support for filtering PMU events using a mask. If userspace wants to restrict heavily what events the guest can use, it can now do so without needing an absurd number of filter entries - Clean up KVM's handling of "PMU MSRs to save", especially when vPMU support is disabled - Add PEBS support for Intel Sapphire Rapids - Fix a mostly benign overflow bug in SEV's send|receive_update_data() - Move several SVM-specific flags into vcpu_svm x86 Intel: - Handle NMI VM-Exits before leaving the noinstr region - A few trivial cleanups in the VM-Enter flows - Stop enabling VMFUNC for L1 purely to document that KVM doesn't support EPTP switching (or any other VM function) for L1 - Fix a crash when using eVMCS's enlighted MSR bitmaps Generic: - Clean up the hardware enable and initialization flow, which was scattered around multiple arch-specific hooks. Instead, just let the arch code call into generic code. Both x86 and ARM should benefit from not having to fight common KVM code's notion of how to do initialization - Account allocations in generic kvm_arch_alloc_vm() - Fix a memory leak if coalesced MMIO unregistration fails selftests: - On x86, cache the CPU vendor (AMD vs. Intel) and use the info to emit the correct hypercall instruction instead of relying on KVM to patch in VMMCALL - Use TAP interface for kvm_binary_stats_test and tsc_msrs_test" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (325 commits) KVM: SVM: hyper-v: placate modpost section mismatch error KVM: x86/mmu: Make tdp_mmu_allowed static KVM: arm64: nv: Use reg_to_encoding() to get sysreg ID KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes KVM: arm64: nv: Filter out unsupported features from ID regs KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2 KVM: arm64: nv: Allow a sysreg to be hidden from userspace only KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2 KVM: arm64: nv: Handle SMCs taken from virtual EL2 KVM: arm64: nv: Handle trapped ERET from virtual EL2 KVM: arm64: nv: Inject HVC exceptions to the virtual EL2 KVM: arm64: nv: Support virtual EL2 exceptions KVM: arm64: nv: Handle HCR_EL2.NV system register traps KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state KVM: arm64: nv: Add EL2 system registers to vcpu context KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set KVM: arm64: nv: Introduce nested virtualization VCPU feature KVM: arm64: Use the S2 MMU context to iterate over S2 table ...
2023-02-24Merge tag 'mm-nonmm-stable-2023-02-20-15-29' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "There is no particular theme here - mainly quick hits all over the tree. Most notable is a set of zlib changes from Mikhail Zaslonko which enhances and fixes zlib's use of S390 hardware support: 'lib/zlib: Set of s390 DFLTCC related patches for kernel zlib'" * tag 'mm-nonmm-stable-2023-02-20-15-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (55 commits) Update CREDITS file entry for Jesper Juhl sparc: allow PM configs for sparc32 COMPILE_TEST hung_task: print message when hung_task_warnings gets down to zero. arch/Kconfig: fix indentation scripts/tags.sh: fix the Kconfig tags generation when using latest ctags nilfs2: prevent WARNING in nilfs_dat_commit_end() lib/zlib: remove redundation assignement of avail_in dfltcc_gdht() lib/Kconfig.debug: do not enable DEBUG_PREEMPT by default lib/zlib: DFLTCC always switch to software inflate for Z_PACKET_FLUSH option lib/zlib: DFLTCC support inflate with small window lib/zlib: Split deflate and inflate states for DFLTCC lib/zlib: DFLTCC not writing header bits when avail_out == 0 lib/zlib: fix DFLTCC ignoring flush modes when avail_in == 0 lib/zlib: fix DFLTCC not flushing EOBS when creating raw streams lib/zlib: implement switching between DFLTCC and software lib/zlib: adjust offset calculation for dfltcc_state nilfs2: replace WARN_ONs for invalid DAT metadata block requests scripts/spelling.txt: add "exsits" pattern and fix typo instances fs: gracefully handle ->get_block not mapping bh in __mpage_writepage cramfs: Kconfig: fix spelling & punctuation ...
2023-02-22KVM: SVM: hyper-v: placate modpost section mismatch errorRandy Dunlap1-2/+2
modpost reports section mismatch errors/warnings: WARNING: modpost: vmlinux.o: section mismatch in reference: svm_hv_hardware_setup (section: .text) -> (unknown) (section: .init.data) WARNING: modpost: vmlinux.o: section mismatch in reference: svm_hv_hardware_setup (section: .text) -> (unknown) (section: .init.data) WARNING: modpost: vmlinux.o: section mismatch in reference: svm_hv_hardware_setup (section: .text) -> (unknown) (section: .init.data) This "(unknown) (section: .init.data)" all refer to svm_x86_ops. Tag svm_hv_hardware_setup() with __init to fix a modpost warning as the non-stub implementation accesses __initdata (svm_x86_ops), i.e. would generate a use-after-free if svm_hv_hardware_setup() were actually invoked post-init. The helper is only called from svm_hardware_setup(), which is also __init, i.e. lack of __init is benign other than the modpost warning. Fixes: 1e0c7d40758b ("KVM: SVM: hyper-v: Remote TLB flush for SVM") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Vineeth Pillai <viremana@linux.microsoft.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Reviewed-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20230222073315.9081-1-rdunlap@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-02-22Merge tag 'kvm-x86-apic-6.3' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini3-49/+70
KVM x86 APIC changes for 6.3: - Remove a superfluous variables from apic_get_tmcct() - Fix various edge cases in x2APIC MSR emulation - Mark APIC timer as expired if its in one-shot mode and the count underflows while the vCPU task was being migrated - Reset xAPIC when userspace forces "impossible" x2APIC => xAPIC transition
2023-02-22Merge tag 'x86_cpu_for_v6.3_rc1' of ↵Linus Torvalds4-19/+29
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpuid updates from Borislav Petkov: - Cache the AMD debug registers in per-CPU variables to avoid MSR writes where possible, when supporting a debug registers swap feature for SEV-ES guests - Add support for AMD's version of eIBRS called Automatic IBRS which is a set-and-forget control of indirect branch restriction speculation resources on privilege change - Add support for a new x86 instruction - LKGS - Load kernel GS which is part of the FRED infrastructure - Reset SPEC_CTRL upon init to accomodate use cases like kexec which rediscover - Other smaller fixes and cleanups * tag 'x86_cpu_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/amd: Cache debug register values in percpu variables KVM: x86: Propagate the AMD Automatic IBRS feature to the guest x86/cpu: Support AMD Automatic IBRS x86/cpu, kvm: Add the SMM_CTL MSR not present feature x86/cpu, kvm: Add the Null Selector Clears Base feature x86/cpu, kvm: Move X86_FEATURE_LFENCE_RDTSC to its native leaf x86/cpu, kvm: Add the NO_NESTED_DATA_BP feature KVM: x86: Move open-coded CPUID leaf 0x80000021 EAX bit propagation code x86/cpu, kvm: Add support for CPUID_80000021_EAX x86/gsseg: Add the new <asm/gsseg.h> header to <asm/asm-prototypes.h> x86/gsseg: Use the LKGS instruction if available for load_gs_index() x86/gsseg: Move load_gs_index() to its own new header file x86/gsseg: Make asm_load_gs_index() take an u16 x86/opcode: Add the LKGS instruction to x86-opcode-map x86/cpufeature: Add the CPU feature bit for LKGS x86/bugs: Reset speculation control settings on init x86/cpu: Remove redundant extern x86_read_arch_cap_msr()
2023-02-16kvm: initialize all of the kvm_debugregs structure before sending it to ↵Greg Kroah-Hartman1-2/+1
userspace When calling the KVM_GET_DEBUGREGS ioctl, on some configurations, there might be some unitialized portions of the kvm_debugregs structure that could be copied to userspace. Prevent this as is done in the other kvm ioctls, by setting the whole structure to 0 before copying anything into it. Bonus is that this reduces the lines of code as the explicit flag setting and reserved space zeroing out can be removed. Cc: Sean Christopherson <seanjc@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: <x86@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: stable <stable@kernel.org> Reported-by: Xingyuan Mo <hdthky0@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Message-Id: <20230214103304.3689213-1-gregkh@linuxfoundation.org> Tested-by: Xingyuan Mo <hdthky0@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-02-16KVM: x86/mmu: Make tdp_mmu_allowed staticDavid Matlack1-1/+1
Make tdp_mmu_allowed static since it is only ever used within arch/x86/kvm/mmu/mmu.c. Link: https://lore.kernel.org/kvm/202302072055.odjDVd5V-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20230213212844.3062733-1-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-02-15Merge tag 'kvm-riscv-6.3-1' of https://github.com/kvm-riscv/linux into HEADPaolo Bonzini1-12/+9
KVM/riscv changes for 6.3 - Fix wrong usage of PGDIR_SIZE to check page sizes - Fix privilege mode setting in kvm_riscv_vcpu_trap_redirect() - Redirect illegal instruction traps to guest - SBI PMU support for guest
2023-02-15Merge tag 'kvm-x86-vmx-6.3' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini9-113/+133
KVM VMX changes for 6.3: - Handle NMI VM-Exits before leaving the noinstr region - A few trivial cleanups in the VM-Enter flows - Stop enabling VMFUNC for L1 purely to document that KVM doesn't support EPTP switching (or any other VM function) for L1 - Fix a crash when using eVMCS's enlighted MSR bitmaps
2023-02-15Merge tag 'kvm-x86-svm-6.3' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini9-41/+60
KVM SVM changes for 6.3: - Fix a mostly benign overflow bug in SEV's send|receive_update_data() - Move the SVM-specific "host flags" into vcpu_svm (extracted from the vNMI enabling series) - A handful for fixes and cleanups
2023-02-15KVM: x86/pmu: Disable vPMU support on hybrid CPUs (host PMUs)Sean Christopherson1-7/+19
Disable KVM support for virtualizing PMUs on hosts with hybrid PMUs until KVM gains a sane way to enumeration the hybrid vPMU to userspace and/or gains a mechanism to let userspace opt-in to the dangers of exposing a hybrid vPMU to KVM guests. Virtualizing a hybrid PMU, or at least part of a hybrid PMU, is possible, but it requires careful, deliberate configuration from userspace. E.g. to expose full functionality, vCPUs need to be pinned to pCPUs to prevent migrating a vCPU between a big core and a little core, userspace must enumerate a reasonable topology to the guest, and guest CPUID must be curated per vCPU to enumerate accurate vPMU capabilities. The last point is especially problematic, as KVM doesn't control which pCPU it runs on when enumerating KVM's vPMU capabilities to userspace, i.e. userspace can't rely on KVM_GET_SUPPORTED_CPUID in it's current form. Alternatively, userspace could enable vPMU support by enumerating the set of features that are common and coherent across all cores, e.g. by filtering PMU events and restricting guest capabilities. But again, that requires userspace to take action far beyond reflecting KVM's supported feature set into the guest. For now, simply disable vPMU support on hybrid CPUs to avoid inducing seemingly random #GPs in guests, and punt support for hybrid CPUs to a future enabling effort. Reported-by: Jianfeng Gao <jianfeng.gao@intel.com> Cc: stable@vger.kernel.org Cc: Andrew Cooper <Andrew.Cooper3@citrix.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/all/20220818181530.2355034-1-kan.liang@linux.intel.com Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20230208204230.1360502-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-02-15Merge tag 'kvm-x86-pmu-6.3' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini9-184/+401
KVM x86 PMU changes for 6.3: - Add support for created masked events for the PMU filter to allow userspace to heavily restrict what events the guest can use without needing to create an absurd number of events - Clean up KVM's handling of "PMU MSRs to save", especially when vPMU support is disabled - Add PEBS support for Intel SPR
2023-02-15Merge tag 'kvm-x86-mmu-6.3' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini7-48/+63
KVM x86 MMU changes for 6.3: - Fix and cleanup the range-based TLB flushing code, used when KVM is running on Hyper-V - A few one-off cleanups
2023-02-10KVM: x86: Mitigate the cross-thread return address predictions bugTom Lendacky1-11/+32
By default, KVM/SVM will intercept attempts by the guest to transition out of C0. However, the KVM_CAP_X86_DISABLE_EXITS capability can be used by a VMM to change this behavior. To mitigate the cross-thread return address predictions bug (X86_BUG_SMT_RSB), a VMM must not be allowed to override the default behavior to intercept C0 transitions. Use a module parameter to control the mitigation on processors that are vulnerable to X86_BUG_SMT_RSB. If the processor is vulnerable to the X86_BUG_SMT_RSB bug and the module parameter is set to mitigate the bug, KVM will not allow the disabling of the HLT, MWAIT and CSTATE exits. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <4019348b5e07148eb4d593380a5f6713b93c9a16.1675956146.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-02-08KVM: SVM: Fix potential overflow in SEV's send|receive_update_data()Peter Gonda1-2/+2
KVM_SEV_SEND_UPDATE_DATA and KVM_SEV_RECEIVE_UPDATE_DATA have an integer overflow issue. Params.guest_len and offset are both 32 bits wide, with a large params.guest_len the check to confirm a page boundary is not crossed can falsely pass: /* Check if we are crossing the page boundary * offset = params.guest_uaddr & (PAGE_SIZE - 1); if ((params.guest_len + offset > PAGE_SIZE)) Add an additional check to confirm that params.guest_len itself is not greater than PAGE_SIZE. Note, this isn't a security concern as overflow can happen if and only if params.guest_len is greater than 0xfffff000, and the FW spec says these commands fail with lengths greater than 16KB, i.e. the PSP will detect KVM's goof. Fixes: 15fb7de1a7f5 ("KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command") Fixes: d3d1af85e2c7 ("KVM: SVM: Add KVM_SEND_UPDATE_DATA command") Reported-by: Andy Nguyen <theflow@google.com> Suggested-by: Thomas Lendacky <thomas.lendacky@amd.com> Signed-off-by: Peter Gonda <pgonda@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20230207171354.4012821-1-pgonda@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-07KVM: VMX: Fix crash due to uninitialized current_vmcsAlexandru Matei2-13/+7
KVM enables 'Enlightened VMCS' and 'Enlightened MSR Bitmap' when running as a nested hypervisor on top of Hyper-V. When MSR bitmap is updated, evmcs_touch_msr_bitmap function uses current_vmcs per-cpu variable to mark that the msr bitmap was changed. vmx_vcpu_create() modifies the msr bitmap via vmx_disable_intercept_for_msr -> vmx_msr_bitmap_l01_changed which in the end calls this function. The function checks for current_vmcs if it is null but the check is insufficient because current_vmcs is not initialized. Because of this, the code might incorrectly write to the structure pointed by current_vmcs value left by another task. Preemption is not disabled, the current task can be preempted and moved to another CPU while current_vmcs is accessed multiple times from evmcs_touch_msr_bitmap() which leads to crash. The manipulation of MSR bitmaps by callers happens only for vmcs01 so the solution is to use vmx->vmcs01.vmcs instead of current_vmcs. BUG: kernel NULL pointer dereference, address: 0000000000000338 PGD 4e1775067 P4D 0 Oops: 0002 [#1] PREEMPT SMP NOPTI ... RIP: 0010:vmx_msr_bitmap_l01_changed+0x39/0x50 [kvm_intel] ... Call Trace: vmx_disable_intercept_for_msr+0x36/0x260 [kvm_intel] vmx_vcpu_create+0xe6/0x540 [kvm_intel] kvm_arch_vcpu_create+0x1d1/0x2e0 [kvm] kvm_vm_ioctl_create_vcpu+0x178/0x430 [kvm] kvm_vm_ioctl+0x53f/0x790 [kvm] __x64_sys_ioctl+0x8a/0xc0 do_syscall_64+0x5c/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: ceef7d10dfb6 ("KVM: x86: VMX: hyper-v: Enlightened MSR-Bitmap support") Cc: stable@vger.kernel.org Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com> Link: https://lore.kernel.org/r/20230123221208.4964-1-alexandru.matei@uipath.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-07KVM: nVMX: Simplify the setting of SECONDARY_EXEC_ENABLE_VMFUNC for nested.Yu Zhang1-9/+5
Values of base settings for nested proc-based VM-Execution control MSR come from the ones for non-nested. And for SECONDARY_EXEC_ENABLE_VMFUNC flag, KVM currently a) first mask off it from vmcs_conf->cpu_based_2nd_exec_ctrl; b) then check it against the same source; c) and reset it again if host has it. So just simplify this, by not masking off SECONDARY_EXEC_ENABLE_VMFUNC in the first place. No functional change. Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com> Link: https://lore.kernel.org/r/20221109075413.1405803-3-yu.c.zhang@linux.intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-07KVM: VMX: Do not trap VMFUNC instructions for L1 guests.Yu Zhang2-4/+9
Explicitly disable VMFUNC in vmcs01 to document that KVM doesn't support any VM-Functions for L1. WARN in the dedicated VMFUNC handler if an exit occurs while L1 is active, but keep the existing handlers as fallbacks to avoid killing the VM as an unexpected VMFUNC VM-Exit isn't fatal Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com> Link: https://lore.kernel.org/r/20221109075413.1405803-2-yu.c.zhang@linux.intel.com [sean: don't kill the VM on an unexpected VMFUNC from L1, reword changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-04KVM: x86: Simplify msr_io()Michal Luczaj1-9/+3
As of commit bccf2150fe62 ("KVM: Per-vcpu inodes"), __msr_io() doesn't return a negative value. Remove unnecessary checks. Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://lore.kernel.org/r/20230107001256.2365304-7-mhal@rbox.co [sean: call out commit which left behind the unnecessary check] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-04KVM: x86: Remove unnecessary initialization in kvm_vm_ioctl_set_msr_filter()Michal Luczaj1-1/+1
Do not initialize the value of `r`, as it will be overwritten. Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://lore.kernel.org/r/20230107001256.2365304-6-mhal@rbox.co Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-04KVM: x86: Explicitly state lockdep condition of msr_filter updateMichal Luczaj1-2/+2
Replace `1` with the actual mutex_is_locked() check. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://lore.kernel.org/r/20230107001256.2365304-5-mhal@rbox.co [sean: delete the comment that explained the hardocded '1'] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-04KVM: x86: Simplify msr_filter updateMichal Luczaj1-4/+1
Replace srcu_dereference()+rcu_assign_pointer() sequence with a single rcu_replace_pointer(). Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://lore.kernel.org/r/20230107001256.2365304-4-mhal@rbox.co Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-04KVM: x86: Optimize kvm->lock and SRCU interaction (KVM_X86_SET_MSR_FILTER)Michal Luczaj1-1/+1
Reduce time spent holding kvm->lock: unlock mutex before calling synchronize_srcu(). There is no need to hold kvm->lock until all vCPUs have been kicked, KVM only needs to guarantee that all vCPUs will switch to the new filter before exiting to userspace. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://lore.kernel.org/r/20230107001256.2365304-3-mhal@rbox.co [sean: expand changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-04KVM: x86: Optimize kvm->lock and SRCU interaction (KVM_SET_PMU_EVENT_FILTER)Michal Luczaj1-2/+1
Reduce time spent holding kvm->lock: unlock mutex before calling synchronize_srcu_expedited(). There is no need to hold kvm->lock until all vCPUs have been kicked, KVM only needs to guarantee that all vCPUs will switch to the new filter before exiting to userspace. Protecting the write to __reprogram_pmi is also unnecessary as a vCPU may process a set bit before receiving the final KVM_REQ_PMU, but the per-vCPU writes are guaranteed to occur after all vCPUs have switched to the new filter. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Link: https://lore.kernel.org/r/20230107001256.2365304-2-mhal@rbox.co [sean: expand changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-04KVM: x86/emulator: Fix comment in __load_segment_descriptor()Michal Luczaj1-1/+1
The comment refers to the same condition twice. Make it reflect what the code actually does. No functional change intended. Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20230126013405.2967156-3-mhal@rbox.co Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-04KVM: x86/emulator: Fix segment load privilege level validationMichal Luczaj1-2/+2
Intel SDM describes what steps are taken by the CPU to verify if a memory segment can actually be used at a given privilege level. Loading DS/ES/FS/GS involves checking segment's type as well as making sure that neither selector's RPL nor caller's CPL are greater than segment's DPL. Emulator implements Intel's pseudocode in __load_segment_descriptor(), even quoting the pseudocode in the comments. Although the pseudocode is correctly translated, the implementation is incorrect. This is most likely due to SDM, at the time, being wrong. Patch fixes emulator's logic and updates the pseudocode in the comment. Below are historical notes. Emulator code for handling segment descriptors appears to have been introduced in March 2010 in commit 38ba30ba51a0 ("KVM: x86 emulator: Emulate task switch in emulator.c"). Intel SDM Vol 2A: Instruction Set Reference, A-M (Order Number: 253666-034US, _March 2010_) lists the steps for loading segment registers in section related to MOV instruction: IF DS, ES, FS, or GS is loaded with non-NULL selector THEN IF segment selector index is outside descriptor table limits or segment is not a data or readable code segment or ((segment is a data or nonconforming code segment) and (both RPL and CPL > DPL)) <--- THEN #GP(selector); FI; This is precisely what __load_segment_descriptor() quotes and implements. But there's a twist; a few SDM revisions later (253667-044US), in August 2012, the snippet above becomes: IF DS, ES, FS, or GS is loaded with non-NULL selector THEN IF segment selector index is outside descriptor table limits or segment is not a data or readable code segment or ((segment is a data or nonconforming code segment) [note: missing or superfluous parenthesis?] or ((RPL > DPL) and (CPL > DPL)) <--- THEN #GP(selector); FI; Many SDMs later (253667-065US), in December 2017, pseudocode reaches what seems to be its final form: IF DS, ES, FS, or GS is loaded with non-NULL selector THEN IF segment selector index is outside descriptor table limits OR segment is not a data or readable code segment OR ((segment is a data or nonconforming code segment) AND ((RPL > DPL) or (CPL > DPL))) <--- THEN #GP(selector); FI; which also matches the behavior described in AMD's APM, which states that a #GP occurs if: The DS, ES, FS, or GS register was loaded and the segment pointed to was a data or non-conforming code segment, but the RPL or CPL was greater than the DPL. Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20230126013405.2967156-2-mhal@rbox.co [sean: add blurb to changelog calling out AMD agrees] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-03scripts/spelling.txt: add `permitted'Ricardo Ribalda1-4/+4
Patch series "spelling: Fix some trivial typos". Seems like permitted has two t's :), Lets add that to spellings to help others. This patch (of 3): Add another common typo. Noticed when I sent a patch with the typo and in kvm and of. [ribalda@chromium.org: fix trivial typo] Link: https://lkml.kernel.org/r/20221220-permited-v1-2-52ea9857fa61@chromium.org Link: https://lkml.kernel.org/r/20221220-permited-v1-1-52ea9857fa61@chromium.org Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02KVM: x86/pmu: Add PRIR++ and PDist support for SPR and later modelsLike Xu1-12/+33
The pebs capability on the SPR is basically the same as Ice Lake Server with the exception of two special facilities that have been enhanced and require special handling. Upon triggering a PEBS assist, there will be a finite delay between the time the counter overflows and when the microcode starts to carry out its data collection obligations. Even if the delay is constant in core clock space, it invariably manifest as variable "skids" in instruction address space. On the Ice Lake Server, the Precise Distribution of Instructions Retire (PDIR) facility mitigates the "skid" problem by providing an early indication of when the counter is about to overflow. On SPR, the PDIR counter available (Fixed 0) is unchanged, but the capability is enhanced to Instruction-Accurate PDIR (PDIR++), where PEBS is taken on the next instruction after the one that caused the overflow. SPR also introduces a new Precise Distribution (PDist) facility only on general programmable counter 0. Per Intel SDM, PDist eliminates any skid or shadowing effects from PEBS. With PDist, the PEBS record will be generated precisely upon completion of the instruction or operation that causes the counter to overflow (there is no "wait for next occurrence" by default). In terms of KVM handling, when guest accesses those special counters, the KVM needs to request the same index counters via the perf_event kernel subsystem to ensure that the guest uses the correct pebs hardware counter (PRIR++ or PDist). This is mainly achieved by adjusting the event precise level to the maximum, where the semantics of this magic number is mainly defined by the internal software context of perf_event and it's also backwards compatible as part of the user space interface. Opportunistically, refine confusing comments on TNT+, as the only ones that currently support pebs_ept are Ice Lake server and SPR (GLC+). Signed-off-by: Like Xu <likexu@tencent.com> Link: https://lore.kernel.org/r/20221109082802.27543-3-likexu@tencent.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-02KVM: x86: Reinitialize xAPIC ID when userspace forces x2APIC => xAPICEmanuele Giuseppe Esposito1-2/+6
Reinitialize the xAPIC ID to the vCPU ID when userspace forces the APIC to transition directly from x2APIC to xAPIC mode, e.g. to emulate RESET. KVM already stuffs the xAPIC ID when the APIC is transitioned from DISABLED to xAPIC (commit 49bd29ba1dbd ("KVM: x86: reset APIC ID when enabling LAPIC")), i.e. userspace is conditioned to expect KVM to update the xAPIC ID, but KVM doesn't handle the architecturally-impossible case where userspace forces x2APIC=>xAPIC via KVM_SET_MSRS. On its own, the "bug" is benign, as userspace emulation of RESET will also stuff APIC registers via KVM_SET_LAPIC, i.e. will manually set the xAPIC ID. However, commit 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base") introduced a bug, fixed by commit commit ef40757743b4 ("KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself"), that caused KVM to fail to properly update the xAPIC ID when handling KVM_SET_LAPIC. Refresh the xAPIC ID even though it's not strictly necessary so that KVM provides consistent behavior. Note, KVM follows Intel architecture with regard to handling the xAPIC ID and x2APIC IDs across mode transitions. For the APIC DISABLED case (commit 49bd29ba1dbd), Intel's SDM says the xAPIC ID _may_ be reinitialized 10.4.3 Enabling or Disabling the Local APIC When IA32_APIC_BASE[11] is set to 0, prior initialization to the APIC may be lost and the APIC may return to the state described in Section 10.4.7.1, “Local APIC State After Power-Up or Reset.” 10.4.7.1 Local APIC State After Power-Up or Reset ... The local APIC ID register is set to a unique APIC ID. ... i.e. KVM's behavior is legal as per Intel's architecture. In practice, Intel's behavior is N/A as modern Intel CPUs (since at least Haswell) make the xAPIC ID fully read-only. And for xAPIC => x2APIC transitions (commit 257b9a5faab5 ("KVM: x86: use correct APIC ID on x2APIC transition")), Intel's SDM says: Any APIC ID value written to the memory-mapped local APIC ID register is not preserved. AMD's APM says nothing (that I could find) about the xAPIC ID when the APIC is DISABLED, but testing on bare metal (Rome) shows that the xAPIC ID is preserved when the APIC is DISABLED and re-enabled in xAPIC mode. AMD also preserves the xAPIC ID when the APIC is transitioned from xAPIC to x2APIC, i.e. allows a backdoor write of the x2APIC ID, which is again not emulated by KVM. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Link: https://lore.kernel.org/all/20230109130605.2013555-2-eesposit@redhat.com [sean: rewrite changelog, set xAPIC ID iff APIC is enabled] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-02KVM: x86: hyper-v: Add extended hypercall support in Hyper-vVipin Sharma1-0/+28
Add support for extended hypercall in Hyper-v. Hyper-v TLFS 6.0b describes hypercalls above call code 0x8000 as extended hypercalls. A Hyper-v hypervisor's guest VM finds availability of extended hypercalls via CPUID.0x40000003.EBX BIT(20). If the bit is set then the guest can call extended hypercalls. All extended hypercalls will exit to userspace by default. This allows for easy support of future hypercalls without being dependent on KVM releases. If there will be need to process the hypercall in KVM instead of userspace then KVM can create a capability which userspace can query to know which hypercalls can be handled by the KVM and enable handling of those hypercalls. Signed-off-by: Vipin Sharma <vipinsh@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20221212183720.4062037-10-vipinsh@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-02KVM: x86: hyper-v: Use common code for hypercall userspace exitVipin Sharma1-16/+11
Remove duplicate code to exit to userspace for hyper-v hypercalls and use a common place to exit. No functional change intended. Signed-off-by: Vipin Sharma <vipinsh@google.com> Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20221212183720.4062037-9-vipinsh@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-01KVM: x86: Use emulator callbacks instead of duplicating "host flags"Maxim Levitsky4-18/+16
Instead of re-defining the "host flags" bits, just expose dedicated helpers for each of the two remaining flags that are consumed by the emulator. The emulator never consumes both "is guest" and "is SMM" in close proximity, so there is no motivation to avoid additional indirect branches. Also while at it, garbage collect the recently removed host flags. No functional change is intended. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com> Link: https://lore.kernel.org/r/20221129193717.513824-6-mlevitsk@redhat.com [sean: fix CONFIG_KVM_SMM=n builds, tweak names of wrappers] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-31KVM: x86: Move HF_NMI_MASK and HF_IRET_MASK into "struct vcpu_svm"Maxim Levitsky2-9/+31
Move HF_NMI_MASK and HF_IRET_MASK (a.k.a. "waiting for IRET") out of the common "hflags" and into dedicated flags in "struct vcpu_svm". The flags are used only for the SVM and thus should not be in hflags. Tracking NMI masking in software isn't SVM specific, e.g. VMX has a similar flag (soft_vnmi_blocked), but that's much more of a hack as VMX can't intercept IRET, is useful only for ancient CPUs, i.e. will hopefully be removed at some point, and again the exact behavior is vendor specific and shouldn't ever be referenced in common code. converting VMX No functional change is intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com> Link: https://lore.kernel.org/r/20221129193717.513824-5-mlevitsk@redhat.com [sean: split from HF_GIF_MASK patch] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-31KVM: x86: Move HF_GIF_MASK into "struct vcpu_svm" as "guest_gif"Maxim Levitsky1-3/+6
Move HF_GIF_MASK out of the common "hflags" and into vcpu_svm.guest_gif. GIF is an SVM-only concept and has should never be consulted outside of SVM-specific code. No functional change is intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com> Link: https://lore.kernel.org/r/20221129193717.513824-5-mlevitsk@redhat.com [sean: split to separate patch] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-31KVM: nSVM: Don't sync tlb_ctl back to vmcb12 on nested VM-ExitMaxim Levitsky1-1/+0
Don't sync the TLB control field from vmcb02 to vmcs12 on nested VM-Exit. Per AMD's APM, the field is not modified by hardware: The VMRUN instruction reads, but does not change, the value of the TLB_CONTROL field Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com> Link: https://lore.kernel.org/r/20221129193717.513824-2-mlevitsk@redhat.com [sean: massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-27KVM: x86/pmu: Provide "error" semantics for unsupported-but-known PMU MSRsSean Christopherson1-22/+29
Provide "error" semantics (read zeros, drop writes) for userspace accesses to MSRs that are ultimately unsupported for whatever reason, but for which KVM told userspace to save and restore the MSR, i.e. for MSRs that KVM included in KVM_GET_MSR_INDEX_LIST. Previously, KVM special cased a few PMU MSRs that were problematic at one point or another. Extend the treatment to all PMU MSRs, e.g. to avoid spurious unsupported accesses. Note, the logic can also be used for non-PMU MSRs, but as of today only PMU MSRs can end up being unsupported after KVM told userspace to save and restore them. Link: https://lore.kernel.org/r/20230124234905.3774678-7-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-27KVM: x86/pmu: Don't tell userspace to save MSRs for non-existent fixed PMCsLike Xu1-0/+5
Limit the set of MSRs for fixed PMU counters based on the number of fixed counters actually supported by the host so that userspace doesn't waste time saving and restoring dummy values. Signed-off-by: Like Xu <likexu@tencent.com> [sean: split for !enable_pmu logic, drop min(), write changelog] Link: https://lore.kernel.org/r/20230124234905.3774678-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-27KVM: x86/pmu: Don't tell userspace to save PMU MSRs if PMU is disabledSean Christopherson1-2/+4
Omit all PMU MSRs from the "MSRs to save" list if the PMU is disabled so that userspace doesn't waste time saving and restoring dummy values. KVM provides "error" semantics (read zeros, drop writes) for such known-but- unsupported MSRs, i.e. has fudged around this issue for quite some time. Keep the "error" semantics as-is for now, the logic will be cleaned up in a separate patch. Cc: Aaron Lewis <aaronlewis@google.com> Cc: Weijiang Yang <weijiang.yang@intel.com> Link: https://lore.kernel.org/r/20230124234905.3774678-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-27KVM: x86/pmu: Use separate array for defining "PMU MSRs to save"Sean Christopherson1-71/+82
Move all potential to-be-saved PMU MSRs into a separate array so that a future patch can easily omit all PMU MSRs from the list when the PMU is disabled. No functional change intended. Link: https://lore.kernel.org/r/20230124234905.3774678-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-27KVM: x86/pmu: Gate all "unimplemented MSR" prints on report_ignored_msrsSean Christopherson5-25/+24
Add helpers to print unimplemented MSR accesses and condition all such prints on report_ignored_msrs, i.e. honor userspace's request to not print unimplemented MSRs. Even though vcpu_unimpl() is ratelimited, printing can still be problematic, e.g. if a print gets stalled when host userspace is writing MSRs during live migration, an effective stall can result in very noticeable disruption in the guest. E.g. the profile below was taken while calling KVM_SET_MSRS on the PMU counters while the PMU was disabled in KVM. - 99.75% 0.00% [.] __ioctl - __ioctl - 99.74% entry_SYSCALL_64_after_hwframe do_syscall_64 sys_ioctl - do_vfs_ioctl - 92.48% kvm_vcpu_ioctl - kvm_arch_vcpu_ioctl - 85.12% kvm_set_msr_ignored_check svm_set_msr kvm_set_msr_common printk vprintk_func vprintk_default vprintk_emit console_unlock call_console_drivers univ8250_console_write serial8250_console_write uart_console_write Reported-by: Aaron Lewis <aaronlewis@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20230124234905.3774678-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-27KVM: x86/pmu: Cap kvm_pmu_cap.num_counters_gp at KVM's internal maxSean Christopherson4-4/+9
Limit kvm_pmu_cap.num_counters_gp during kvm_init_pmu_capability() based on the vendor PMU capabilities so that consuming num_counters_gp naturally does the right thing. This fixes a mostly theoretical bug where KVM could over-report its PMU support in KVM_GET_SUPPORTED_CPUID for leaf 0xA, e.g. if the number of counters reported by perf is greater than KVM's hardcoded internal limit. Incorporating input from the AMD PMU also avoids over-reporting MSRs to save when running on AMD. Link: https://lore.kernel.org/r/20230124234905.3774678-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-27KVM: x86/pmu: Drop event_type and rename "struct kvm_event_hw_type_mapping"Like Xu2-15/+12
After commit ("02791a5c362b KVM: x86/pmu: Use PERF_TYPE_RAW to merge reprogram_{gp,fixed}counter()"), vPMU starts to directly use the hardware event eventsel and unit_mask to reprogram perf_event, and the event_type field in the "struct kvm_event_hw_type_mapping" is simply no longer being used. Convert the struct into an anonymous struct as the current name is obsolete as the structure no longer has any mapping semantics, and placing the struct definition directly above its sole user makes its easier to understand what the array is filling in. Signed-off-by: Like Xu <likexu@tencent.com> Link: https://lore.kernel.org/r/20221205122048.16023-1-likexu@tencent.com [sean: drop new comment, use anonymous struct] Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-01-25KVM: x86: Propagate the AMD Automatic IBRS feature to the guestKim Phillips3-1/+7
Add the AMD Automatic IBRS feature bit to those being propagated to the guest, and enable the guest EFER bit. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230124163319.2277355-9-kim.phillips@amd.com
2023-01-25x86/cpu, kvm: Add the SMM_CTL MSR not present featureKim Phillips1-1/+1
The SMM_CTL MSR not present feature was being open-coded for KVM. Add it to its newly added CPUID leaf 0x80000021 EAX proper. Also drop the bit description comments now the code is more self-describing. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230124163319.2277355-7-kim.phillips@amd.com
2023-01-25x86/cpu, kvm: Add the Null Selector Clears Base featureKim Phillips1-2/+2
The Null Selector Clears Base feature was being open-coded for KVM. Add it to its newly added native CPUID leaf 0x80000021 EAX proper. Also drop the bit description comments now it's more self-describing. [ bp: Convert test in check_null_seg_clears_base() too. ] Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230124163319.2277355-6-kim.phillips@amd.com
2023-01-25x86/cpu, kvm: Move X86_FEATURE_LFENCE_RDTSC to its native leafKim Phillips1-3/+13
The LFENCE always serializing feature bit was defined as scattered LFENCE_RDTSC and its native leaf bit position open-coded for KVM. Add it to its newly added CPUID leaf 0x80000021 EAX proper. With LFENCE_RDTSC in its proper place, the kernel's set_cpu_cap() will effectively synthesize the feature for KVM going forward. Also, DE_CFG[1] doesn't need to be set on such CPUs anymore. [ bp: Massage and merge diff from Sean. ] Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230124163319.2277355-5-kim.phillips@amd.com
2023-01-25x86/cpu, kvm: Add the NO_NESTED_DATA_BP featureKim Phillips1-1/+1
The "Processor ignores nested data breakpoints" feature was being open-coded for KVM. Add the feature to its newly introduced CPUID leaf 0x80000021 EAX proper. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230124163319.2277355-4-kim.phillips@amd.com
2023-01-25KVM: x86: Move open-coded CPUID leaf 0x80000021 EAX bit propagation codeKim Phillips1-19/+12
Move code from __do_cpuid_func() to kvm_set_cpu_caps() in preparation for adding the features in their native leaf. Also drop the bit description comments as it will be more self-describing once the individual features are added. Whilst there, switch to using the more efficient cpu_feature_enabled() instead of static_cpu_has(). Note, LFENCE_RDTSC and "NULL selector clears base" are currently synthetic, Linux-defined feature flags as Linux tracking of the features predates AMD's definition. Keep the manual propagation of the flags from their synthetic counterparts until the kernel fully converts to AMD's definition, otherwise KVM would stop synthesizing the flags as intended. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230124163319.2277355-3-kim.phillips@amd.com
2023-01-25x86/cpu, kvm: Add support for CPUID_80000021_EAXKim Phillips1-0/+1
Add support for CPUID leaf 80000021, EAX. The majority of the features will be used in the kernel and thus a separate leaf is appropriate. Include KVM's reverse_cpuid entry because features are used by VM guests, too. [ bp: Massage commit message. ] Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230124163319.2277355-2-kim.phillips@amd.com
2023-01-24KVM: x86: Replace IS_ERR() with IS_ERR_VALUE()ye xingchen1-1/+1
Avoid type casts that are needed for IS_ERR() and use IS_ERR_VALUE() instead. Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Link: https://lore.kernel.org/r/202211161718436948912@zte.com.cn Signed-off-by: Sean Christopherson <seanjc@google.com>