path: root/arch/x86/include
Age | Commit message | Author | Files | Lines
2021-03-15 | x86/insn: Add a __ignore_sync_check__ marker | Borislav Petkov | 2 | -2/+2
Add an explicit __ignore_sync_check__ marker to mark lines which are supposed to be ignored by file synchronization check scripts; its advantage is that it explicitly denotes such lines in the code. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20210304174237.31945-4-bp@alien8.de
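A sketch of how such a marker is typically consumed; the file and include lines below are illustrative assumptions, not taken from the patch. The marker rides along as a comment on lines that legitimately differ between the kernel and its tools/ mirror, so the synchronization check scripts can filter them out before diffing:

    /* arch/x86/lib/insn.c (illustrative) */
    #include <asm/inat.h> /* __ignore_sync_check__ */
    #include <asm/insn.h> /* __ignore_sync_check__ */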
2021-03-15 | x86/insn: Rename insn_decode() to insn_decode_from_regs() | Borislav Petkov | 1 | -2/+2
Rename insn_decode() to insn_decode_from_regs() to denote that it receives regs as param and uses registers from there during decoding. Free the former name for a more generic version of the function. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210304174237.31945-2-bp@alien8.de
2021-03-15 | Merge tag 'v5.12-rc3' into x86/core | Borislav Petkov | 5 | -8/+24
Pick up dependent SEV-ES urgent changes to base new work on top. Signed-off-by: Borislav Petkov <bp@suse.de>
2021-03-15 | KVM: x86: Get active PCID only when writing a CR3 value | Sean Christopherson | 1 | -2/+2
Retrieve the active PCID only when writing a guest CR3 value, i.e. don't get the PCID when using EPT or NPT. The PCID is especially problematic for EPT as the bits have different meaning, and so the PCID must be manually stripped, which is annoying and unnecessary. And on VMX, getting the active PCID also involves reading the guest's CR3 and CR4.PCIDE, i.e. may add pointless VMREADs. Opportunistically rename the pgd/pgd_level params to root_hpa and root_level to better reflect their new roles. Keep the function names, as "load the guest PGD" is still accurate/correct. Last, and probably least, pass root_hpa as a hpa_t/u64 instead of an unsigned long. The EPTP holds a 64-bit value, even in 32-bit mode, so in theory EPT could support HIGHMEM for 32-bit KVM. Never mind that doing so would require changing the MMU page allocators and reworking the MMU to use kmap(). Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210305183123.3978098-2-seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-15 | KVM: x86/mmu: Move logic for setting SPTE masks for EPT into the MMU proper | Sean Christopherson | 1 | -3/+0
Let the MMU deal with the SPTE masks to avoid splitting the logic and knowledge across the MMU and VMX. The SPTE masks that are used for EPT are very, very tightly coupled to the MMU implementation. The use of available bits, the existence of A/D types, the fact that shadow_x_mask even exists, and so on and so forth are all baked into the MMU implementation. Cross referencing the params to the masks is also a nightmare, as pretty much every param is a u64. A future patch will make the location of the MMU_WRITABLE and HOST_WRITABLE bits MMU specific, to free up bit 11 for a MMU_PRESENT bit. Doing that change with the current kvm_mmu_set_mask_ptes() would be an absolute mess. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210225204749.1512652-18-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-15 | KVM: SVM: Add support for Virtual SPEC_CTRL | Babu Moger | 1 | -1/+3
Newer AMD processors have a feature to virtualize the use of the SPEC_CTRL MSR. Presence of this feature is indicated via CPUID function 0x8000000A_EDX[20]: GuestSpecCtrl. Hypervisors are not required to enable this feature since it is automatically enabled on processors that support it. A hypervisor may wish to impose speculation controls on guest execution or a guest may want to impose its own speculation controls. Therefore, the processor implements both host and guest versions of SPEC_CTRL. When in host mode, the host SPEC_CTRL value is in effect and writes update only the host version of SPEC_CTRL. On a VMRUN, the processor loads the guest version of SPEC_CTRL from the VMCB. When the guest writes SPEC_CTRL, only the guest version is updated. On a VMEXIT, the guest version is saved into the VMCB and the processor returns to only using the host SPEC_CTRL for speculation control. The guest SPEC_CTRL is located at offset 0x2E0 in the VMCB. The effective SPEC_CTRL setting is the guest SPEC_CTRL setting or'ed with the hypervisor SPEC_CTRL setting. This allows the hypervisor to ensure a minimum SPEC_CTRL if desired. This support also fixes an issue where a guest may sometimes see an inconsistent value for the SPEC_CTRL MSR on processors that support this feature. With the current SPEC_CTRL support, the first write to SPEC_CTRL is intercepted and the virtualized version of the SPEC_CTRL MSR is not updated. When the guest reads back the SPEC_CTRL MSR, it will be 0x0, instead of the actual expected value. There isn't a security concern here, because the host SPEC_CTRL value is or'ed with the guest SPEC_CTRL value to generate the effective SPEC_CTRL value. KVM writes the guest's virtualized SPEC_CTRL value to the SPEC_CTRL MSR just before VMRUN, so the guest will always have the actual value even though it doesn't appear that way in the guest. The guest will only see the proper value for the SPEC_CTRL register if it were to write to the SPEC_CTRL register again. With Virtual SPEC_CTRL support, the save area spec_ctrl is properly saved and restored, so the guest will always see the proper value when it is read back. Signed-off-by: Babu Moger <babu.moger@amd.com> Message-Id: <161188100955.28787.11816849358413330720.stgit@bmoger-ubuntu> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
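A minimal sketch of the key property described above, the hardware OR of host and guest values; the function and variable names are hypothetical, not KVM code:

    #include <linux/types.h>

    /* The host value acts as a floor the guest cannot lower. */
    static inline u64 effective_spec_ctrl(u64 host_spec_ctrl, u64 guest_spec_ctrl)
    {
            return host_spec_ctrl | guest_spec_ctrl;
    }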
2021-03-15 | x86/cpufeatures: Add the Virtual SPEC_CTRL feature | Babu Moger | 1 | -0/+1
Newer AMD processors have a feature to virtualize the use of the SPEC_CTRL MSR. Presence of this feature is indicated via CPUID function 0x8000000A_EDX[20]: GuestSpecCtrl. When present, the SPEC_CTRL MSR is automatically virtualized. Signed-off-by: Babu Moger <babu.moger@amd.com> Acked-by: Borislav Petkov <bp@suse.de> Message-Id: <161188100272.28787.4097272856384825024.stgit@bmoger-ubuntu> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
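A hypothetical userspace probe for this CPUID bit; the kernel itself uses its cpufeatures machinery instead, so this only shows where the bit lives:

    #include <cpuid.h>
    #include <stdio.h>

    int main(void)
    {
            unsigned int eax, ebx, ecx, edx;

            /* CPUID Fn8000_000A, EDX bit 20: GuestSpecCtrl */
            if (__get_cpuid(0x8000000A, &eax, &ebx, &ecx, &edx) &&
                (edx & (1u << 20)))
                    puts("Virtual SPEC_CTRL supported");
            else
                    puts("Virtual SPEC_CTRL not supported");
            return 0;
    }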
2021-03-15 | KVM: x86: Move RDPMC emulation to common code | Sean Christopherson | 1 | -1/+1
Move the entirety of the accelerated RDPMC emulation to x86.c, and assign the common handler directly to the exit handler array for VMX. SVM has bizarre nrips behavior that prevents it from directly invoking the common handler. The nrips goofiness will be addressed in a future patch. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210205005750.3841462-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-15 | KVM: x86: Move trivial instruction-based exit handlers to common code | Sean Christopherson | 1 | -0/+5
Move the trivial exit handlers, e.g. for instructions that KVM "emulates" as nops, to common x86 code. Assign the common handlers directly to the exit handler arrays and drop the vendor trampolines. Opportunistically use pr_warn_once() where appropriate. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210205005750.3841462-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-15 | KVM: x86: Move XSETBV emulation to common code | Sean Christopherson | 1 | -1/+1
Move the entirety of XSETBV emulation to x86.c, and assign the function directly to both VMX's and SVM's exit handlers, i.e. drop the unnecessary trampolines. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210205005750.3841462-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-15 | KVM: x86: Handle triple fault in L2 without killing L1 | Sean Christopherson | 1 | -0/+1
Synthesize a nested VM-Exit if L2 triggers an emulated triple fault instead of exiting to userspace, which likely will kill L1. Any flow that does KVM_REQ_TRIPLE_FAULT is suspect, but the most common scenario for L2 killing L1 is if L0 (KVM) intercepts a contributory exception that is _not_ intercepted by L1. E.g. if KVM is intercepting #GPs for the VMware backdoor, a #GP that occurs in L2 while vectoring an injected #DF will cause KVM to emulate triple fault. Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210302174515.2812275-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-15 | KVM: x86/mmu: Unexport MMU load/unload functions | Sean Christopherson | 1 | -3/+0
Unexport the MMU load and unload helpers now that they are no longer used (incorrectly) in vendor code. Opportunistically move the kvm_mmu_sync_roots() declaration into mmu.h, it should not be exposed to vendor code. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210305011101.3597423-16-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-15 | KVM: x86: to track if L1 is running L2 VM | Dongli Zhang | 1 | -0/+1
The new per-cpu stat 'nested_run' is introduced to track whether an L1 VM is running, or has been used to run, an L2 VM. One use of 'nested_run' is to help the host administrator easily determine whether any L1 VM is being used to run an L2 VM. Should an issue arise that may involve nested virtualization, the administrator will be able to easily narrow down and confirm via 'nested_run' whether the issue is due to nested virtualization; for example, whether a fix like commit 88dddc11a8d6 ("KVM: nVMX: do not use dangling shadow VMCS after guest reset") is required. Cc: Joe Jin <joe.jin@oracle.com> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Message-Id: <20210305225747.7682-1-dongli.zhang@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-14 | Merge tag 'objtool-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 1 | -6/+4
Pull objtool fix from Thomas Gleixner: "A single objtool fix to handle the PUSHF/POPF validation correctly for the paravirt changes which modified arch_local_irq_restore not to use popf" * tag 'objtool-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool,x86: Fix uaccess PUSHF/POPF validation
2021-03-14 | Merge tag 'x86_urgent_for_v5.12_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 3 | -0/+18
Pull x86 fixes from Borislav Petkov: - A couple of SEV-ES fixes and robustifications: verify usermode stack pointer in NMI is not coming from the syscall gap, correctly track IRQ states in the #VC handler and access user insn bytes atomically in same handler as latter cannot sleep. - Balance 32-bit fast syscall exit path to do the proper work on exit and thus not confuse audit and ptrace frameworks. - Two fixes for the ORC unwinder going "off the rails" into KASAN redzones and when ORC data is missing. * tag 'x86_urgent_for_v5.12_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev-es: Use __copy_from_user_inatomic() x86/sev-es: Correctly track IRQ states in runtime #VC handler x86/sev-es: Check regs->sp is trusted before adjusting #VC IST stack x86/sev-es: Introduce ip_within_syscall_gap() helper x86/entry: Fix entry/exit mismatch on failed fast 32-bit syscalls x86/unwind/orc: Silence warnings caused by missing ORC data x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2
2021-03-14 | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm | Linus Torvalds | 1 | -2/+2
Pull KVM fixes from Paolo Bonzini: "More fixes for ARM and x86" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: LAPIC: Advancing the timer expiration on guest initiated write KVM: x86/mmu: Skip !MMU-present SPTEs when removing SP in exclusive mode KVM: kvmclock: Fix vCPUs > 64 can't be online/hotpluged kvm: x86: annotate RCU pointers KVM: arm64: Fix exclusive limit for IPA size KVM: arm64: Reject VM creation when the default IPA size is unsupported KVM: arm64: Ensure I-cache isolation between vcpus of a same VM KVM: arm64: Don't use cbz/adr with external symbols KVM: arm64: Fix range alignment when walking page tables KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility KVM: arm64: Rename __vgic_v3_get_ich_vtr_el2() to __vgic_v3_get_gic_config() KVM: arm64: Don't access PMSELR_EL0/PMUSERENR_EL0 when no PMU is available KVM: arm64: Turn kvm_arm_support_pmu_v3() into a static key KVM: arm64: Fix nVHE hyp panic host context restore KVM: arm64: Avoid corrupting vCPU context register in guest exit KVM: arm64: nvhe: Save the SPE context early kvm: x86: use NULL instead of using plain integer as pointer KVM: SVM: Connect 'npt' module param to KVM's internal 'npt_enabled' KVM: x86: Ensure deadline timer has truly expired before posting its IRQ
2021-03-12 | kvm: x86: annotate RCU pointers | Muhammad Usama Anjum | 1 | -2/+2
This patch adds the annotation to fix the following sparse errors: arch/x86/kvm//x86.c:8147:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//x86.c:8147:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//x86.c:8147:15: struct kvm_apic_map * arch/x86/kvm//x86.c:10628:16: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//x86.c:10628:16: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//x86.c:10628:16: struct kvm_apic_map * arch/x86/kvm//x86.c:10629:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//x86.c:10629:15: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//x86.c:10629:15: struct kvm_pmu_event_filter * arch/x86/kvm//lapic.c:267:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:267:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:267:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:269:9: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:269:9: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:269:9: struct kvm_apic_map * arch/x86/kvm//lapic.c:637:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:637:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:637:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:994:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:994:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:994:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:1036:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:1036:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:1036:15: struct kvm_apic_map * arch/x86/kvm//lapic.c:1173:15: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//lapic.c:1173:15: struct kvm_apic_map [noderef] __rcu * arch/x86/kvm//lapic.c:1173:15: struct kvm_apic_map * arch/x86/kvm//pmu.c:190:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:190:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:190:18: struct kvm_pmu_event_filter * arch/x86/kvm//pmu.c:251:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:251:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:251:18: struct kvm_pmu_event_filter * arch/x86/kvm//pmu.c:522:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter * arch/x86/kvm//pmu.c:522:18: error: incompatible types in comparison expression (different address spaces): arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter [noderef] __rcu * arch/x86/kvm//pmu.c:522:18: struct kvm_pmu_event_filter * Signed-off-by: Muhammad Usama Anjum <musamaanjum@gmail.com> Message-Id: <20210305191123.GA497469@LEGION> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
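The pattern of the fix, sketched on a hypothetical container struct (the real fields live in KVM's arch structures): pointers published and read via RCU primitives carry the __rcu address-space annotation so both sides of sparse's comparison match:

    struct example_arch {
            struct kvm_apic_map __rcu *apic_map;
            struct kvm_pmu_event_filter __rcu *pmu_event_filter;
    };

    /* readers: map = rcu_dereference(arch->apic_map);       */
    /* writers: rcu_assign_pointer(arch->apic_map, new_map); */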
2021-03-12 | objtool,x86: Fix uaccess PUSHF/POPF validation | Peter Zijlstra | 1 | -6/+4
Commit ab234a260b1f ("x86/pv: Rework arch_local_irq_restore() to not use popf") replaced "push %reg; popf" with something like: "test $0x200, %reg; jz 1f; sti; 1:", which breaks the pushf/popf symmetry that commit ea24213d8088 ("objtool: Add UACCESS validation") relies on. The result is: drivers/gpu/drm/amd/amdgpu/si.o: warning: objtool: si_common_hw_init()+0xf36: PUSHF stack exhausted Meanwhile, commit c9c324dc22aa ("objtool: Support stack layout changes in alternatives") makes that we can actually use stack-ops in alternatives, which means we can revert 1ff865e343c2 ("x86,smap: Fix smap_{save,restore}() alternatives"). That in turn means we can limit the PUSHF/POPF handling of ea24213d8088 to those instructions that are in alternatives. Fixes: ab234a260b1f ("x86/pv: Rework arch_local_irq_restore() to not use popf") Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/YEY4rIbQYa5fnnEp@hirez.programming.kicks-ass.net
2021-03-11 | x86/paravirt: Have only one paravirt patch function | Juergen Gross | 1 | -18/+1
There is no need any longer to have different paravirt patch functions for native and Xen. Eliminate native_patch() and rename paravirt_patch_default() to paravirt_patch(). Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-15-jgross@suse.com
2021-03-11 | x86/paravirt: Switch functions with custom code to ALTERNATIVE | Juergen Gross | 3 | -58/+51
Instead of using paravirt patching for custom code sequences, use ALTERNATIVE for the functions with custom code replacements. Instead of patching an ud2 instruction for unpopulated vector entries into the caller site, use a simple function just calling BUG() as a replacement. Simplify the register defines for assembler paravirt calling, as there isn't much usage left. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-14-jgross@suse.com
2021-03-11 | x86/paravirt: Add new PVOP_ALT* macros to support pvops in ALTERNATIVEs | Juergen Gross | 1 | -1/+48
Instead of using paravirt patching for custom code sequences, add support for using ALTERNATIVE handling combined with paravirt call patching. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-13-jgross@suse.com
2021-03-11 | x86/paravirt: Switch iret pvops to ALTERNATIVE | Juergen Gross | 2 | -7/+4
The iret paravirt op is rather special as it is using a jmp instead of a call instruction. Switch it to ALTERNATIVE. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-12-jgross@suse.com
2021-03-11 | x86/paravirt: Simplify paravirt macros | Juergen Gross | 1 | -29/+12
The central pvops call macros ____PVOP_CALL() and ____PVOP_VCALL() look very similar now. The main differences are the use of PVOP_VCALL_ARGS vs. PVOP_CALL_ARGS, which are identical, and the return value handling. So drop PVOP_VCALL_ARGS and, instead of ____PVOP_VCALL(), just use (void)____PVOP_CALL(long, ...). Note that it isn't easily possible to just redefine ____PVOP_VCALL() to use ____PVOP_CALL() instead, as this would require further hiding of commas in macro parameters. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-11-jgross@suse.com
2021-03-11 | x86/paravirt: Remove no longer needed 32-bit pvops cruft | Juergen Gross | 3 | -119/+33
PVOP_VCALL4() is only used for Xen PV, while PVOP_CALL4() isn't used at all. Keep PVOP_CALL4() for 64 bits due to symmetry reasons. This allows removing the 32-bit definitions of those macros, leading to a substantial simplification of the paravirt macros, as those were the only ones needing non-empty "pre" and "post" parameters. PVOP_CALLEE2() and PVOP_VCALLEE2() are used nowhere, so remove them. Another no longer needed case is special handling of return types larger than unsigned long. Replace that with a BUILD_BUG_ON(). DISABLE_INTERRUPTS() is used in 32-bit code only, so it can just be replaced by cli. INTERRUPT_RETURN in 32-bit code can be replaced by iret. ENABLE_INTERRUPTS is used nowhere, so it can be removed. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-10-jgross@suse.com
2021-03-11 | x86/paravirt: Add new features for paravirt patching | Juergen Gross | 2 | -0/+12
To be able to switch paravirt patching from special-cased custom code sequences to ALTERNATIVE handling, some X86_FEATURE_* flags are needed as new features. This enables having the standard indirect pv call as the default code, and patching that with the non-Xen custom code sequence via ALTERNATIVE patching later. Make sure paravirt patching is performed before alternatives patching. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-9-jgross@suse.com
2021-03-11 | x86/alternative: Use ALTERNATIVE_TERNARY() in _static_cpu_has() | Juergen Gross | 1 | -32/+9
_static_cpu_has() contains a completely open coded version of ALTERNATIVE_TERNARY(). Replace that with the macro instead. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210311142319.4723-8-jgross@suse.com
2021-03-11 | x86/alternative: Support ALTERNATIVE_TERNARY | Juergen Gross | 1 | -0/+13
Add ALTERNATIVE_TERNARY support for replacing an initial instruction with either of two instructions depending on a feature: ALTERNATIVE_TERNARY "default_instr", FEATURE_NR, "feature_on_instr", "feature_off_instr" which will start with "default_instr" and at patch time will, depending on FEATURE_NR being set or not, patch that with either "feature_on_instr" or "feature_off_instr". [ bp: Add comment ontop. ] Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-7-jgross@suse.com
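From memory of the patch, and to be treated as a sketch: the ternary form can be built on the existing two-alternative macro by keying the "off" replacement to X86_FEATURE_ALWAYS, which is set on every CPU, so exactly one of the two replacements is always patched in:

    #define ALTERNATIVE_TERNARY(oldinstr, ftr, newinstr_yes, newinstr_no) \
            ALTERNATIVE_2(oldinstr, newinstr_no, X86_FEATURE_ALWAYS,      \
                          newinstr_yes, ftr)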
2021-03-11 | x86/alternative: Support not-feature | Juergen Gross | 1 | -0/+3
Add support for alternative patching for the case a feature is not present on the current CPU. For users of ALTERNATIVE() and friends, an inverted feature is specified by applying the ALT_NOT() macro to it, e.g.: ALTERNATIVE(old, new, ALT_NOT(feature)); Committer note: The decision to encode the NOT-bit in the feature bit itself is because a future change which would make objtool generate such alternative calls, would keep the code in objtool itself fairly simple. Also, this allows for the alternative macros to support the NOT feature without having to change them. Finally, the u16 cpuid member encoding the X86_FEATURE_ flags is not an ABI so if more bits are needed, cpuid itself can be enlarged or a flags field can be added to struct alt_instr after having considered the size growth in either cases. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210311142319.4723-6-jgross@suse.com
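A hedged sketch of the encoding idea from the committer note; the exact bit position is an assumption:

    #define ALTINSTR_FLAG_INV (1 << 15)
    #define ALT_NOT(feature)  ((feature) | ALTINSTR_FLAG_INV)

    /* usage: the replacement applies only when the feature is NOT set */
    /* ALTERNATIVE("call *pv_op", "sti", ALT_NOT(X86_FEATURE_XENPV))   */

At patch time the flag inverts the feature test for that alternatives entry.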
2021-03-11 | x86/paravirt: Switch time pvops functions to use static_call() | Juergen Gross | 3 | -10/+13
The time pvops functions are the only ones left which might be used in 32-bit mode and which return a 64-bit value. Switch them to use the static_call() mechanism instead of pvops, as this allows quite some simplification of the pvops implementation. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-5-jgross@suse.com
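A minimal sketch of the switch using the static_call() machinery; the symbol names follow the commit's intent, but the wiring here is illustrative:

    #include <linux/static_call.h>

    DEFINE_STATIC_CALL(pv_sched_clock, native_sched_clock);

    static inline u64 paravirt_sched_clock(void)
    {
            return static_call(pv_sched_clock)();
    }

    /* a hypervisor overrides it once during boot, e.g.: */
    /* static_call_update(pv_sched_clock, xen_sched_clock); */

Unlike a pvops indirect call, the static call is patched into a direct call, and no 32-bit/64-bit return-value special casing is needed in the pvops macros.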
2021-03-11 | x86/alternative: Merge include files | Juergen Gross | 4 | -122/+110
Merge arch/x86/include/asm/alternative-asm.h into arch/x86/include/asm/alternative.h in order to make it easier to use common definitions later. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210311142319.4723-2-jgross@suse.com
2021-03-11 | x86/setup: Remove unused RESERVE_BRK_ARRAY() | Cao jin | 1 | -5/+0
Since a13f2ef168cb ("x86/xen: remove 32-bit Xen PV guest support"), RESERVE_BRK_ARRAY() has no user anymore so drop it. Update related comments too. Signed-off-by: Cao jin <jojing64@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210311083919.27530-1-jojing64@gmail.com
2021-03-09 | x86/alternative: Drop unused feature parameter from ALTINSTR_REPLACEMENT() | Juergen Gross | 1 | -7/+7
The macro ALTINSTR_REPLACEMENT() doesn't make use of the feature parameter, so drop it. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210309134813.23912-4-jgross@suse.com
2021-03-09 | x86/sev-es: Use __copy_from_user_inatomic() | Joerg Roedel | 1 | -0/+2
The #VC handler must run in atomic context and cannot sleep. This is a problem when it tries to fetch instruction bytes from user-space via copy_from_user(). Introduce an insn_fetch_from_user_inatomic() helper which uses __copy_from_user_inatomic() to safely copy the instruction bytes to kernel memory in the #VC handler. Fixes: 5e3427a7bc432 ("x86/sev-es: Handle instruction fetches from user-space") Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org # v5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-6-joro@8bytes.org
2021-03-08 | x86: Use ELF fields defined in 'struct kimage' | Lakshmi Ramasubramanian | 1 | -5/+0
ELF related fields elf_headers, elf_headers_sz, and elf_load_addr have been moved from 'struct kimage_arch' to 'struct kimage'. Use the ELF fields defined in 'struct kimage'. Suggested-by: Rob Herring <robh@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20210221174930.27324-5-nramas@linux.microsoft.com
2021-03-08 | clocksource/drivers/hyper-v: Move handling of STIMER0 interrupts | Michael Kelley | 1 | -4/+0
STIMER0 interrupts are most naturally modeled as per-cpu IRQs. But because x86/x64 doesn't have per-cpu IRQs, the core STIMER0 interrupt handling machinery is done in code under arch/x86 and Linux IRQs are not used. Adding support for ARM64 means adding equivalent code using per-cpu IRQs under arch/arm64. A better model is to treat per-cpu IRQs as the normal path (which it is for modern architectures), and the x86/x64 path as the exception. Do this by incorporating standard Linux per-cpu IRQ allocation into the main STIMER0 driver code, and bypass it in the x86/x64 exception case. For x86/x64, special case code is retained under arch/x86, but no STIMER0 interrupt handling code is needed under arch/arm64. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1614721102-2241-11-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08 | clocksource/drivers/hyper-v: Handle sched_clock differences inline | Michael Kelley | 1 | -11/+0
While the Hyper-V Reference TSC code is architecture neutral, the pv_ops.time.sched_clock() function is implemented for x86/x64, but not for ARM64. Current code calls a utility function under arch/x86 (and coming, under arch/arm64) to handle the difference. Change this approach to handle the difference inline based on whether GENERIC_SCHED_CLOCK is present. The new approach removes code under arch/* since the difference is tied more to the specifics of the Linux implementation than to the architecture. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1614721102-2241-9-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08 | clocksource/drivers/hyper-v: Handle vDSO differences inline | Michael Kelley | 1 | -4/+0
While the driver for the Hyper-V Reference TSC and STIMERs is architecture neutral, vDSO is implemented for x86/x64, but not for ARM64. Current code calls into utility functions under arch/x86 (and coming, under arch/arm64) to handle the difference. Change this approach to handle the difference inline based on whether VDSO_CLOCK_MODE_HVCLOCK is present. The new approach removes code under arch/* since the difference is tied more to the specifics of the Linux implementation than to the architecture. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1614721102-2241-8-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08 | Drivers: hv: vmbus: Move handling of VMbus interrupts | Michael Kelley | 1 | -1/+0
VMbus interrupts are most naturally modelled as per-cpu IRQs. But because x86/x64 doesn't have per-cpu IRQs, the core VMbus interrupt handling machinery is done in code under arch/x86 and Linux IRQs are not used. Adding support for ARM64 means adding equivalent code using per-cpu IRQs under arch/arm64. A better model is to treat per-cpu IRQs as the normal path (which it is for modern architectures), and the x86/x64 path as the exception. Do this by incorporating standard Linux per-cpu IRQ allocation into the main VMbus driver, and bypassing it in the x86/x64 exception case. For x86/x64, special case code is retained under arch/x86, but no VMbus interrupt handling code is needed under arch/arm64. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/1614721102-2241-7-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08 | Drivers: hv: vmbus: Handle auto EOI quirk inline | Michael Kelley | 1 | -3/+0
On x86/x64, Hyper-V provides a flag to indicate auto EOI functionality, but it doesn't on ARM64. Handle this quirk inline instead of calling into code under arch/x86 (and coming, under arch/arm64). No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/1614721102-2241-6-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08 | Drivers: hv: Redo Hyper-V synthetic MSR get/set functions | Michael Kelley | 2 | -67/+74
Current code defines a separate get and set macro for each Hyper-V synthetic MSR used by the VMbus driver. Furthermore, the get macro can't be converted to a standard function because the second argument is modified in place, which is somewhat bad form. Redo this by providing a single get and a single set function that take a parameter specifying the MSR to be operated on. Fixup usage of the get function. Calling locations are no more complex than before, but the code under arch/x86 and the upcoming code under arch/arm64 is significantly simplified. Also standardize the names of Hyper-V synthetic MSRs that are architecture neutral. But keep the old x86-specific names as aliases that can be removed later when all references (particularly in KVM code) have been cleaned up in a separate patch series. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/1614721102-2241-4-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
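A hedged sketch of the before/after shape; the names echo the series but the signatures here are illustrative:

    /* before: one macro pair per MSR, modifying its argument in place */
    /* hv_get_simp(simp.as_uint64); */

    /* after: one accessor pair, parameterized by the register */
    u64 hv_get_register(unsigned int reg);
    void hv_set_register(unsigned int reg, u64 value);

    /* simp.as_uint64 = hv_get_register(HV_REGISTER_SIMP); */

On x86 the accessors map to rdmsr/wrmsr; an ARM64 implementation can back the same interface with its own mechanism.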
2021-03-08 | x86/hyper-v: Move hv_message_type to architecture neutral module | Michael Kelley | 1 | -29/+0
The definition of enum hv_message_type includes arch neutral and x86/x64-specific values. Ideally there would be a way to put the arch neutral values in an arch neutral module, and the arch specific values in an arch specific module. But C doesn't provide a way to extend enum types. As a compromise, move the entire definition into an arch neutral module, to avoid duplicating the arch neutral values for x86/x64 and for ARM64. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/1614721102-2241-3-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08 | Drivers: hv: vmbus: Move Hyper-V page allocator to arch neutral code | Michael Kelley | 1 | -5/+0
The Hyper-V page allocator functions are implemented in an architecture neutral way. Move them into the architecture neutral VMbus module so a separate implementation for ARM64 is not needed. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/1614721102-2241-2-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-03-08 | x86/sev-es: Introduce ip_within_syscall_gap() helper | Joerg Roedel | 2 | -0/+16
Introduce a helper to check whether an exception came from the syscall gap and use it in the SEV-ES code. Extend the check to also cover the compatibility SYSCALL entry path. Fixes: 315562c9af3d5 ("x86/sev-es: Adjust #VC IST Stack on entering NMI handler") Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org # 5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-2-joro@8bytes.org
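A sketch of the helper from memory of the patch (treat the exact bounds as approximate): the *_safe_stack entry symbols delimit the code region where the kernel still runs on the user stack:

    static __always_inline bool ip_within_syscall_gap(struct pt_regs *regs)
    {
            bool ret = (regs->ip >= (unsigned long)entry_SYSCALL_64 &&
                        regs->ip <  (unsigned long)entry_SYSCALL_64_safe_stack);

    #ifdef CONFIG_IA32_EMULATION
            /* the compatibility SYSCALL entry path mentioned above */
            ret = ret || (regs->ip >= (unsigned long)entry_SYSCALL_compat &&
                          regs->ip <  (unsigned long)entry_SYSCALL_compat_safe_stack);
    #endif

            return ret;
    }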
2021-03-08 | x86/stackprotector/32: Make the canary into a regular percpu variable | Andy Lutomirski | 5 | -101/+34
On 32-bit kernels, the stackprotector canary is quite nasty -- it is stored at %gs:(20), which is nasty because 32-bit kernels use %fs for percpu storage. It's even nastier because it means that whether %gs contains userspace state or kernel state while running kernel code depends on whether stackprotector is enabled (this is CONFIG_X86_32_LAZY_GS), and this setting radically changes the way that segment selectors work. Supporting both variants is a maintenance and testing mess. Merely rearranging so that percpu and the stack canary share the same segment would be messy as the 32-bit percpu address layout isn't currently compatible with putting a variable at a fixed offset. Fortunately, GCC 8.1 added options that allow the stack canary to be accessed as %fs:__stack_chk_guard, effectively turning it into an ordinary percpu variable. This lets us get rid of all of the code to manage the stack canary GDT descriptor and the CONFIG_X86_32_LAZY_GS mess. (That name is special. We could use any symbol we want for the %fs-relative mode, but for CONFIG_SMP=n, gcc refuses to let us use any name other than __stack_chk_guard.) Forcibly disable stackprotector on older compilers that don't support the new options and turn the stack canary into a percpu variable. The "lazy GS" approach is now used for all 32-bit configurations. Also makes load_gs_index() work on 32-bit kernels. On 64-bit kernels, it loads the GS selector and updates the user GSBASE accordingly. (This is unchanged.) On 32-bit kernels, it loads the GS selector and updates GSBASE, which is now always the user base. This means that the overall effect is the same on 32-bit and 64-bit, which avoids some ifdeffery. [ bp: Massage commit message. ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/c0ff7dba14041c7e5d1cae5d4df052f03759bef3.1613243844.git.luto@kernel.org
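The end state, sketched rather than quoted from the patch: the canary becomes an ordinary percpu variable and GCC 8.1+ is pointed at it with the new options:

    /* arch/x86/include/asm/stackprotector.h (illustrative) */
    DECLARE_PER_CPU(unsigned long, __stack_chk_guard);

    /* 32-bit Makefile flags, roughly:                     */
    /*   -mstack-protector-guard-reg=fs                    */
    /*   -mstack-protector-guard-symbol=__stack_chk_guard  */

With that, the compiler emits %fs:__stack_chk_guard accesses directly and no stack canary GDT descriptor management is needed.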
2021-03-08 | x86: Remove duplicate TSC DEADLINE MSR definitions | Dave Hansen | 1 | -2/+0
There are two definitions for the TSC deadline MSR in msr-index.h, one with an underscore and one without. Axe one of them and move all the references over to the other one. [ bp: Fixup the MSR define in handle_fastpath_set_msr_irqoff() too. ] Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200305174706.0D6B8EE4@viggo.jf.intel.com
2021-03-06 | Merge branch 'locking/core' into x86/mm, to resolve conflict | Ingo Molnar | 1 | -2/+2
There's a non-trivial conflict between the parallel TLB flush framework and the IPI flush debugging code - merge them manually. Conflicts: kernel/smp.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-03-06 | x86/mm/tlb: Privatize cpu_tlbstate | Nadav Amit | 1 | -18/+21
cpu_tlbstate is mostly private and only the variable is_lazy is shared. This causes some false-sharing when TLB flushes are performed. Break cpu_tlbstate into cpu_tlbstate and cpu_tlbstate_shared, and mark each one accordingly. Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20210220231712.2475218-6-namit@vmware.com
2021-03-06 | x86/mm/tlb: Flush remote and local TLBs concurrently | Nadav Amit | 4 | -8/+8
To improve TLB shootdown performance, flush the remote and local TLBs concurrently. Introduce flush_tlb_multi() that does so. Introduce paravirtual versions of flush_tlb_multi() for KVM, Xen and Hyper-V (Xen and Hyper-V are only compile-tested). While the updated SMP infrastructure is capable of running a function on a single local core, it is not optimized for this case. The multiple function calls and the indirect branch introduce some overhead, and might make local TLB flushes slower than they were before the recent changes. Before calling the SMP infrastructure, check if only a local TLB flush is needed to restore the lost performance in this common case. This requires checking mm_cpumask() one more time, but unless this mask is updated very frequently, this should not impact performance negatively. Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com> # Hyper-v parts Reviewed-by: Juergen Gross <jgross@suse.com> # Xen and paravirt parts Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20210220231712.2475218-5-namit@vmware.com
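A pseudo-code sketch of the local-only fast path described above; the helper names are hypothetical stand-ins for the patch's actual functions:

    /* hypothetical helpers standing in for the real ones */
    extern void flush_tlb_func_local_sketch(const struct flush_tlb_info *info);
    extern void flush_tlb_func_ipi_sketch(void *info);

    static void flush_tlb_multi_sketch(const struct cpumask *cpumask,
                                       const struct flush_tlb_info *info)
    {
            /* common case: only this CPU needs a flush, so skip the SMP
             * machinery and its indirect-call overhead entirely */
            if (cpumask_any_but(cpumask, smp_processor_id()) >= nr_cpu_ids) {
                    flush_tlb_func_local_sketch(info);
                    return;
            }

            /* otherwise, flush remote and local TLBs concurrently */
            on_each_cpu_mask(cpumask, flush_tlb_func_ipi_sketch,
                             (void *)info, true);
    }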
2021-03-06 | x86/mm/tlb: Unify flush_tlb_func_local() and flush_tlb_func_remote() | Nadav Amit | 1 | -2/+3
The unification of these two functions allows using them in the updated SMP infrastructure. To do so, remove the reason argument from flush_tlb_func_local(), add a member to struct flush_tlb_info that says which CPU initiated the flush, and act accordingly. Optimize the size of flush_tlb_info while we are at it. Unfortunately, this prevents us from using a constant flush_tlb_info for arch_tlbbatch_flush(), but in a later stage we may be able to inline flush_tlb_info into the IPI data, so it should not have an impact eventually. Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20210220231712.2475218-3-namit@vmware.com
2021-03-06 | x86/jump_label: Mark arguments as const to satisfy asm constraints | Jason Gerecke | 1 | -2/+2
When compiling an external kernel module with `-O0` or `-O1`, the following compile error may be reported: ./arch/x86/include/asm/jump_label.h:25:2: error: impossible constraint in ‘asm’ 25 | asm_volatile_goto("1:" | ^~~~~~~~~~~~~~~~~ It appears that these lower optimization levels prevent GCC from detecting that the key/branch arguments can be treated as constants and used as immediate operands. To work around this, explicitly add the `const` label. Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20210211214848.536626-1-jason.gerecke@wacom.com
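The shape of the fix, as a sketch of the signature change (the asm body is elided):

    -static __always_inline bool arch_static_branch(struct static_key *key,
    -                                               bool branch)
    +static __always_inline bool arch_static_branch(struct static_key * const key,
    +                                               const bool branch)

With both parameters const-qualified, the "i" (immediate) asm constraints can be satisfied even when GCC is not propagating constants at -O0/-O1.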