path: root/arch/x86
Age  Commit message  Author  Files  Lines
2025-02-26x86/ibt: Add exact_endbr() helperPeter Zijlstra1-3/+17
For when we want to exactly match ENDBR, and not everything that we can scribble it with. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Link: https://lore.kernel.org/r/20250224124200.059556588@infradead.org
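The difference between an exact ENDBR match and a looser "ENDBR or whatever it may be overwritten with" match can be sketched in plain C. This is not the kernel helper: the function names, the overwrite pattern, and the treatment of an instruction as a little-endian 32-bit word are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* ENDBR64 is encoded as F3 0F 1E FA; shown here as a little-endian u32. */
#define ENDBR64_INSN_SKETCH      0xfa1e0ff3u
/* Hypothetical pattern standing in for "what ENDBR can be scribbled with". */
#define ENDBR_SCRIBBLE_SKETCH    0x001f0f66u

/* Exact match: only a real ENDBR64 qualifies. */
bool exact_endbr_sketch(uint32_t insn)
{
	return insn == ENDBR64_INSN_SKETCH;
}

/* Loose match: a real ENDBR64 or the assumed overwrite pattern. */
bool loose_endbr_sketch(uint32_t insn)
{
	return insn == ENDBR64_INSN_SKETCH || insn == ENDBR_SCRIBBLE_SKETCH;
}
```

A caller that must know an ENDBR is really present would use the exact variant; a caller that only asks whether the site was ever an ENDBR would use the loose one.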
2025-02-26x86/cfi: Add 'cfi=warn' boot optionPeter Zijlstra1-0/+3
Rebuilding with CONFIG_CFI_PERMISSIVE=y enabled is such a pain, esp. since clang is so slow. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Link: https://lore.kernel.org/r/20250224124159.924496481@infradead.org
2025-02-26KVM: nVMX: Process events on nested VM-Exit if injectable IRQ or NMI is pendingSean Christopherson1-0/+11
Process pending events on nested VM-Exit if the vCPU has an injectable IRQ or NMI, as the event may have become pending while L2 was active, i.e. may not be tracked in the context of vmcs01. E.g. if L1 has passed its APIC through to L2 and an IRQ arrives while L2 is active, then KVM needs to request an IRQ window prior to running L1, otherwise delivery of the IRQ will be delayed until KVM happens to process events for some other reason. The missed failure is detected by vmx_apic_passthrough_tpr_threshold_test in KVM-Unit-Tests, but has effectively been masked due to a flaw in KVM's PIC emulation that causes KVM to make spurious KVM_REQ_EVENT requests (and apparently no one ever ran the test with split IRQ chips). Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20250224235542.2562848-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-02-26KVM: x86: Free vCPUs before freeing VM stateSean Christopherson1-1/+1
Free vCPUs before freeing any VM state, as both SVM and VMX may access VM state when "freeing" a vCPU that is currently "in" L2, i.e. that needs to be kicked out of nested guest mode. Commit 6fcee03df6a1 ("KVM: x86: avoid loading a vCPU after .vm_destroy was called") partially fixed the issue, but for unknown reasons only moved the MMU unloading before VM destruction. Complete the change, and free all vCPU state prior to destroying VM state, as nVMX accesses even more state than nSVM. In addition to the AVIC, KVM can hit a use-after-free on MSR filters: kvm_msr_allowed+0x4c/0xd0 __kvm_set_msr+0x12d/0x1e0 kvm_set_msr+0x19/0x40 load_vmcs12_host_state+0x2d8/0x6e0 [kvm_intel] nested_vmx_vmexit+0x715/0xbd0 [kvm_intel] nested_vmx_free_vcpu+0x33/0x50 [kvm_intel] vmx_free_vcpu+0x54/0xc0 [kvm_intel] kvm_arch_vcpu_destroy+0x28/0xf0 kvm_vcpu_destroy+0x12/0x50 kvm_arch_destroy_vm+0x12c/0x1c0 kvm_put_kvm+0x263/0x3c0 kvm_vm_release+0x21/0x30 and an upcoming fix to process injectable interrupts on nested VM-Exit will access the PIC: BUG: kernel NULL pointer dereference, address: 0000000000000090 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page CPU: 23 UID: 1000 PID: 2658 Comm: kvm-nx-lpage-re RIP: 0010:kvm_cpu_has_extint+0x2f/0x60 [kvm] Call Trace: <TASK> kvm_cpu_has_injectable_intr+0xe/0x60 [kvm] nested_vmx_vmexit+0x2d7/0xdf0 [kvm_intel] nested_vmx_free_vcpu+0x40/0x50 [kvm_intel] vmx_vcpu_free+0x2d/0x80 [kvm_intel] kvm_arch_vcpu_destroy+0x2d/0x130 [kvm] kvm_destroy_vcpus+0x8a/0x100 [kvm] kvm_arch_destroy_vm+0xa7/0x1d0 [kvm] kvm_destroy_vm+0x172/0x300 [kvm] kvm_vcpu_release+0x31/0x50 [kvm] Inarguably, both nSVM and nVMX need to be fixed, but punt on those cleanups for the moment. Conceptually, vCPUs should be freed before VM state. Assets like the I/O APIC and PIC _must_ be allocated before vCPUs are created, so it stands to reason that they must be freed _after_ vCPUs are destroyed. 
Reported-by: Aaron Lewis <aaronlewis@google.com> Closes: https://lore.kernel.org/all/20240703175618.2304869-2-aaronlewis@google.com Cc: Jim Mattson <jmattson@google.com> Cc: Yan Zhao <yan.y.zhao@intel.com> Cc: Rick P Edgecombe <rick.p.edgecombe@intel.com> Cc: Kai Huang <kai.huang@intel.com> Cc: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20250224235542.2562848-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-02-26KVM: SVM: Add Idle HLT intercept supportManali Shukla3-3/+11
Add support for "Idle HLT" interception on AMD CPUs, and enable Idle HLT interception instead of "normal" HLT interception for all VMs for which HLT-exiting is enabled. Idle HLT provides a mild performance boost for all VM types, by avoiding a VM-Exit in the scenario where KVM would immediately "wake" and resume the vCPU. Idle HLT makes HLT-exiting conditional on the vCPU not having a valid, unmasked interrupt. Specifically, a VM-Exit occurs on execution of HLT if and only if there are no pending V_IRQ or V_NMI events. Note, Idle HLT is a replacement for full HLT interception, i.e. enabling HLT interception would result in all HLT instructions causing unconditional VM-Exits. Per the APM: When both HLT and Idle HLT intercepts are active at the same time, the HLT intercept takes priority. This intercept occurs only if a virtual interrupt is not pending (V_INTR or V_NMI). For KVM's use of V_IRQ (also called V_INTR in the APM) to detect interrupt windows, the net effect of enabling Idle HLT is that, if a virtual interrupt is pending and unmasked at the time of HLT, the vCPU will take a V_IRQ intercept instead of a HLT intercept. When AVIC is enabled, Idle HLT works as intended: the vCPU continues unimpeded and services the pending virtual interrupt. Note, the APM's description of V_IRQ interaction with AVIC is quite confusing, and requires piecing together implied behavior. Per the APM, when AVIC is enabled, V_IRQ *from the VMCB* is ignored: When AVIC mode is enabled for a virtual processor, the V_IRQ, V_INTR_PRIO, V_INTR_VECTOR, and V_IGN_TPR fields in the VMCB are ignored. Which seems to contradict the behavior of Idle HLT: This intercept occurs only if a virtual interrupt is not pending (V_INTR or V_NMI). What's not explicitly stated is that hardware's internal copy of V_IRQ (and related fields) *are* still active, i.e. are presumably used to cache information from the virtual APIC. Handle Idle HLT exits as if they were normal HLT exits, e.g.
don't try to optimize the handling under the assumption that there isn't a pending IRQ. Irrespective of AVIC, Idle HLT is inherently racy with respect to the vIRR, as KVM can set vIRR bits asynchronously. No changes are required to support KVM's use of Idle HLT while running L2. In fact, supporting Idle HLT is actually a bug fix to some extent. If L1 wants to intercept HLT, recalc_intercepts() will enable HLT interception in vmcb02 and forward the intercept to L1 as normal. But if L1 does not want to intercept HLT, then KVM will run L2 with Idle HLT enabled and HLT interception disabled. If a V_IRQ or V_NMI for L2 becomes pending and L2 executes HLT, then use of Idle HLT will do the right thing, i.e. not #VMEXIT and instead deliver the virtual event. KVM currently doesn't handle this scenario correctly, e.g. doesn't check V_IRQ or V_NMI in vmcb02 as part of kvm_vcpu_has_events(). Do not expose Idle HLT to L1 at this time, as supporting nested Idle HLT is more complex than just enumerating the feature, e.g. requires KVM to handle the aforementioned scenarios of V_IRQ and V_NMI at the time of exit. Signed-off-by: Manali Shukla <Manali.Shukla@amd.com> Reviewed-by: Nikunj A Dadhania <nikunj@amd.com> Link: https://bugzilla.kernel.org/attachment.cgi?id=306250 Link: https://lore.kernel.org/r/20250128124812.7324-3-manali.shukla@amd.com [sean: rewrite changelog, drop nested "support"] Signed-off-by: Sean Christopherson <seanjc@google.com>
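The intercept rules described in this changelog can be modeled as a small decision function. The sketch below is only a model of the quoted APM behavior, not KVM or hardware code; the struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state relevant to a guest HLT. */
struct hlt_state_sketch {
	bool hlt_intercept;       /* "normal" HLT intercept enabled */
	bool idle_hlt_intercept;  /* Idle HLT intercept enabled */
	bool v_irq_pending;       /* valid, unmasked virtual interrupt pending */
	bool v_nmi_pending;       /* virtual NMI pending */
};

/* Returns true if executing HLT causes a VM-Exit under this model. */
bool hlt_causes_vmexit_sketch(const struct hlt_state_sketch *s)
{
	/* When both intercepts are active, the full HLT intercept wins. */
	if (s->hlt_intercept)
		return true;

	/* Idle HLT exits if and only if no virtual event is pending. */
	if (s->idle_hlt_intercept)
		return !s->v_irq_pending && !s->v_nmi_pending;

	return false;
}
```

The model shows why enabling both intercepts defeats the purpose: the unconditional path is taken before the pending-event check is ever reached.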
2025-02-26x86/cpufeatures: Add CPUID feature bit for Idle HLT interceptManali Shukla1-0/+1
The Idle HLT Intercept feature allows for the HLT instruction execution by a vCPU to be intercepted by the hypervisor only if there are no pending events (V_INTR and V_NMI) for the vCPU. When the vCPU is expected to service the pending events (V_INTR and V_NMI), the Idle HLT intercept won’t trigger. The feature allows the hypervisor to determine if the vCPU is idle and reduces wasteful VMEXITs. In addition to the aforementioned use case, the Idle HLT intercept feature is also used for enlightened guests who aim to securely manage events without the hypervisor’s awareness. If a HLT occurs while a virtual event is pending and the hypervisor is unaware of this pending event (as could be the case with enlightened guests), the absence of the Idle HLT intercept feature could result in a vCPU being suspended indefinitely. Presence of Idle HLT intercept feature for guests is indicated via CPUID function 0x8000000A_EDX[30]. Signed-off-by: Manali Shukla <Manali.Shukla@amd.com> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20250128124812.7324-2-manali.shukla@amd.com Signed-off-by: Sean Christopherson <seanjc@google.com>
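As a minimal sketch of the enumeration described above, the following tests bit 30 of an EDX value assumed to have been read from CPUID function 0x8000000A. The macro and function names are hypothetical, and the helper deliberately takes the register value as a parameter so that it does not depend on any particular way of executing CPUID.

```c
#include <stdbool.h>
#include <stdint.h>

#define SVM_FEATURE_LEAF_SKETCH  0x8000000Au  /* CPUID function from the changelog */
#define IDLE_HLT_EDX_BIT_SKETCH  30           /* EDX bit from the changelog */

/* 'edx' is assumed to be the EDX output of CPUID function 0x8000000A. */
bool edx_has_idle_hlt_sketch(uint32_t edx)
{
	return (edx >> IDLE_HLT_EDX_BIT_SKETCH) & 1u;
}
```

In userspace built with GCC or Clang, the EDX value could be obtained with `__get_cpuid()` from `<cpuid.h>`, passing `SVM_FEATURE_LEAF_SKETCH` as the leaf.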
2025-02-26KVM: SVM: Provide helpers to set the error codeMelody Wang3-23/+49
Provide helpers to set the error code when converting VMGEXIT SW_EXITINFO1 and SW_EXITINFO2 codes from plain numbers to proper defines. Add comments for better code readability. No functionality changed. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Melody Wang <huibo.wang@amd.com> Link: https://lore.kernel.org/r/20250225213937.2471419-3-huibo.wang@amd.com [sean: tweak comments, fix formatting goofs] Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-26KVM: SVM: Convert plain error code numbers to definesMelody Wang3-9/+17
Convert VMGEXIT SW_EXITINFO1 codes from plain numbers to proper defines. Opportunistically update the comment for the malformed input "sub-error" codes to state that they are defined by the GHCB, and to capture the relationship to the malformed input response. No functional change intended. Signed-off-by: Melody Wang <huibo.wang@amd.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Pavan Kumar Paluri <papaluri@amd.com> Link: https://lore.kernel.org/r/20250225213937.2471419-2-huibo.wang@amd.com [sean: update comments] Signed-off-by: Sean Christopherson <seanjc@google.com>
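The general shape of such a conversion, named response codes plus small helpers that set SW_EXITINFO1 and SW_EXITINFO2 together, might look like the sketch below. Every name here is hypothetical, the struct is a stand-in rather than the real GHCB layout, and the numeric values are assumptions chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical names and values for the SW_EXITINFO1 response codes. */
#define RESP_NO_ACTION_SKETCH        0
#define RESP_ISSUE_EXCEPTION_SKETCH  1
#define RESP_MALFORMED_INPUT_SKETCH  2

/* Stand-in for the two GHCB fields of interest (not the real layout). */
struct ghcb_sketch {
	uint64_t sw_exit_info_1;
	uint64_t sw_exit_info_2;
};

/* Common helper: set the response code and its accompanying data. */
void vmgexit_set_return_code_sketch(struct ghcb_sketch *g,
				    uint64_t response, uint64_t data)
{
	g->sw_exit_info_1 = response;
	g->sw_exit_info_2 = data;
}

/* Report malformed input together with a "sub-error" code. */
void vmgexit_bad_input_sketch(struct ghcb_sketch *g, uint64_t suberror)
{
	vmgexit_set_return_code_sketch(g, RESP_MALFORMED_INPUT_SKETCH, suberror);
}

/* Report success, passing data back in SW_EXITINFO2. */
void vmgexit_success_sketch(struct ghcb_sketch *g, uint64_t data)
{
	vmgexit_set_return_code_sketch(g, RESP_NO_ACTION_SKETCH, data);
}
```

Routing every update through one helper keeps the two fields in sync, and call sites then name the outcome instead of carrying a bare number.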
2025-02-26x86/entry: Fix kernel-doc warningDaniel Sneddon1-0/+1
The do_int80_emulation() function is missing a kernel-doc formatted description of its argument. This is causing a warning when building with W=1. Add a brief description of the argument to satisfy kernel-doc. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20241219155227.685692-1-daniel.sneddon@linux.intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202412131236.a5HhOqXo-lkp@intel.com/
2025-02-26perf/x86/rapl: Add support for Intel Arrow Lake UAaron Ma1-0/+1
Add Arrow Lake U model for RAPL: $ ls -1 /sys/devices/power/events/ energy-cores energy-cores.scale energy-cores.unit energy-gpu energy-gpu.scale energy-gpu.unit energy-pkg energy-pkg.scale energy-pkg.unit energy-psys energy-psys.scale energy-psys.unit The same output as ArrowLake: $ perf stat -a -I 1000 --per-socket -e power/energy-pkg/ Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20241224145516.349028-1-aaron.ma@canonical.com
2025-02-26x86/irq: Define trace events conditionallyArnd Bergmann1-0/+2
When both of X86_LOCAL_APIC and X86_THERMAL_VECTOR are disabled, the irq tracing produces a W=1 build warning for the tracing definitions: In file included from include/trace/trace_events.h:27, from include/trace/define_trace.h:113, from arch/x86/include/asm/trace/irq_vectors.h:383, from arch/x86/kernel/irq.c:29: include/trace/stages/init.h:2:23: error: 'str__irq_vectors__trace_system_name' defined but not used [-Werror=unused-const-variable=] Make the tracepoints conditional on the same symbols that guard their usage. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250225213236.3141752-1-arnd@kernel.org
2025-02-26x86/CPU: Fix warm boot hang regression on AMD SC1100 SoC systemsRussell Senior1-2/+2
I still have some Soekris net4826 in a Community Wireless Network I volunteer with. These devices use an AMD SC1100 SoC. I am running OpenWrt on them, which uses a patched kernel, that naturally has evolved over time. I haven't updated the ones in the field in a number of years (circa 2017), but have one in a test bed, where I have intermittently tried out test builds. A few years ago, I noticed some trouble, particularly when "warm booting", that is, doing a reboot without removing power, and noticed the device was hanging after the kernel message: [ 0.081615] Working around Cyrix MediaGX virtual DMA bugs. If I removed power and then restarted, it would boot fine, continuing through the message above, thusly: [ 0.081615] Working around Cyrix MediaGX virtual DMA bugs. [ 0.090076] Enable Memory-Write-back mode on Cyrix/NSC processor. [ 0.100000] Enable Memory access reorder on Cyrix/NSC processor. [ 0.100070] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0 [ 0.110058] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0 [ 0.120037] CPU: NSC Geode(TM) Integrated Processor by National Semi (family: 0x5, model: 0x9, stepping: 0x1) [...] In order to continue using modern tools, like ssh, to interact with the software on these old devices, I need modern builds of the OpenWrt firmware on the devices. I confirmed that the warm boot hang was still an issue in modern OpenWrt builds (currently using a patched linux v6.6.65). Last night, I decided it was time to get to the bottom of the warm boot hang, and began bisecting. From preserved builds, I narrowed down the bisection window from late February to late May 2019. During this period, the OpenWrt builds were using 4.14.x. I was able to build using period-correct Ubuntu 18.04.6. After a number of bisection iterations, I identified a kernel bump from 4.14.112 to 4.14.113 as the commit that introduced the warm boot hang. 
https://github.com/openwrt/openwrt/commit/07aaa7e3d62ad32767d7067107db64b6ade81537 Looking at the upstream changes in the stable kernel between 4.14.112 and 4.14.113 (tig v4.14.112..v4.14.113), I spotted a likely suspect: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=20afb90f730982882e65b01fb8bdfe83914339c5 So, I tried reverting just that kernel change on top of the breaking OpenWrt commit, and my warm boot hang went away. Presumably, the warm boot hang is due to some register not getting cleared in the same way that a loss of power does. That is approximately as much as I understand about the problem. More poking/prodding and coaching from Jonas Gorski, it looks like this test patch fixes the problem on my board: Tested against v6.6.67 and v4.14.113. Fixes: 18fb053f9b82 ("x86/cpu/cyrix: Use correct macros for Cyrix calls on Geode processors") Debugged-by: Jonas Gorski <jonas.gorski@gmail.com> Signed-off-by: Russell Senior <russell@personaltelco.net> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/CAHP3WfOgs3Ms4Z+L9i0-iBOE21sdMk5erAiJurPjnrL9LSsgRA@mail.gmail.com Cc: Matthew Whitehead <tedheadster@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de>
2025-02-26x86/of: Don't use DTB for SMP setup if ACPI is enabledDmytro Maluka1-1/+2
There are cases when it is useful to use both ACPI and DTB provided by the bootloader, however in such cases we should make sure to prevent conflicts between the two. Namely, don't try to use DTB for SMP setup if ACPI is enabled. Precisely, this prevents at least: - incorrectly calling register_lapic_address(APIC_DEFAULT_PHYS_BASE) after the LAPIC was already successfully enumerated via ACPI, causing noisy kernel warnings and probably potential real issues as well - failed IOAPIC setup in the case when IOAPIC is enumerated via mptable instead of ACPI (e.g. with acpi=noirq), due to mpparse_parse_smp_config() overridden by x86_dtb_parse_smp_config() Signed-off-by: Dmytro Maluka <dmaluka@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250105172741.3476758-2-dmaluka@chromium.org
2025-02-25x86/build: Fix broken copy command in genimage.sh when making isoimageNir Lichtman1-1/+4
Problem: Currently when running the "make isoimage" command there is an error related to wrong parameters passed to the cp command: "cp: missing destination file operand after 'arch/x86/boot/isoimage/'" This is caused because FDINITRDS is an empty array. Solution: Check if FDINITRDS is empty before executing the "cp" command, similar to how it is done in the case of hdimage. Signed-off-by: Nir Lichtman <nir@lichtman.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Michal Marek <michal.lkml@markovi.net> Link: https://lore.kernel.org/r/20250110120500.GA923218@lichtman.org
2025-02-25x86/percpu: Construct __percpu_seg_override from __percpu_segUros Bizjak1-6/+2
Construct __percpu_seg_override macro from __percpu_seg by concatenating the latter with __seg_ prefix to reduce ifdeffery. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250225200235.48007-1-ubizjak@gmail.com
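The token-pasting technique described above can be demonstrated in isolation. The sketch below uses hypothetical macro names (`PERCPU_SEG`, `PERCPU_SEG_OVERRIDE`) in place of the kernel's, picks `gs` as the segment purely as an assumption, and stringifies the result so it can be inspected without relying on compiler support for named address spaces.

```c
#include <string.h>

/* Hypothetical stand-in for the per-CPU segment name. */
#define PERCPU_SEG gs

/* Two-level paste so PERCPU_SEG is expanded before concatenation. */
#define PASTE_(a, b) a##b
#define PASTE(a, b)  PASTE_(a, b)

/* Derived from PERCPU_SEG, so no separate #ifdef is needed for it. */
#define PERCPU_SEG_OVERRIDE PASTE(__seg_, PERCPU_SEG)

/* Two-level stringify so the argument is expanded first. */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* Returns nonzero if the derived override token spells 'expected'. */
int percpu_seg_override_is(const char *expected)
{
	return strcmp(STR(PERCPU_SEG_OVERRIDE), expected) == 0;
}
```

Changing the single `PERCPU_SEG` definition changes the derived token as well, which is the point of building one macro from the other.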
2025-02-25x86/mtrr: Remove unnecessary strlen() in mtrr_write()Thorsten Blum1-4/+2
The local variable length already holds the string length after calling strncpy_from_user(). Using another local variable linlen and calling strlen() is therefore unnecessary and can be removed. Remove linlen and strlen() and use length instead. No change in functionality intended. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250225131621.329699-2-thorsten.blum@linux.dev
2025-02-25perf/x86/intel: Use better start period for frequency modeKan Liang1-0/+85
Frequency mode is the current default mode of Linux perf. A period of 1 is used as a starting period. The period is auto-adjusted on each tick or an overflow, to meet the frequency target. The start period of 1 is too low and may trigger some issues: - Many HWs do not support period 1 well. https://lore.kernel.org/lkml/875xs2oh69.ffs@tglx/ - For an event that occurs frequently, period 1 is too far away from the real period. Lots of samples are generated at the beginning. The distribution of samples may not be even. - A low starting period for frequently occurring events also challenges virtualization, which has a longer path to handle a PMI. The limit_period value only checks the minimum acceptable value for HW. It cannot be used to set the start period, because some events may need a very low period. The limit_period cannot be set too high. It doesn't help with the events that occur frequently. It's hard to find a universal starting period for all events. The idea implemented by this patch is to only give an estimate for the popular HW and HW cache events. For the rest of the events, start from the lowest possible recommended value. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250117151913.3043942-3-kan.liang@linux.intel.com
2025-02-25KVM: VMX: Pass XFD_ERR as pseudo-payload when injecting #NMSean Christopherson1-4/+10
Pass XFD_ERR via KVM's exception payload mechanism when injecting an #NM after interception so that XFD_ERR can be propagated to FRED's event_data field without needing a dedicated field (which would need to be migrated). For non-FRED vCPUs, this is a glorified NOP as kvm_deliver_exception_payload() will simply do nothing (which is desirable and correct). Signed-off-by: Xin Li (Intel) <xin@zytor.com> Tested-by: Shan Kang <shan.kang@intel.com> Link: https://lore.kernel.org/r/20241001050110.3643764-15-xin@zytor.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-25KVM: VMX: Don't modify guest XFD_ERR if CR0.TS=1Sean Christopherson1-8/+8
Don't update the guest's XFD_ERR MSR if CR0.TS is set; per the SDM, XFD_ERR is not modified if CR0.TS=1. Although it's not explicitly stated in the SDM, conceptually it makes sense the CR0.TS check would be done prior to the XFD_ERR check, e.g. CR0.TS=1 blocks all SIMD state, whereas XFD blocks only XTILE state. Device-not-available exceptions that are not due to XFD - those resulting from setting CR0.TS to 1 - do not modify the IA32_XFD_ERR MSR. Opportunistically update the comment to call out that XFD_ERR is updated before the VM-Exit check occurs. Nothing in the SDM explicitly calls out this behavior, but logically it must be the behavior, otherwise reading XFD_ERR in handle_nm_fault_irqoff() would return stale data, i.e. the to-be-delivered XFD_ERR value would need to be saved in EXIT_QUALIFICATION, a la DR6 for #DB and CR2 for #PF, so that software could capture the guest value. Fixes: ec5be88ab29f ("kvm: x86: Intercept #NM for saving IA32_XFD_ERR") Signed-off-by: Xin Li (Intel) <xin@zytor.com> Tested-by: Shan Kang <shan.kang@intel.com> Link: https://lore.kernel.org/r/20241001050110.3643764-3-xin@zytor.com Signed-off-by: Sean Christopherson <seanjc@google.com>
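The rule stated above, that XFD_ERR is left alone when CR0.TS is set, can be sketched as a pure function. This is a simplified model with hypothetical names, not the KVM code path; it assumes the caller already holds the previously saved guest value and the value currently in hardware.

```c
#include <stdint.h>

#define CR0_TS_SKETCH (1ull << 3)  /* CR0.TS is bit 3 of CR0 */

/*
 * Returns the guest XFD_ERR value to keep after an intercepted #NM:
 * the previously saved value if CR0.TS=1 (the #NM was not due to XFD),
 * otherwise the value read from hardware.
 */
uint64_t xfd_err_after_nm_sketch(uint64_t cr0, uint64_t saved_xfd_err,
				 uint64_t hw_xfd_err)
{
	if (cr0 & CR0_TS_SKETCH)
		return saved_xfd_err;

	return hw_xfd_err;
}
```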
2025-02-25KVM: x86: Use a dedicated flow for queueing re-injected exceptionsSean Christopherson4-61/+63
Open code the filling of vcpu->arch.exception in kvm_requeue_exception() instead of bouncing through kvm_multiple_exception(), as re-injection doesn't actually share that much code with "normal" injection, e.g. the VM-Exit interception check, payload delivery, and nested exception code is all bypassed as those flows only apply during initial injection. When FRED comes along, the special casing will only get worse, as FRED explicitly tracks nested exceptions and essentially delivers the payload on the stack frame, i.e. re-injection will need more inputs, and normal injection will have yet more code that needs to be bypassed when KVM is re-injecting an exception. No functional change intended. Signed-off-by: Xin Li (Intel) <xin@zytor.com> Tested-by: Shan Kang <shan.kang@intel.com> Link: https://lore.kernel.org/r/20241001050110.3643764-2-xin@zytor.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-25KVM: x86: Rename and invert async #PF's send_user_only flag to send_alwaysSean Christopherson2-3/+3
Rename send_user_only to avoid "user", because KVM's ABI is to not inject page faults into CPL0, whereas "user" in x86 is specifically CPL3. Invert the polarity to keep the naming simple and unambiguous. E.g. while KVM often refers to CPL0 as "kernel", that terminology isn't ubiquitous, and "send_kernel" could be misconstrued as "send only to kernel". Link: https://lore.kernel.org/r/20250215010609.1199982-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-25KVM: x86: Don't inject PV async #PF if SEND_ALWAYS=0 and guest state is protectedSean Christopherson1-1/+1
Don't inject PV async #PFs into guests with protected register state, i.e. SEV-ES and SEV-SNP guests, unless the guest has opted-in to receiving #PFs at CPL0. For protected guests, the actual CPL of the guest is unknown. Note, no sane CoCo guest should enable PV async #PF, but the current state of Linux-as-a-CoCo-guest isn't entirely sane. Fixes: add5e2f04541 ("KVM: SVM: Add support for the SEV-ES VMSA") Link: https://lore.kernel.org/r/20250215010609.1199982-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
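The injection policy described here, combined with the send_always flag from the rename above, can be sketched as a predicate. The struct and function names are hypothetical and the model ignores every other condition KVM checks before injecting an async #PF.

```c
#include <stdbool.h>

/* Hypothetical subset of per-vCPU state relevant to the decision. */
struct apf_vcpu_sketch {
	bool send_always;            /* guest opted in to #PFs at CPL0 */
	bool guest_state_protected;  /* e.g. SEV-ES/SEV-SNP: CPL is unknown */
	int  cpl;                    /* only meaningful if state is not protected */
};

/* Returns true if a PV async #PF may be injected under this model. */
bool can_inject_async_pf_sketch(const struct apf_vcpu_sketch *v)
{
	/* Opt-in covers every privilege level, known or not. */
	if (v->send_always)
		return true;

	/* Without opt-in, an unknown CPL must be treated as possibly CPL0. */
	if (v->guest_state_protected)
		return false;

	/* Without opt-in, never inject into CPL0. */
	return v->cpl != 0;
}
```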
2025-02-25KVM: x86: Update Xen TSC leaves during CPUID emulationFred Griffoul5-27/+29
The Xen emulation in KVM modifies certain CPUID leaves to expose TSC information to the guest. Previously, these CPUID leaves were updated whenever guest time changed, but this conflicts with KVM_SET_CPUID/KVM_SET_CPUID2 ioctls which reject changes to CPUID entries on running vCPUs. Fix this by updating the TSC information directly in the CPUID emulation handler instead of modifying the vCPU's CPUID entries. Signed-off-by: Fred Griffoul <fgriffo@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20250124150539.69975-1-fgriffo@amazon.co.uk Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-25perf/x86: Fix low freqency setting issueKan Liang1-1/+1
Perf doesn't work at low frequencies: $ perf record -e cpu_core/instructions/ppp -F 120 Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_core/instructions/ppp). "dmesg | grep -i perf" may provide additional information. The limit_period() check avoids a low sampling period on a counter. It doesn't intend to limit the frequency. The check in the x86_pmu_hw_config() should be limited to non-freq mode. The attr.sample_period and attr.sample_freq are union. The attr.sample_period should not be used to indicate the frequency mode. Fixes: c46e665f0377 ("perf/x86: Add INST_RETIRED.ALL workarounds") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250117151913.3043942-1-kan.liang@linux.intel.com Closes: https://lore.kernel.org/lkml/20250115154949.3147-1-ravi.bangoria@amd.com/
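Because sample_period and sample_freq share storage, a minimum-period check has to consult the mode flag rather than the value itself. The sketch below models that with a hypothetical struct and a hypothetical minimum; it is a simplification and not the x86_pmu_hw_config() code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the event attributes. */
struct perf_attr_sketch {
	union {
		uint64_t sample_period;  /* valid when freq is false */
		uint64_t sample_freq;    /* valid when freq is true */
	};
	bool freq;                       /* true: frequency mode */
};

/* Returns 0 if acceptable, -1 (standing in for -EINVAL) if rejected. */
int check_period_sketch(const struct perf_attr_sketch *attr,
			uint64_t min_period)
{
	/* Only a fixed period is subject to the hardware minimum. */
	if (!attr->freq && attr->sample_period < min_period)
		return -1;

	return 0;
}
```

With an assumed minimum of 128, a frequency request of 120 passes, whereas a check that looked only at the shared field would misread 120 as a too-small period.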
2025-02-25x86/nmi: Add an emergency handler in nmi_desc & use it in nmi_shootdown_cpus()Waiman Long3-7/+47
Depending on the type of panics, it was found that the __register_nmi_handler() function can be called in NMI context from nmi_shootdown_cpus() leading to a lockdep splat: WARNING: inconsistent lock state inconsistent {INITIAL USE} -> {IN-NMI} usage. lock(&nmi_desc[0].lock); <Interrupt> lock(&nmi_desc[0].lock); Call Trace: _raw_spin_lock_irqsave __register_nmi_handler nmi_shootdown_cpus kdump_nmi_shootdown_cpus native_machine_crash_shutdown __crash_kexec In this particular case, the following panic message was printed before: Kernel panic - not syncing: Fatal hardware error! This message seemed to be given out from __ghes_panic() running in NMI context. The __register_nmi_handler() function which takes the nmi_desc lock with irq disabled shouldn't be called from NMI context as this can lead to deadlock. The nmi_shootdown_cpus() function can only be invoked once. After the first invocation, all other CPUs should be stuck in the newly added crash_nmi_callback() and cannot respond to a second NMI. Fix it by adding a new emergency NMI handler to the nmi_desc structure and provide a new set_emergency_nmi_handler() helper to set crash_nmi_callback() in any context. The new emergency handler will preempt other handlers in the linked list. That will eliminate the need to take any lock and serve the panic in NMI use case. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20250206191844.131700-1-longman@redhat.com
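The idea of an emergency handler that can be installed from any context and that preempts the regular handlers can be sketched with C11 atomics. All names below are hypothetical, the regular handler list is reduced to a single function pointer, and the two sample handlers exist only to make the sketch observable.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef int (*nmi_handler_fn_sketch)(void);

/* Hypothetical descriptor: one lock-free slot plus a stand-in for the list. */
struct nmi_desc_sketch {
	_Atomic(nmi_handler_fn_sketch) emerg_handler;
	nmi_handler_fn_sketch regular_handler;  /* normally a locked list */
};

/* Safe in any context in this model: a single atomic store, no lock. */
void set_emergency_handler_sketch(struct nmi_desc_sketch *d,
				  nmi_handler_fn_sketch fn)
{
	atomic_store(&d->emerg_handler, fn);
}

/* The emergency handler, if set, runs instead of the regular ones. */
int handle_nmi_sketch(struct nmi_desc_sketch *d)
{
	nmi_handler_fn_sketch emerg = atomic_load(&d->emerg_handler);

	if (emerg)
		return emerg();

	return d->regular_handler ? d->regular_handler() : 0;
}

/* Sample handlers used only for illustration. */
int regular_nmi_sketch(void) { return 1; }
int crash_nmi_sketch(void)   { return 2; }
```

Since installation never takes the descriptor lock, it cannot deadlock against a holder of that lock that was interrupted by the NMI.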
2025-02-24x86/percpu: Unify __pcpu_op{1,2}_N() macros to __pcpu_op_N()Uros Bizjak1-20/+18
Unify __pcpu_op1_N() and __pcpu_op2_N() macros to __pcpu_op_N() by applying the macro only to asm mnemonic, not to the mnemonic plus its arguments. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250224071648.15913-1-ubizjak@gmail.com
2025-02-24KVM: nVMX: Synthesize nested VM-Exit for supported emulation interceptsSean Christopherson1-14/+56
When emulating an instruction on behalf of L2 that L1 wants to intercept, generate a nested VM-Exit instead of injecting a #UD into L2. Now that (most of) the necessary information is available, synthesizing a VM-Exit isn't terribly difficult. Punt on decoding the ModR/M for descriptor table exits for now. There is no evidence that any hypervisor intercepts descriptor table accesses *and* uses the EXIT_QUALIFICATION to expedite emulation, i.e. it's not worth delaying basic support for. To avoid doing more harm than good, e.g. by putting L2 into an infinite loop or effectively corrupting its code stream, inject #UD if the instruction length is nonsensical. Link: https://lore.kernel.org/r/20250201015518.689704-11-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
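One way to decide whether an instruction length is nonsensical is to derive it from the starting and next RIP and compare it against the architectural maximum of 15 bytes. The sketch below assumes that derivation; the names are hypothetical and it is not the check KVM performs.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_INSN_LEN_SKETCH 15  /* architectural x86 maximum, in bytes */

/*
 * Computes the instruction length from the starting and next RIP.
 * Returns true and stores the length if it is in 1..15, false otherwise.
 */
bool insn_len_from_rips_sketch(uint64_t start_rip, uint64_t next_rip,
			       uint32_t *len)
{
	/* Unsigned subtraction: next_rip < start_rip yields a huge delta. */
	uint64_t delta = next_rip - start_rip;

	if (delta == 0 || delta > MAX_INSN_LEN_SKETCH)
		return false;

	*len = (uint32_t)delta;
	return true;
}
```

A caller following the changelog's policy would fall back to injecting #UD whenever this returns false.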
2025-02-24KVM: nVMX: Allow the caller to provide instruction length on nested VM-ExitSean Christopherson2-7/+27
Rework the nested VM-Exit helper to take the instruction length as a parameter, and convert nested_vmx_vmexit() into a "default" wrapper that grabs the length from vmcs02 as appropriate. This will allow KVM to set the correct instruction length when synthesizing a nested VM-Exit when emulating an instruction that L1 wants to intercept. No functional change intended, as the path to prepare_vmcs12()'s reading of vmcs02.VM_EXIT_INSTRUCTION_LEN is gated on the same set of conditions as the VMREAD in the new nested_vmx_vmexit(). Link: https://lore.kernel.org/r/20250201015518.689704-10-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-24KVM: x86: Add a #define for the architectural max instruction lengthSean Christopherson3-9/+11
Add a #define to capture x86's architecturally defined max instruction length instead of open coding the literal in a variety of places. No functional change intended. Link: https://lore.kernel.org/r/20250201015518.689704-9-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-24KVM: x86: Plumb the emulator's starting RIP into nested intercept checksSean Christopherson2-0/+2
When checking for intercept when emulating an instruction on behalf of L2, pass the emulator's view of the RIP of the instruction being emulated to vendor code. Unlike SVM, which communicates the next RIP on VM-Exit, VMX communicates the length of the instruction that generated the VM-Exit, i.e. requires the current and next RIPs. Note, unless userspace modifies RIP during a userspace exit that requires completion, kvm_rip_read() will contain the same information. Pass the emulator's view largely out of paranoia, and because there is no meaningful cost in doing so. Link: https://lore.kernel.org/r/20250201015518.689704-8-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-24KVM: x86: Plumb the src/dst operand types through to .check_intercept()Sean Christopherson2-0/+4
When checking for intercept when emulating an instruction on behalf of L2, forward the source and destination operand types to vendor code so that VMX can synthesize the correct EXIT_QUALIFICATION for port I/O VM-Exits. Link: https://lore.kernel.org/r/20250201015518.689704-7-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-24KVM: nVMX: Consolidate missing X86EMUL_INTERCEPTED logic in L2 emulationSean Christopherson1-11/+7
Refactor the handling of port I/O interception checks when emulating on behalf of L2 in anticipation of synthesizing a nested VM-Exit to L1 instead of injecting a #UD into L2. No functional change intended. Link: https://lore.kernel.org/r/20250201015518.689704-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-24KVM: nVMX: Emulate HLT in L2 if it's not interceptedSean Christopherson1-0/+5
Extend VMX's nested intercept logic for emulated instructions to handle HLT interception, primarily for testing purposes. Failure to allow emulation of HLT isn't all that interesting, as emulating HLT while L2 is active either requires forced emulation (and no #UD intercept in L1), TLB games in the guest to coerce KVM into emulating the wrong instruction, or a bug elsewhere in KVM. E.g. without commit 47ef3ef843c0 ("KVM: VMX: Handle event vectoring error in check_emulate_instruction()"), KVM can end up trying to emulate HLT if RIP happens to point at a HLT when a vectored event arrives with L2's IDT pointing at emulated MMIO. Note, vmx_check_intercept() is still broken when L1 wants to intercept an instruction, as KVM injects a #UD instead of synthesizing a nested VM-Exit. That issue extends far beyond HLT, punt on it for now. Link: https://lore.kernel.org/r/20250201015518.689704-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-24KVM: nVMX: Allow emulating RDPID on behalf of L2Sean Christopherson1-6/+7
Return X86EMUL_CONTINUE instead of X86EMUL_UNHANDLEABLE when emulating RDPID on behalf of L2 and L1 _does_ expose RDPID/RDTSCP to L2. When RDPID emulation was added by commit fb6d4d340e05 ("KVM: x86: emulate RDPID"), KVM incorrectly allowed emulation by default. Commit 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode") fixed that flaw, but missed that RDPID emulation was relying on the common return path to allow emulation on behalf of L2. Fixes: 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode") Link: https://lore.kernel.org/r/20250201015518.689704-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-24KVM: nSVM: Pass next RIP, not current RIP, for nested VM-Exit on emulationSean Christopherson1-1/+1
Set "next_rip" in the emulation interception info passed to vendor code using the emulator context's "_eip", not "eip". "eip" holds RIP from the start of emulation, i.e. the RIP of the instruction that's being emulated, whereas _eip tracks the context's current position in decoding the code stream, which at the time of the intercept checks is effectively the RIP of the next instruction. Passing the current RIP as next_rip causes SVM to stuff the wrong value into vmcb12->control.next_rip if a nested VM-Exit is generated, i.e. if L1 wants to intercept the instruction, and could result in L1 putting L2 into an infinite loop due to restarting L2 with the same RIP over and over. Fixes: 8a76d7f25f8f ("KVM: x86: Add x86 callback for intercept check") Link: https://lore.kernel.org/r/20250201015518.689704-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
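A small sketch of the eip/_eip distinction described above. The struct is a trimmed-down, hypothetical stand-in for the emulator context; only the two field names come from the message, and the helper functions are assumptions.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the emulator context. */
struct emu_ctxt {
	uint64_t eip;  /* RIP at the start of emulation */
	uint64_t _eip; /* decode cursor; advances as bytes are consumed */
};

/* Start emulating at 'rip': both fields begin at the same address. */
static void emu_begin(struct emu_ctxt *c, uint64_t rip)
{
	c->eip = rip;
	c->_eip = rip;
}

/* Consume 'nbytes' of the instruction stream; only _eip moves. */
static void emu_fetch(struct emu_ctxt *c, unsigned int nbytes)
{
	c->_eip += nbytes;
}

/* After decode, _eip is effectively the next instruction's RIP. */
static uint64_t intercept_next_rip(const struct emu_ctxt *c)
{
	return c->_eip;
}
```

Reporting `eip` here instead would hand L1 the address of the instruction it just intercepted, which is how the restart loop described above can arise.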
2025-02-24KVM: nVMX: Check PAUSE_EXITING, not BUS_LOCK_DETECTION, on PAUSE emulationSean Christopherson1-1/+1
When emulating PAUSE on behalf of L2, check for interception in vmcs12 by looking at primary execution controls, not secondary execution controls. Checking for PAUSE_EXITING in secondary execution controls effectively results in KVM looking for BUS_LOCK_DETECTION, which KVM doesn't expose to L1, i.e. is always off in vmcs12, and ultimately results in KVM failing to "intercept" PAUSE. Because KVM doesn't handle interception during emulation correctly on VMX, i.e. the "fixed" code is still quite broken, and not intercepting PAUSE is relatively benign, for all intents and purposes the bug means that L2 gets to live when it would otherwise get an unexpected #UD. Fixes: 4984563823f0 ("KVM: nVMX: Emulate NOPs in L2, and PAUSE if it's not intercepted") Link: https://lore.kernel.org/r/20250201015518.689704-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
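Why the wrong field silently "works" can be sketched like this, assuming (per the message) that the PAUSE-exiting bit in the primary controls and the bus-lock-detection bit in the secondary controls share the same bit position, taken here to be bit 30. The macro, struct, and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: the same bit number, but in two different fields. */
#define PRIMARY_PAUSE_EXITING        (1u << 30) /* primary controls */
#define SECONDARY_BUS_LOCK_DETECTION (1u << 30) /* secondary controls */

/* Hypothetical subset of the vmcs12 execution controls. */
struct vmcs12_ctrls {
	uint32_t primary;
	uint32_t secondary;
};

/* Buggy check: right bit, wrong field, so it really tests bus-lock detection. */
static bool pause_intercepted_buggy(const struct vmcs12_ctrls *c)
{
	return (c->secondary & PRIMARY_PAUSE_EXITING) != 0;
}

/* Fixed check: look for PAUSE exiting in the primary controls. */
static bool pause_intercepted_fixed(const struct vmcs12_ctrls *c)
{
	return (c->primary & PRIMARY_PAUSE_EXITING) != 0;
}
```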
2025-02-24KVM: x86/xen: Move kvm_xen_hvm_config field into kvm_xenSean Christopherson4-15/+16
Now that all KVM usage of the Xen HVM config information is buried behind CONFIG_KVM_XEN=y, move the per-VM kvm_xen_hvm_config field out of kvm_arch and into kvm_xen. No functional change intended. Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org> Link: https://lore.kernel.org/r/20250215011437.1203084-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-24KVM: x86/xen: Bury xen_hvm_config behind CONFIG_KVM_XEN=ySean Christopherson1-2/+1
Now that all references to kvm_vcpu_arch.xen_hvm_config are wrapped with CONFIG_KVM_XEN #ifdefs, bury the field itself behind CONFIG_KVM_XEN=y. No functional change intended. Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org> Link: https://lore.kernel.org/r/20250215011437.1203084-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-24KVM: x86/xen: Consult kvm_xen_enabled when checking for Xen MSR writesSean Christopherson1-0/+3
Query kvm_xen_enabled when detecting writes to the Xen hypercall page MSR so that the check is optimized away in the likely scenario that Xen isn't enabled for the VM. Deliberately open code the check instead of using kvm_xen_msr_enabled() in order to avoid a double load of xen_hvm_config.msr (which is admittedly rather pointless given the widespread lack of READ_ONCE() usage on the plethora of vCPU-scoped accesses to kvm->arch.xen state). No functional change intended. Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org> Link: https://lore.kernel.org/r/20250215011437.1203084-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-02-24KVM: x86/xen: Add an #ifdef'd helper to detect writes to Xen MSRSean Christopherson2-1/+11
Add a helper to detect writes to the Xen hypercall page MSR, and provide a stub for CONFIG_KVM_XEN=n to optimize out the check for kernels built without Xen support. Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20250215011437.1203084-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
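The helper-plus-stub pattern can be sketched as below. The struct, field, and function names are hypothetical; only the CONFIG_KVM_XEN symbol comes from the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-VM state holding the configured MSR index. */
struct vm_sketch {
	uint32_t xen_hypercall_msr; /* 0 means "not configured" here */
};

#ifdef CONFIG_KVM_XEN
/* Real check: only built when Xen support is compiled in. */
static inline bool is_xen_hypercall_msr(const struct vm_sketch *vm,
					uint32_t msr)
{
	return vm->xen_hypercall_msr && msr == vm->xen_hypercall_msr;
}
#else
/* Stub: a constant false lets the compiler drop the caller's branch. */
static inline bool is_xen_hypercall_msr(const struct vm_sketch *vm,
					uint32_t msr)
{
	(void)vm;
	(void)msr;
	return false;
}
#endif
```

Callers use the same name in both configurations, so no #ifdef is needed at the call site.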
2025-02-24KVM: x86/xen: Restrict hypercall MSR to unofficial synthetic rangeSean Christopherson2-0/+12
Reject userspace attempts to set the Xen hypercall page MSR to an index outside of the "standard" virtualization range [0x40000000, 0x4fffffff], as KVM is not equipped to handle collisions with real MSRs, e.g. KVM doesn't update MSR interception, conflicts with VMCS/VMCB fields, special case writes in KVM, etc. While the MSR index isn't strictly ABI, i.e. can theoretically float to any value, in practice no known VMM sets the MSR index to anything other than 0x40000000 or 0x40000200. Cc: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org> Link: https://lore.kernel.org/r/20250215011437.1203084-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
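The range restriction amounts to a simple bounds check, sketched here with hypothetical macro and function names; the two bounds are the ones given in the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bounds of the "standard" virtualization MSR range from the message. */
#define XEN_MSR_MIN_INDEX 0x40000000u
#define XEN_MSR_MAX_INDEX 0x4fffffffu

/* Hypothetical validator for a userspace-supplied MSR index. */
static bool xen_msr_index_in_range(uint32_t msr)
{
	return msr >= XEN_MSR_MIN_INDEX && msr <= XEN_MSR_MAX_INDEX;
}
```

Both indices the message cites as used in practice, 0x40000000 and 0x40000200, pass this check, while an index that collides with a real MSR does not.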
2025-02-23x86/ioperm: Use atomic64_inc_return() in ksys_ioperm()Uros Bizjak1-1/+1
Use atomic64_inc_return(&ref) instead of atomic64_add_return(1, &ref) to use optimized implementation on targets that define atomic_inc_return() and to remove now unneeded initialization of the %eax/%edx register pair before the call to atomic64_inc_return(). On x86_32 the code improves from:
  1b0: b9 00 00 00 00    mov $0x0,%ecx
       1b1: R_386_32 .bss
  1b5: 89 43 0c          mov %eax,0xc(%ebx)
  1b8: 31 d2             xor %edx,%edx
  1ba: b8 01 00 00 00    mov $0x1,%eax
  1bf: e8 fc ff ff ff    call 1c0 <ksys_ioperm+0xa8>
       1c0: R_386_PC32 atomic64_add_return_cx8
  1c4: 89 03             mov %eax,(%ebx)
  1c6: 89 53 04          mov %edx,0x4(%ebx)
to:
  1b0: be 00 00 00 00    mov $0x0,%esi
       1b1: R_386_32 .bss
  1b5: 89 43 0c          mov %eax,0xc(%ebx)
  1b8: e8 fc ff ff ff    call 1b9 <ksys_ioperm+0xa1>
       1b9: R_386_PC32 atomic64_inc_return_cx8
  1bd: 89 03             mov %eax,(%ebx)
  1bf: 89 53 04          mov %edx,0x4(%ebx)
Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250223161355.3607-1-ubizjak@gmail.com
2025-02-23x86/usercopy: Fix kernel-doc func param name in clean_cache_range()'s descriptionRandy Dunlap1-1/+1
Use @addr instead of @vaddr in the kernel-doc comment for clean_cache_range() to eliminate warnings: arch/x86/lib/usercopy_64.c:29: warning: Function parameter or struct member 'addr' not described in 'clean_cache_range' arch/x86/lib/usercopy_64.c:29: warning: Excess function parameter 'vaddr' description in 'clean_cache_range' Fixes: 0aed55af8834 ("x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20250111063333.911084-1-rdunlap@infradead.org
2025-02-22Merge tag 'x86-urgent-2025-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+1
Pull x86 fixes from Ingo Molnar:
 - Fix AVX-VNNI CPU feature dependency bug triggered via the 'noxsave' boot option
 - Fix typos in the SVA documentation
 - Add Tony Luck as RDT co-maintainer and remove Fenghua Yu
* tag 'x86-urgent-2025-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  docs: arch/x86/sva: Fix two grammar errors under Background and FAQ
  x86/cpufeatures: Make AVX-VNNI depend on AVX
  MAINTAINERS: Change maintainer for RDT
2025-02-22perf/x86/intel/bts: Allocate bts_ctx only if necessaryLi RongQing1-9/+13
Avoid unnecessary per-CPU memory allocation on unsupported CPUs; this can save 12K of memory for each CPU. Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20250122074103.3091-1-lirongqing@baidu.com
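The allocate-only-if-necessary pattern can be sketched in userspace C as follows. This is not the driver's code: the struct, its 12 KiB size (chosen only to mirror the figure quoted above), and the function name are assumptions, and calloc() stands in for the kernel's per-CPU allocator.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-CPU context; 12 KiB mirrors the figure quoted above. */
struct bts_ctx_sketch {
	unsigned char data[12 * 1024];
};

/*
 * Allocate one context per CPU, but only when the feature is usable.
 * Returns NULL both when unsupported and when the allocation fails.
 */
static struct bts_ctx_sketch *bts_ctx_alloc(bool supported, size_t nr_cpus)
{
	if (!supported)
		return NULL; /* nothing is allocated on unsupported CPUs */

	return calloc(nr_cpus, sizeof(struct bts_ctx_sketch));
}
```

Moving the support check ahead of the allocation is the whole change in spirit: a statically reserved per-CPU area costs memory on every CPU whether or not the feature exists.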
2025-02-22x86/cpu: Update Intel Family commentsPeter Zijlstra1-6/+6
Because who can ever remember all these names. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250127162252.GK16742@noisy.programming.kicks-ass.net
2025-02-22x86/kexec: Export e820_table_kexec[] to sysfsDave Young2-11/+10
Previously the e820_table_kexec[] was exported to sysfs since kexec-tools uses the memmap entries to prepare the e820 table for the new kernel. The following commit, ~8 years ago, introduced e820_table_firmware[] and changed the behavior to export the firmware table instead: 12df216c61c8 ("x86/boot/e820: Introduce the bootloader provided e820_table_firmware[] table") Originally the kexec_file_load and kexec_load syscalls both used e820_table_kexec[]. Since the sysfs exported entries are from e820_table_firmware[] people now need to tune both tables for kexec. Restore the old behavior so the kexec_load and kexec_file_load syscalls work with only one table update. The e820_table_firmware[] is used by hibernation kernel code and it works without the sysfs exporting. Also remove the SEV e820_table_firmware[] updating code. Also update the code comments here and drop the comments about setup_data reservation since it is not needed any more after this change was made a year ago: fc7f27cda843 ("x86/kexec: Do not update E820 kexec table for setup_data") [ mingo: Tidy up the changelog and comments. ] Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Ashish Kalra <ashish.kalra@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Link: https://lore.kernel.org/r/Z5jcb1GKhLvH8kDc@darkstar.users.ipa.redhat.com
2025-02-22x86/boot: Change some static bootflag functions to boolUros Bizjak1-6/+6
The return values of some functions are of boolean type. Change the type of these functions to bool and adjust their return values. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250129154920.6773-1-ubizjak@gmail.com
2025-02-22x86/kaslr: Reduce KASLR entropy on most x86 systemsBalbir Singh1-2/+8
When CONFIG_PCI_P2PDMA=y (which is basically enabled on all large x86 distros), the PFNs are mapped via a ZONE_DEVICE mapping using devm_memremap_pages(). The mapped virtual address range corresponds to the pci_resource_start() of the BAR address and size corresponding to the BAR length. When KASLR is enabled, the direct map range of the kernel is reduced to the size of physical memory plus additional padding. If the BAR address is beyond this limit, PCI peer to peer DMA mappings fail. Fix this by not shrinking the size of the direct map when CONFIG_PCI_P2PDMA=y. This reduces the total available entropy, but it's better than the current workaround of having to disable KASLR completely. [ mingo: Clarified the changelog to point out the broad impact ... ] Signed-off-by: Balbir Singh <balbirs@nvidia.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # drivers/pci/Kconfig Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/lkml/20250206023201.1481957-1-balbirs@nvidia.com/ Link: https://lore.kernel.org/r/20250206234234.1912585-1-balbirs@nvidia.com -- arch/x86/mm/kaslr.c | 10 ++++++++-- drivers/pci/Kconfig | 6 ++++++ 2 files changed, 14 insertions(+), 2 deletions(-)
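The sizing decision described above can be sketched as a pure function. Everything here is a simplified assumption (names, terabyte units, a single boolean for the config option); it illustrates the logic of the message, not the code in arch/x86/mm/kaslr.c.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the direct-map sizing under KASLR.
 * Normally the region shrinks to physical memory plus padding, which
 * frees address space for randomization. With P2PDMA enabled the region
 * keeps its maximum size so that high BAR addresses remain mappable.
 */
static uint64_t direct_map_size_tb(uint64_t phys_mem_tb, uint64_t padding_tb,
				   uint64_t max_tb, bool p2pdma_enabled)
{
	uint64_t wanted = phys_mem_tb + padding_tb;

	if (p2pdma_enabled)
		return max_tb; /* do not shrink: trades entropy for BARs */

	return wanted < max_tb ? wanted : max_tb;
}
```

The unshrunk region is what costs entropy: the address space that would have been available for randomizing region bases stays reserved for the direct map.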
2025-02-22x86/microcode/AMD: Load only SHA256-checksummed patchesBorislav Petkov (AMD)3-2/+554
Load patches for which the driver carries a SHA256 checksum of the patch blob. This can be disabled by adding "microcode.amd_sha_check=off" on the kernel cmdline. But it is highly NOT recommended. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>