Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 resource control updates from Borislav Petkov:
"Avoid IPI-ing a task in certain cases and prevent load/store tearing
when accessing a task's resctrl fields concurrently"
* tag 'x86_cache_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/resctrl: Apply READ_ONCE/WRITE_ONCE to task_struct.{rmid,closid}
x86/resctrl: Use task_curr() instead of task_struct->on_cpu to prevent unnecessary IPI
x86/resctrl: Add printf attribute to log function
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 CPUID cleanup from Borislav Petkov:
"Assign a dedicated feature word to a CPUID leaf which is widely used"
* tag 'x86_cpu_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpufeatures: Assign dedicated feature word for CPUID_0x8000001F[EAX]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 FPU updates from Borislav Petkov:
"x86 fpu usage optimization and cleanups:
- make 64-bit kernel code which uses 387 insns request a x87 init
(FNINIT) explicitly when using the FPU
- misc cleanups"
* tag 'x86_fpu_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu/xstate: Use sizeof() instead of a constant
x86/fpu/64: Don't FNINIT in kernel_fpu_begin()
x86/fpu: Make the EFI FPU calling convention explicit
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 microcode cleanup from Borislav Petkov:
"Make the driver init function static again"
* tag 'x86_microcode_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode: Make microcode_init() static
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm cleanups from Borislav Petkov:
- PTRACE_GETREGS/PTRACE_PUTREGS regset selection cleanup
- Another initial cleanup - more to follow - to the fault handling
code.
- Other minor cleanups and corrections.
* tag 'x86_mm_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
x86/{fault,efi}: Fix and rename efi_recover_from_page_fault()
x86/fault: Don't run fixups for SMAP violations
x86/fault: Don't look for extable entries for SMEP violations
x86/fault: Rename no_context() to kernelmode_fixup_or_oops()
x86/fault: Bypass no_context() for implicit kernel faults from usermode
x86/fault: Split the OOPS code out from no_context()
x86/fault: Improve kernel-executing-user-memory handling
x86/fault: Correct a few user vs kernel checks wrt WRUSS
x86/fault: Document the locking in the fault_signal_pending() path
x86/fault/32: Move is_f00f_bug() to do_kern_addr_fault()
x86/fault: Fold mm_fault_error() into do_user_addr_fault()
x86/fault: Skip the AMD erratum #91 workaround on unaffected CPUs
x86/fault: Fix AMD erratum #91 errata fixup for user code
x86/Kconfig: Remove HPET_EMULATE_RTC depends on RTC
x86/asm: Fixup TASK_SIZE_MAX comment
x86/ptrace: Clean up PTRACE_GETREGS/PTRACE_PUTREGS regset selection
x86/vm86/32: Remove VM86_SCREEN_BITMAP support
x86: Remove definition of DEBUG
x86/entry: Remove now unused do_IRQ() declaration
x86/mm: Remove duplicate definition of _PAGE_PAT_LARGE
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 paravirt updates from Borislav Petkov:
"Part one of a major conversion of the paravirt infrastructure to our
kernel patching facilities and getting rid of the custom-grown ones"
* tag 'x86_paravirt_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/pv: Rework arch_local_irq_restore() to not use popf
x86/xen: Drop USERGS_SYSRET64 paravirt call
x86/pv: Switch SWAPGS to ALTERNATIVE
x86/xen: Use specific Xen pv interrupt entry for DF
x86/xen: Use specific Xen pv interrupt entry for MCE
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ard Biesheuvel via Borislav Petkov:
"A few cleanups left and right, some of which were part of a initrd
measured boot series that needs some more work, and so only the
cleanup patches have been included for this release"
* tag 'efi-next-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/arm64: Update debug prints to reflect other entropy sources
efi: x86: clean up previous struct mm switching
efi: x86: move mixed mode stack PA variable out of 'efi_scratch'
efi/libstub: move TPM related prototypes into efistub.h
efi/libstub: fix prototype of efi_tcg2_protocol::get_event_log()
efi/libstub: whitespace cleanup
efi: ia64: move IA64-only declarations to new asm/efi.h header
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Borislav Petkov:
- move therm_throt.c to the thermal framework, where it belongs.
- identify CPUs which miss to enter the broadcast handler, as an
additional debugging aid.
* tag 'ras_updates_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
thermal: Move therm_throt there from x86/mce
x86/mce: Get rid of mcheck_intel_therm_init()
x86/mce: Make mce_timed_out() identify holdout CPUs
|
|
Remove several exports from the MMU that are no longer necessary.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-15-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Stop setting dirty bits for MMU pages when dirty logging is disabled for
a memslot, as PML is now completely disabled when there are no memslots
with dirty logging enabled.
This means that spurious PML entries will be created for memslots with
dirty logging disabled if at least one other memslot has dirty logging
enabled. However, spurious PML entries are already possible since
dirty bits are set only when a dirty logging is turned off, i.e. memslots
that are never dirty logged will have dirty bits cleared.
In the end, it's faster overall to eat a few spurious PML entries in the
window where dirty logging is being disabled across all memslots.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-13-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Currently, if enable_pml=1 PML remains enabled for the entire lifetime
of the VM irrespective of whether dirty logging is enable or disabled.
When dirty logging is disabled, all the pages of the VM are manually
marked dirty, so that PML is effectively non-operational. Setting
the dirty bits is an expensive operation which can cause severe MMU
lock contention in a performance sensitive path when dirty logging is
disabled after a failed or canceled live migration.
Manually setting dirty bits also fails to prevent PML activity if some
code path clears dirty bits, which can incur unnecessary VM-Exits.
In order to avoid this extra overhead, dynamically enable/disable PML
when dirty logging gets turned on/off for the first/last memslot.
Signed-off-by: Makarand Sonare <makarandsonare@google.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-12-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Drop the facade of KVM's PML logic being vendor specific and move the
bits that aren't truly VMX specific into common x86 code. The MMU logic
for dealing with PML is tightly coupled to the feature and to VMX's
implementation, bouncing through kvm_x86_ops obfuscates the code without
providing any meaningful separation of concerns or encapsulation.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-10-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Store the vendor-specific dirty log size in a variable, there's no need
to wrap it in a function since the value is constant after
hardware_setup() runs.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210213005015.1651772-9-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Allow building x86 with PREEMPT_DYNAMIC=n, this is needed for
PREEMPT_RT as it makes no sense to not have full preemption on
PREEMPT_RT.
Fixes: 8c98e8cf723c ("preempt/dynamic: Provide preempt_schedule[_notrace]() static calls")
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Mike Galbraith <efault@gmx.de>
Link: https://lkml.kernel.org/r/YCK1+JyFNxQnWeXK@hirez.programming.kicks-ass.net
|
|
Use the new EXPORT_STATIC_CALL_TRAMP() / static_call_mod() to unexport
the static_call_key for the PREEMPT_DYNAMIC calls such that modules
can no longer update these calls.
Having modules change/hi-jack the preemption calls would be horrible.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
When exporting static_call_key; with EXPORT_STATIC_CALL*(), the module
can use static_call_update() to change the function called. This is
not desirable in general.
Not exporting static_call_key however also disallows usage of
static_call(), since objtool needs the key to construct the
static_call_site.
Solve this by allowing objtool to create the static_call_site using
the trampoline address when it builds a module and cannot find the
static_call_key symbol. The module loader will then try and map the
trampole back to a key before it constructs the normal sites list.
Doing this requires a trampoline -> key associsation, so add another
magic section that keeps those.
Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210127231837.ifddpn7rhwdaepiu@treble
|
|
Provide static calls to control preempt_schedule[_notrace]()
(called in CONFIG_PREEMPT) so that we can override their behaviour when
preempt= is overriden.
Since the default behaviour is full preemption, both their calls are
initialized to the arch provided wrapper, if any.
[fweisbec: only define static calls when PREEMPT_DYNAMIC, make it less
dependent on x86 with __preempt_schedule_func]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-7-frederic@kernel.org
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Update Copyright year and drop file names from files themselves.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
After the commit f1be6cdaf57c ("x86/platform/intel-mid: Make
intel_scu_device_register() static") the platform_device.h is not being
used anymore by intel-mid.h. Remove it.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Since there is no more user of this global variable and associated custom API,
we may safely drop this legacy reinvented a wheel from the kernel sources.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The header is used by a single user. Move header content to that user.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
SFI-based platforms are gone. So does this framework.
This removes mention of SFI through the drivers and other code as well.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for Linux 5.12
- Make the nVHE EL2 object relocatable, resulting in much more
maintainable code
- Handle concurrent translation faults hitting the same page
in a more elegant way
- Support for the standard TRNG hypervisor call
- A bunch of small PMU/Debug fixes
- Allow the disabling of symbol export from assembly code
- Simplification of the early init hypercall handling
|
|
Merge recent cleanups to the x86 MM code to resolve a conflict.
Conflicts:
arch/x86/mm/fault.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Merge in the recent paravirt changes to resolve conflicts caused
by objtool annotations.
Conflicts:
arch/x86/xen/xen-asm.S
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Just like MSI/MSI-X, IO-APIC interrupts are remapped by Microsoft
Hypervisor when Linux runs as the root partition. Implement an IRQ
domain to handle mapping and unmapping of IO-APIC interrupts.
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Joerg Roedel <joro@8bytes.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210203150435.27941-17-wei.liu@kernel.org
|
|
When Linux runs as the root partition on Microsoft Hypervisor, its
interrupts are remapped. Linux will need to explicitly map and unmap
interrupts for hardware.
Implement an MSI domain to issue the correct hypercalls. And initialize
this irq domain as the default MSI irq domain.
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210203150435.27941-16-wei.liu@kernel.org
|
|
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210203150435.27941-15-wei.liu@kernel.org
|
|
We will soon need to access fields inside the MSI address and MSI data
fields. Introduce hv_msi_address_register and hv_msi_data_register.
Fix up one user of hv_msi_entry in mshyperv.h.
No functional change expected.
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210203150435.27941-12-wei.liu@kernel.org
|
|
They are used to deposit pages into Microsoft Hypervisor and bring up
logical and virtual processors.
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Co-Developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Co-Developed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210203150435.27941-10-wei.liu@kernel.org
|
|
We will need the partition ID for executing some hypercalls later.
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210203150435.27941-7-wei.liu@kernel.org
|
|
When Linux runs as the root partition, it will need to make hypercalls
which return data from the hypervisor.
Allocate pages for storing results when Linux runs as the root
partition.
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Co-Developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210203150435.27941-6-wei.liu@kernel.org
|
|
For now we can use the privilege flag to check. Stash the value to be
used later.
Put in a bunch of defines for future use when we want to have more
fine-grained detection.
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210203150435.27941-3-wei.liu@kernel.org
|
|
If bit 22 of Group B Features is set, the guest has access to the
Isolation Configuration CPUID leaf. On x86, the first four bits
of EAX in this leaf provide the isolation type of the partition;
we entail three isolation types: 'SNP' (hardware-based isolation),
'VBS' (software-based isolation), and 'NONE' (no isolation).
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: x86@kernel.org
Cc: linux-arch@vger.kernel.org
Link: https://lore.kernel.org/r/20210201144814.2701-2-parri.andrea@gmail.com
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
|
There is no reason to have this as a seperate function for a single caller.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002513.382806685@linutronix.de
|
|
To prepare for inlining do_softirq_own_stack() replace
__ARCH_HAS_DO_SOFTIRQ with a Kconfig switch and select it in the affected
architectures.
This allows in the next step to move the function prototype and the inline
stub into a seperate asm-generic header file which is required to avoid
include recursion.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002513.181713427@linutronix.de
|
|
Now that all invocations of irq_exit_rcu() happen on the irq stack, turn on
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK which causes the core code to invoke
__do_softirq() directly without going through do_softirq_own_stack().
That means do_softirq_own_stack() is only invoked from task context which
means it can't be on the irq stack. Remove the conditional from
run_softirq_on_irqstack_cond() and rename the function accordingly.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002513.068033456@linutronix.de
|
|
Use the new inline stack switching and remove the old ASM indirect call
implementation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002512.972714001@linutronix.de
|
|
Convert device interrupts to inline stack switching by replacing the
existing macro implementation with the new inline version. Tweak the
function signature of the actual handler function to have the vector
argument as u32. That allows the inline macro to avoid extra intermediates
and lets the compiler be smarter about the whole thing.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002512.769728139@linutronix.de
|
|
To inline the stack switching and to prepare for enabling
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK provide a macro template for system
vectors and device interrupts and convert the system vectors over to it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002512.676197354@linutronix.de
|
|
The effort to make the ASM entry code slim and unified moved the irq stack
switching out of the low level ASM code so that the whole return from
interrupt work and state handling can be done in C and the ASM code just
handles the low level details of entry and exit.
This ended up being a suboptimal implementation for various reasons
(including tooling). The main pain points are:
- The indirect call which is expensive thanks to retpoline
- The inability to stay on the irq stack for softirq processing on return
from interrupt
- The fact that the stack switching code ends up being an easy to target
exploit gadget.
Prepare for inlining the stack switching logic into the C entry points by
providing a ASM macro which contains the guts of the switching mechanism:
1) Store RSP at the top of the irq stack
2) Switch RSP to the irq stack
3) Invoke code
4) Pop the original RSP back
Document the unholy asm() logic while at it to reduce the amount of head
scratching required a half year from now.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002512.578371068@linutronix.de
|
|
The per CPU hardirq_stack_ptr contains the pointer to the irq stack in the
form that it is ready to be assigned to [ER]SP so that the first push ends
up on the top entry of the stack.
But the stack switching on 64 bit has the following rules:
1) Store the current stack pointer (RSP) in the top most stack entry
to allow the unwinder to link back to the previous stack
2) Set RSP to the top most stack entry
3) Invoke functions on the irq stack
4) Pop RSP from the top most stack entry (stored in #1) so it's back
to the original stack.
That requires all stack switching code to decrement the stored pointer by 8
in order to be able to store the current RSP and then set RSP to that
location. That's a pointless exercise.
Do the -8 adjustment right when storing the pointer and make the data type
a void pointer to avoid confusion vs. the struct irq_stack data type which
is on 64bit only used to declare the backing store. Move the definition
next to the inuse flag so they likely end up in the same cache
line. Sticking them into a struct to enforce it is a seperate change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002512.354260928@linutronix.de
|
|
The recursion protection for hard interrupt stacks is an unsigned int per
CPU variable initialized to -1 named __irq_count.
The irq stack switching is only done when the variable is -1, which creates
worse code than just checking for 0. When the stack switching happens it
uses this_cpu_add/sub(1), but there is no reason to do so. It simply can
use straight writes. This is a historical leftover from the low level ASM
code which used inc and jz to make a decision.
Rename it to hardirq_stack_inuse, make it a bool and use plain stores.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002512.228830141@linutronix.de
|
|
efi_recover_from_page_fault() doesn't recover -- it does a special EFI
mini-oops. Rename it to make it clear that it crashes.
While renaming it, I noticed a blatant bug: a page fault oops in a
different thread happening concurrently with an EFI runtime service call
would be misinterpreted as an EFI page fault. Fix that.
This isn't quite exact. The situation could be improved by using a
special CS for calls into EFI.
[ bp: Massage commit message and simplify in interrupt check. ]
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/f43b1e80830dc78ed60ed8b0826f4f189254570c.1612924255.git.luto@kernel.org
|
|
POPF is a rather expensive operation, so don't use it for restoring
irq flags. Instead, test whether interrupts are enabled in the flags
parameter and enable interrupts via STI in that case.
This results in the restore_fl paravirt op to be no longer needed.
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210120135555.32594-7-jgross@suse.com
|
|
USERGS_SYSRET64 is used to return from a syscall via SYSRET, but
a Xen PV guest will nevertheless use the IRET hypercall, as there
is no sysret PV hypercall defined.
So instead of testing all the prerequisites for doing a sysret and
then mangling the stack for Xen PV again for doing an iret just use
the iret exit from the beginning.
This can easily be done via an ALTERNATIVE like it is done for the
sysenter compat case already.
It should be noted that this drops the optimization in Xen for not
restoring a few registers when returning to user mode, but it seems
as if the saved instructions in the kernel more than compensate for
this drop (a kernel build in a Xen PV guest was slightly faster with
this patch applied).
While at it remove the stale sysret32 remnants.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210120135555.32594-6-jgross@suse.com
|
|
SWAPGS is used only for interrupts coming from user mode or for
returning to user mode. So there is no reason to use the PARAVIRT
framework, as it can easily be replaced by an ALTERNATIVE depending
on X86_FEATURE_XENPV.
There are several instances using the PV-aware SWAPGS macro in paths
which are never executed in a Xen PV guest. Replace those with the
plain swapgs instruction. For SWAPGS_UNSAFE_STACK the same applies.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210120135555.32594-5-jgross@suse.com
|
|
Xen PV guests don't use IST. For double fault interrupts, switch to
the same model as NMI.
Correct a typo in a comment while copying it.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210120135555.32594-4-jgross@suse.com
|
|
Xen PV guests don't use IST. For machine check interrupts, switch to the
same model as debug interrupts.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210120135555.32594-3-jgross@suse.com
|