summaryrefslogtreecommitdiff
path: root/arch/x86/xen
AgeCommit message (Collapse)AuthorFilesLines
2021-04-29Merge tag 'x86-mm-2021-04-29' of ↵Linus Torvalds1-6/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 tlb updates from Ingo Molnar: "The x86 MM changes in this cycle were: - Implement concurrent TLB flushes, which overlaps the local TLB flush with the remote TLB flush. In testing this improved sysbench performance measurably by a couple of percentage points, especially if TLB-heavy security mitigations are active. - Further micro-optimizations to improve the performance of TLB flushes" * tag 'x86-mm-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: smp: Micro-optimize smp_call_function_many_cond() smp: Inline on_each_cpu_cond() and on_each_cpu() x86/mm/tlb: Remove unnecessary uses of the inline keyword cpumask: Mark functions as pure x86/mm/tlb: Do not make is_lazy dirty for no reason x86/mm/tlb: Privatize cpu_tlbstate x86/mm/tlb: Flush remote and local TLBs concurrently x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy() x86/mm/tlb: Unify flush_tlb_func_local() and flush_tlb_func_remote() smp: Run functions concurrently in smp_call_function_many_cond()
2021-04-28Merge tag 'x86_core_for_v5.13' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 updates from Borislav Petkov: - Turn the stack canary into a normal __percpu variable on 32-bit which gets rid of the LAZY_GS stuff and a lot of code. - Add an insn_decode() API which all users of the instruction decoder should preferrably use. Its goal is to keep the details of the instruction decoder away from its users and simplify and streamline how one decodes insns in the kernel. Convert its users to it. - kprobes improvements and fixes - Set the maximum DIE per package variable on Hygon - Rip out the dynamic NOP selection and simplify all the machinery around selecting NOPs. Use the simplified NOPs in objtool now too. - Add Xeon Sapphire Rapids to list of CPUs that support PPIN - Simplify the retpolines by folding the entire thing into an alternative now that objtool can handle alternatives with stack ops. Then, have objtool rewrite the call to the retpoline with the alternative which then will get patched at boot time. - Document Intel uarch per models in intel-family.h - Make Sub-NUMA Clustering topology the default and Cluster-on-Die the exception on Intel. * tag 'x86_core_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits) x86, sched: Treat Intel SNC topology as default, COD as exception x86/cpu: Comment Skylake server stepping too x86/cpu: Resort and comment Intel models objtool/x86: Rewrite retpoline thunk calls objtool: Skip magical retpoline .altinstr_replacement objtool: Cache instruction relocs objtool: Keep track of retpoline call sites objtool: Add elf_create_undef_symbol() objtool: Extract elf_symbol_add() objtool: Extract elf_strtab_concat() objtool: Create reloc sections implicitly objtool: Add elf_create_reloc() helper objtool: Rework the elf_rebuild_reloc_section() logic objtool: Fix static_call list generation objtool: Handle per arch retpoline naming objtool: Correctly handle retpoline thunk calls x86/retpoline: Simplify retpolines x86/alternatives: Optimize optimize_nops() x86: Add insn_decode_kernel() x86/kprobes: Move 'inline' to the beginning of the kprobe_is_ss() declaration ...
2021-04-26Merge tag 'x86_cleanups_for_v5.13' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 cleanups from Borislav Petkov: "Trivial cleanups and fixes all over the place" * tag 'x86_cleanups_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Remove me from IDE/ATAPI section x86/pat: Do not compile stubbed functions when X86_PAT is off x86/asm: Ensure asm/proto.h can be included stand-alone x86/platform/intel/quark: Fix incorrect kernel-doc comment syntax in files x86/msr: Make locally used functions static x86/cacheinfo: Remove unneeded dead-store initialization x86/process/64: Move cpu_current_top_of_stack out of TSS tools/turbostat: Unmark non-kernel-doc comment x86/syscalls: Fix -Wmissing-prototypes warnings from COND_SYSCALL() x86/fpu/math-emu: Fix function cast warning x86/msr: Fix wr/rdmsr_safe_regs_on_cpu() prototypes x86: Fix various typos in comments, take #2 x86: Remove unusual Unicode characters from comments x86/kaslr: Return boolean values from a function returning bool x86: Fix various typos in comments x86/setup: Remove unused RESERVE_BRK_ARRAY() stacktrace: Move documentation for arch_stack_walk_reliable() to header x86: Remove duplicate TSC DEADLINE MSR definitions
2021-04-26Merge tag 'x86_alternatives_for_v5.13' of ↵Linus Torvalds2-16/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 alternatives/paravirt updates from Borislav Petkov: "First big cleanup to the paravirt infra to use alternatives and thus eliminate custom code patching. For that, the alternatives infrastructure is extended to accomodate paravirt's needs and, as a result, a lot of paravirt patching code goes away, leading to a sizeable cleanup and simplification. Work by Juergen Gross" * tag 'x86_alternatives_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/paravirt: Have only one paravirt patch function x86/paravirt: Switch functions with custom code to ALTERNATIVE x86/paravirt: Add new PVOP_ALT* macros to support pvops in ALTERNATIVEs x86/paravirt: Switch iret pvops to ALTERNATIVE x86/paravirt: Simplify paravirt macros x86/paravirt: Remove no longer needed 32-bit pvops cruft x86/paravirt: Add new features for paravirt patching x86/alternative: Use ALTERNATIVE_TERNARY() in _static_cpu_has() x86/alternative: Support ALTERNATIVE_TERNARY x86/alternative: Support not-feature x86/paravirt: Switch time pvops functions to use static_call() static_call: Add function to query current function static_call: Move struct static_call_key definition to static_call_types.h x86/alternative: Merge include files x86/alternative: Drop unused feature parameter from ALTINSTR_REPLACEMENT()
2021-04-02Merge tag 'v5.12-rc5' into WIP.x86/core, to pick up recent NOP related changesIngo Molnar2-7/+16
In particular we want to have this upstream commit: b90829704780: ("bpf: Use NOP_ATOMIC5 instead of emit_nops(&prog, 5) for BPF_TRAMP_F_CALL_ORIG") ... before merging in x86/cpu changes and the removal of the NOP optimizations, and applying PeterZ's !retpoline objtool series. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-03-31Merge 'x86/alternatives'Borislav Petkov2-16/+14
Pick up dependent changes. Signed-off-by: Borislav Petkov <bp@suse.de>
2021-03-26Merge tag 'for-linus-5.12b-rc5-tag' of ↵Linus Torvalds2-7/+16
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "This contains a small series with a more elegant fix of a problem which was originally fixed in rc2" * tag 'for-linus-5.12b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: Revert "xen: fix p2m size in dom0 for disabled memory hotplug case" xen/x86: make XEN_BALLOON_MEMORY_HOTPLUG_LIMIT depend on MEMORY_HOTPLUG
2021-03-25Revert "xen: fix p2m size in dom0 for disabled memory hotplug case"Roger Pau Monne2-5/+14
This partially reverts commit 882213990d32 ("xen: fix p2m size in dom0 for disabled memory hotplug case") There's no need to special case XEN_UNPOPULATED_ALLOC anymore in order to correctly size the p2m. The generic memory hotplug option has already been tied together with the Xen hotplug limit, so enabling memory hotplug should already trigger a properly sized p2m on Xen PV. Note that XEN_UNPOPULATED_ALLOC depends on ZONE_DEVICE which pulls in MEMORY_HOTPLUG. Leave the check added to __set_phys_to_machine and the adjusted comment about EXTRA_MEM_RATIO. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210324122424.58685-3-roger.pau@citrix.com [boris: fixed formatting issues] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-03-25xen/x86: make XEN_BALLOON_MEMORY_HOTPLUG_LIMIT depend on MEMORY_HOTPLUGRoger Pau Monne1-2/+2
The Xen memory hotplug limit should depend on the memory hotplug generic option, rather than the Xen balloon configuration. It's possible to have a kernel with generic memory hotplug enabled, but without Xen balloon enabled, at which point memory hotplug won't work correctly due to the size limitation of the p2m. Rename the option to XEN_MEMORY_HOTPLUG_LIMIT since it's no longer tied to ballooning. Fixes: 9e2369c06c8a18 ("xen: add helpers to allocate unpopulated memory") Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210324122424.58685-2-roger.pau@citrix.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-03-22x86: Fix various typos in comments, take #2Ingo Molnar1-1/+1
Fix another ~42 single-word typos in arch/x86/ code comments, missed a few in the first pass, in particular in .S files. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-kernel@vger.kernel.org
2021-03-15Merge tag 'v5.12-rc3' into x86/coreBorislav Petkov1-4/+2
Pick up dependent SEV-ES urgent changes to base new work ontop. Signed-off-by: Borislav Petkov <bp@suse.de>
2021-03-12Merge tag 'for-linus-5.12b-rc3-tag' of ↵Linus Torvalds1-4/+2
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Two fix series and a single cleanup: - a small cleanup patch to remove unneeded symbol exports - a series to cleanup Xen grant handling (avoiding allocations in some cases, and using common defines for "invalid" values) - a series to address a race issue in Xen event channel handling" * tag 'for-linus-5.12b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: Xen/gntdev: don't needlessly use kvcalloc() Xen/gnttab: introduce common INVALID_GRANT_{HANDLE,REF} Xen/gntdev: don't needlessly allocate k{,un}map_ops[] Xen: drop exports of {set,clear}_foreign_p2m_mapping() xen/events: avoid handling the same event on two cpus at the same time xen/events: don't unmask an event channel when an eoi is pending xen/events: reset affinity of 2-level event when tearing it down
2021-03-11x86/paravirt: Have only one paravirt patch functionJuergen Gross1-1/+0
There is no need any longer to have different paravirt patch functions for native and Xen. Eliminate native_patch() and rename paravirt_patch_default() to paravirt_patch(). Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-15-jgross@suse.com
2021-03-11x86/paravirt: Switch iret pvops to ALTERNATIVEJuergen Gross1-2/+1
The iret paravirt op is rather special as it is using a jmp instead of a call instruction. Switch it to ALTERNATIVE. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-12-jgross@suse.com
2021-03-11x86/paravirt: Switch time pvops functions to use static_call()Juergen Gross1-13/+13
The time pvops functions are the only ones left which might be used in 32-bit mode and which return a 64-bit value. Switch them to use the static_call() mechanism instead of pvops, as this allows quite some simplification of the pvops implementation. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-5-jgross@suse.com
2021-03-11Xen/gnttab: introduce common INVALID_GRANT_{HANDLE,REF}Jan Beulich1-2/+2
It's not helpful if every driver has to cook its own. Generalize xenbus'es INVALID_GRANT_HANDLE and pcifront's INVALID_GRANT_REF (which shouldn't have expanded to zero to begin with). Use the constants in p2m.c and gntdev.c right away, and update field types where necessary so they would match with the constants' types (albeit without touching struct ioctl_gntdev_grant_ref's ref field, as that's part of the public interface of the kernel and would require introducing a dependency on Xen's grant_table.h public header). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/db7c38a5-0d75-d5d1-19de-e5fe9f0b9c48@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-03-11Xen: drop exports of {set,clear}_foreign_p2m_mapping()Jan Beulich1-2/+0
They're only used internally, and the layering violation they contain (x86) or imply (Arm) of calling HYPERVISOR_grant_table_op() strongly advise against any (uncontrolled) use from a module. The functions also never had users except the ones from drivers/xen/grant-table.c forever since their introduction in 3.15. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Link: https://lore.kernel.org/r/746a5cd6-1446-eda4-8b23-03c1cac30b8d@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-03-08x86/stackprotector/32: Make the canary into a regular percpu variableAndy Lutomirski1-1/+0
On 32-bit kernels, the stackprotector canary is quite nasty -- it is stored at %gs:(20), which is nasty because 32-bit kernels use %fs for percpu storage. It's even nastier because it means that whether %gs contains userspace state or kernel state while running kernel code depends on whether stackprotector is enabled (this is CONFIG_X86_32_LAZY_GS), and this setting radically changes the way that segment selectors work. Supporting both variants is a maintenance and testing mess. Merely rearranging so that percpu and the stack canary share the same segment would be messy as the 32-bit percpu address layout isn't currently compatible with putting a variable at a fixed offset. Fortunately, GCC 8.1 added options that allow the stack canary to be accessed as %fs:__stack_chk_guard, effectively turning it into an ordinary percpu variable. This lets us get rid of all of the code to manage the stack canary GDT descriptor and the CONFIG_X86_32_LAZY_GS mess. (That name is special. We could use any symbol we want for the %fs-relative mode, but for CONFIG_SMP=n, gcc refuses to let us use any name other than __stack_chk_guard.) Forcibly disable stackprotector on older compilers that don't support the new options and turn the stack canary into a percpu variable. The "lazy GS" approach is now used for all 32-bit configurations. Also makes load_gs_index() work on 32-bit kernels. On 64-bit kernels, it loads the GS selector and updates the user GSBASE accordingly. (This is unchanged.) On 32-bit kernels, it loads the GS selector and updates GSBASE, which is now always the user base. This means that the overall effect is the same on 32-bit and 64-bit, which avoids some ifdeffery. [ bp: Massage commit message. ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/c0ff7dba14041c7e5d1cae5d4df052f03759bef3.1613243844.git.luto@kernel.org
2021-03-06x86/mm/tlb: Flush remote and local TLBs concurrentlyNadav Amit1-6/+5
To improve TLB shootdown performance, flush the remote and local TLBs concurrently. Introduce flush_tlb_multi() that does so. Introduce paravirtual versions of flush_tlb_multi() for KVM, Xen and hyper-v (Xen and hyper-v are only compile-tested). While the updated smp infrastructure is capable of running a function on a single local core, it is not optimized for this case. The multiple function calls and the indirect branch introduce some overhead, and might make local TLB flushes slower than they were before the recent changes. Before calling the SMP infrastructure, check if only a local TLB flush is needed to restore the lost performance in this common case. This requires to check mm_cpumask() one more time, but unless this mask is updated very frequently, this should impact performance negatively. Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com> # Hyper-v parts Reviewed-by: Juergen Gross <jgross@suse.com> # Xen and paravirt parts Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20210220231712.2475218-5-namit@vmware.com
2021-03-04Merge tag 'for-linus-5.12b-rc2-tag' of ↵Linus Torvalds2-29/+50
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Two security issues (XSA-367 and XSA-369)" * tag 'for-linus-5.12b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: fix p2m size in dom0 for disabled memory hotplug case xen-netback: respect gnttab_map_refs()'s return value Xen/gnttab: handle p2m update errors on a per-slot basis
2021-03-03xen: fix p2m size in dom0 for disabled memory hotplug caseJuergen Gross2-26/+9
Since commit 9e2369c06c8a18 ("xen: add helpers to allocate unpopulated memory") foreign mappings are using guest physical addresses allocated via ZONE_DEVICE functionality. This will result in problems for the case of no balloon memory hotplug being configured, as the p2m list will only cover the initial memory size of the domain. Any ZONE_DEVICE allocated address will be outside the p2m range and thus a mapping can't be established with that memory address. Fix that by extending the p2m size for that case. At the same time add a check for a to be created mapping to be within the p2m limits in order to detect errors early. While changing a comment, remove some 32-bit leftovers. This is XSA-369. Fixes: 9e2369c06c8a18 ("xen: add helpers to allocate unpopulated memory") Cc: <stable@vger.kernel.org> # 5.9 Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2021-03-03Xen/gnttab: handle p2m update errors on a per-slot basisJan Beulich1-3/+41
Bailing immediately from set_foreign_p2m_mapping() upon a p2m updating error leaves the full batch in an ambiguous state as far as the caller is concerned. Instead flags respective slots as bad, unmapping what was mapped there right away. HYPERVISOR_grant_table_op()'s return value and the individual unmap slots' status fields get used only for a one-time - there's not much we can do in case of a failure. Note that there's no GNTST_enomem or alike, so GNTST_general_error gets used. The map ops' handle fields get overwritten just to be on the safe side. This is part of XSA-367. Cc: <stable@vger.kernel.org> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/96cccf5d-e756-5f53-b91a-ea269bfb9be0@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-02-23Merge tag 'objtool-core-2021-02-23' of ↵Linus Torvalds3-13/+21
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Thomas Gleixner: - Make objtool work for big-endian cross compiles - Make stack tracking via stack pointer memory operations match push/pop semantics to prepare for architectures w/o PUSH/POP instructions. - Add support for analyzing alternatives - Improve retpoline detection and handling - Improve assembly code coverage on x86 - Provide support for inlined stack switching * tag 'objtool-core-2021-02-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) objtool: Support stack-swizzle objtool,x86: Additionally decode: mov %rsp, (%reg) x86/unwind/orc: Change REG_SP_INDIRECT x86/power: Support objtool validation in hibernate_asm_64.S x86/power: Move restore_registers() to top of the file x86/power: Annotate indirect branches as safe x86/acpi: Support objtool validation in wakeup_64.S x86/acpi: Annotate indirect branch as safe x86/ftrace: Support objtool vmlinux.o validation in ftrace_64.S x86/xen/pvh: Annotate indirect branch as safe x86/xen: Support objtool vmlinux.o validation in xen-head.S x86/xen: Support objtool validation in xen-asm.S objtool: Add xen_start_kernel() to noreturn list objtool: Combine UNWIND_HINT_RET_OFFSET and UNWIND_HINT_FUNC objtool: Add asm version of STACK_FRAME_NON_STANDARD objtool: Assume only ELF functions do sibling calls x86/ftrace: Add UNWIND_HINT_FUNC annotation for ftrace_stub objtool: Support retpoline jump detection for vmlinux.o objtool: Fix ".cold" section suffix check for newer versions of GCC objtool: Fix retpoline detection in asm code ...
2021-02-22Merge tag 'for-linus-5.12-rc1-tag' of ↵Linus Torvalds1-8/+7
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: "A series of Xen related security fixes, all related to limited error handling in Xen backend drivers" * tag 'for-linus-5.12-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen-blkback: fix error handling in xen_blkbk_map() xen-scsiback: don't "handle" error by BUG() xen-netback: don't "handle" error by BUG() xen-blkback: don't "handle" error by BUG() xen/arm: don't ignore return errors from set_phys_to_machine Xen/gntdev: correct error checking in gntdev_map_grant_pages() Xen/gntdev: correct dev_bus_addr handling in gntdev_map_grant_pages() Xen/x86: also check kernel mapping in set_foreign_p2m_mapping() Xen/x86: don't bail early from clear_foreign_p2m_mapping()
2021-02-15Xen/x86: also check kernel mapping in set_foreign_p2m_mapping()Jan Beulich1-1/+2
We should not set up further state if either mapping failed; paying attention to just the user mapping's status isn't enough. Also use GNTST_okay instead of implying its value (zero). This is part of XSA-361. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2021-02-15Xen/x86: don't bail early from clear_foreign_p2m_mapping()Jan Beulich1-7/+5
Its sibling (set_foreign_p2m_mapping()) as well as the sibling of its only caller (gnttab_map_refs()) don't clean up after themselves in case of error. Higher level callers are expected to do so. However, in order for that to really clean up any partially set up state, the operation should not terminate upon encountering an entry in unexpected state. It is particularly relevant to notice here that set_foreign_p2m_mapping() would skip setting up a p2m entry if its grant mapping failed, but it would continue to set up further p2m entries as long as their mappings succeeded. Arguably down the road set_foreign_p2m_mapping() may want its page state related WARN_ON() also converted to an error return. This is part of XSA-361. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2021-02-10x86/pv: Rework arch_local_irq_restore() to not use popfJuergen Gross4-54/+0
POPF is a rather expensive operation, so don't use it for restoring irq flags. Instead, test whether interrupts are enabled in the flags parameter and enable interrupts via STI in that case. This results in the restore_fl paravirt op to be no longer needed. Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210120135555.32594-7-jgross@suse.com
2021-02-10x86/xen: Drop USERGS_SYSRET64 paravirt callJuergen Gross3-23/+0
USERGS_SYSRET64 is used to return from a syscall via SYSRET, but a Xen PV guest will nevertheless use the IRET hypercall, as there is no sysret PV hypercall defined. So instead of testing all the prerequisites for doing a sysret and then mangling the stack for Xen PV again for doing an iret just use the iret exit from the beginning. This can easily be done via an ALTERNATIVE like it is done for the sysenter compat case already. It should be noted that this drops the optimization in Xen for not restoring a few registers when returning to user mode, but it seems as if the saved instructions in the kernel more than compensate for this drop (a kernel build in a Xen PV guest was slightly faster with this patch applied). While at it remove the stale sysret32 remnants. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210120135555.32594-6-jgross@suse.com
2021-02-10x86/pv: Switch SWAPGS to ALTERNATIVEJuergen Gross1-3/+0
SWAPGS is used only for interrupts coming from user mode or for returning to user mode. So there is no reason to use the PARAVIRT framework, as it can easily be replaced by an ALTERNATIVE depending on X86_FEATURE_XENPV. There are several instances using the PV-aware SWAPGS macro in paths which are never executed in a Xen PV guest. Replace those with the plain swapgs instruction. For SWAPGS_UNSAFE_STACK the same applies. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andy Lutomirski <luto@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210120135555.32594-5-jgross@suse.com
2021-02-10x86/xen: Use specific Xen pv interrupt entry for DFJuergen Gross2-3/+9
Xen PV guests don't use IST. For double fault interrupts, switch to the same model as NMI. Correct a typo in a comment while copying it. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210120135555.32594-4-jgross@suse.com
2021-02-10x86/xen: Use specific Xen pv interrupt entry for MCEJuergen Gross2-2/+16
Xen PV guests don't use IST. For machine check interrupts, switch to the same model as debug interrupts. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210120135555.32594-3-jgross@suse.com
2021-01-28Merge tag 'for-linus-5.11-rc6-tag' of ↵Linus Torvalds2-1/+15
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - A fix for a regression introduced in 5.11 resulting in Xen dom0 having problems to correctly initialize Xenstore. - A fix for avoiding WARN splats when booting as Xen dom0 with CONFIG_AMD_MEM_ENCRYPT enabled due to a missing trap handler for the #VC exception (even if the handler should never be called). - A fix for the Xen bklfront driver adapting to the correct but unexpected behavior of new qemu. * tag 'for-linus-5.11-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: avoid warning in Xen pv guest with CONFIG_AMD_MEM_ENCRYPT enabled xen: Fix XenStore initialisation for XS_LOCAL xen-blkfront: allow discard-* nodes to be optional
2021-01-27x86/xen: avoid warning in Xen pv guest with CONFIG_AMD_MEM_ENCRYPT enabledJuergen Gross2-1/+15
When booting a kernel which has been built with CONFIG_AMD_MEM_ENCRYPT enabled as a Xen pv guest a warning is issued for each processor: [ 5.964347] ------------[ cut here ]------------ [ 5.968314] WARNING: CPU: 0 PID: 1 at /home/gross/linux/head/arch/x86/xen/enlighten_pv.c:660 get_trap_addr+0x59/0x90 [ 5.972321] Modules linked in: [ 5.976313] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 5.11.0-rc5-default #75 [ 5.980313] Hardware name: Dell Inc. OptiPlex 9020/0PC5F7, BIOS A05 12/05/2013 [ 5.984313] RIP: e030:get_trap_addr+0x59/0x90 [ 5.988313] Code: 42 10 83 f0 01 85 f6 74 04 84 c0 75 1d b8 01 00 00 00 c3 48 3d 00 80 83 82 72 08 48 3d 20 81 83 82 72 0c b8 01 00 00 00 eb db <0f> 0b 31 c0 c3 48 2d 00 80 83 82 48 ba 72 1c c7 71 1c c7 71 1c 48 [ 5.992313] RSP: e02b:ffffc90040033d38 EFLAGS: 00010202 [ 5.996313] RAX: 0000000000000001 RBX: ffffffff82a141d0 RCX: ffffffff8222ec38 [ 6.000312] RDX: ffffffff8222ec38 RSI: 0000000000000005 RDI: ffffc90040033d40 [ 6.004313] RBP: ffff8881003984a0 R08: 0000000000000007 R09: ffff888100398000 [ 6.008312] R10: 0000000000000007 R11: ffffc90040246000 R12: ffff8884082182a8 [ 6.012313] R13: 0000000000000100 R14: 000000000000001d R15: ffff8881003982d0 [ 6.016316] FS: 0000000000000000(0000) GS:ffff888408200000(0000) knlGS:0000000000000000 [ 6.020313] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6.024313] CR2: ffffc900020ef000 CR3: 000000000220a000 CR4: 0000000000050660 [ 6.028314] Call Trace: [ 6.032313] cvt_gate_to_trap.part.7+0x3f/0x90 [ 6.036313] ? asm_exc_double_fault+0x30/0x30 [ 6.040313] xen_convert_trap_info+0x87/0xd0 [ 6.044313] xen_pv_cpu_up+0x17a/0x450 [ 6.048313] bringup_cpu+0x2b/0xc0 [ 6.052313] ? cpus_read_trylock+0x50/0x50 [ 6.056313] cpuhp_invoke_callback+0x80/0x4c0 [ 6.060313] _cpu_up+0xa7/0x140 [ 6.064313] cpu_up+0x98/0xd0 [ 6.068313] bringup_nonboot_cpus+0x4f/0x60 [ 6.072313] smp_init+0x26/0x79 [ 6.076313] kernel_init_freeable+0x103/0x258 [ 6.080313] ? rest_init+0xd0/0xd0 [ 6.084313] kernel_init+0xa/0x110 [ 6.088313] ret_from_fork+0x1f/0x30 [ 6.092313] ---[ end trace be9ecf17dceeb4f3 ]--- Reason is that there is no Xen pv trap entry for X86_TRAP_VC. Fix that by adding a generic trap handler for unknown traps and wire all unknown bare metal handlers to this generic handler, which will just crash the system in case such a trap will ever happen. Fixes: 0786138c78e793 ("x86/sev-es: Add a Runtime #VC Exception Handler") Cc: <stable@vger.kernel.org> # v5.10 Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2021-01-26x86/xen: Support objtool vmlinux.o validation in xen-head.SJosh Poimboeuf1-2/+3
The Xen hypercall page is filled with zeros, causing objtool to fall through all the empty hypercall functions until it reaches a real function, resulting in a stack state mismatch. The build-time contents of the hypercall page don't matter because the page gets rewritten by the hypervisor. Make it more palatable to objtool by making each hypervisor function a true empty function, with nops and a return. Cc: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/0883bde1d7a1fb3b6a4c952bc0200e873752f609.1611263462.git.jpoimboe@redhat.com
2021-01-26x86/xen: Support objtool validation in xen-asm.SJosh Poimboeuf2-11/+19
The OBJECT_FILES_NON_STANDARD annotation is used to tell objtool to ignore a file. File-level ignores won't work when validating vmlinux.o. Tweak the ELF metadata and unwind hints to allow objtool to follow the code. Cc: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/8b042a09c69e8645f3b133ef6653ba28f896807d.1611263462.git.jpoimboe@redhat.com
2021-01-20Merge tag 'for-linus-5.11-rc5-tag' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fix from Juergen Gross: "A fix for build failure showing up in some configurations" * tag 'for-linus-5.11-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: fix 'nopvspin' build error
2021-01-18x86/xen: fix 'nopvspin' build errorRandy Dunlap1-0/+2
Fix build error in x86/xen/ when PARAVIRT_SPINLOCKS is not enabled. Fixes this build error: ../arch/x86/xen/smp_hvm.c: In function ‘xen_hvm_smp_init’: ../arch/x86/xen/smp_hvm.c:77:3: error: ‘nopvspin’ undeclared (first use in this function) nopvspin = true; Fixes: 3d7746bea925 ("x86/xen: Fix xen_hvm_smp_init() when vector callback not available") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20210115191123.27572-1-rdunlap@infradead.org Signed-off-by: Juergen Gross <jgross@suse.com>
2021-01-15Merge tag 'for-linus-5.11-rc4-tag' of ↵Linus Torvalds2-13/+29
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - A series to fix a regression when running as a fully virtualized guest on an old Xen hypervisor not supporting PV interrupt callbacks for HVM guests. - A patch to add support to query Xen resource sizes (setting was possible already) from user mode. * tag 'for-linus-5.11-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: Fix xen_hvm_smp_init() when vector callback not available x86/xen: Don't register Xen IPIs when they aren't going to be used x86/xen: Add xen_no_vector_callback option to test PCI INTX delivery xen: Set platform PCI device INTX affinity to CPU0 xen: Fix event channel callback via INTX/GSI xen/privcmd: allow fetching resource sizes
2021-01-13x86/xen: Fix xen_hvm_smp_init() when vector callback not availableDavid Woodhouse1-10/+17
Only the IPI-related functions in the smp_ops should be conditional on the vector callback being available. The rest should still happen: • xen_hvm_smp_prepare_boot_cpu() This function does two things, both of which should still happen if there is no vector callback support. The call to xen_vcpu_setup() for vCPU0 should still happen as it just sets up the vcpu_info for CPU0. That does happen for the secondary vCPUs too, from xen_cpu_up_prepare_hvm(). The second thing it does is call xen_init_spinlocks(), which perhaps counter-intuitively should *also* still be happening in the case without vector callbacks, so that it can clear its local xen_pvspin flag and disable the virt_spin_lock_key accordingly. Checking xen_have_vector_callback in xen_init_spinlocks() itself would affect PV guests, so set the global nopvspin flag in xen_hvm_smp_init() instead, when vector callbacks aren't available. • xen_hvm_smp_prepare_cpus() This does some IPI-related setup by calling xen_smp_intr_init() and xen_init_lock_cpu(), which can be made conditional. And it sets the xen_vcpu_id to XEN_VCPU_ID_INVALID for all possible CPUS, which does need to happen. • xen_smp_cpus_done() This offlines any vCPUs which doesn't fit in the global shared_info page, if separate vcpu_info placement isn't available. That part also needs to happen regardless of vector callback support. • xen_hvm_cpu_die() This doesn't actually do anything other than commin_cpu_die() right right now in the !vector_callback case; all three teardown functions it calls should be no-ops. But to guard against future regressions it's useful to call it anyway, and for it to explicitly check for xen_have_vector_callback before calling those additional functions. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210106153958.584169-6-dwmw2@infradead.org Signed-off-by: Juergen Gross <jgross@suse.com>
2021-01-13x86/xen: Don't register Xen IPIs when they aren't going to be usedDavid Woodhouse1-2/+2
In the case where xen_have_vector_callback is false, we still register the IPI vectors in xen_smp_intr_init() for the secondary CPUs even though they aren't going to be used. Stop doing that. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210106153958.584169-5-dwmw2@infradead.org Signed-off-by: Juergen Gross <jgross@suse.com>
2021-01-13x86/xen: Add xen_no_vector_callback option to test PCI INTX deliveryDavid Woodhouse1-1/+10
It's useful to be able to test non-vector event channel delivery, to make sure Linux will work properly on older Xen which doesn't have it. It's also useful for those working on Xen and Xen-compatible hypervisors, because there are guest kernels still in active use which use PCI INTX even when vector delivery is available. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210106153958.584169-4-dwmw2@infradead.org Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-24Merge tag 'efi_updates_for_v5.11' of ↵Linus Torvalds1-28/+9
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI updates from Borislav Petkov: "These got delayed due to a last minute ia64 build issue which got fixed in the meantime. EFI updates collected by Ard Biesheuvel: - Don't move BSS section around pointlessly in the x86 decompressor - Refactor helper for discovering the EFI secure boot mode - Wire up EFI secure boot to IMA for arm64 - Some fixes for the capsule loader - Expose the RT_PROP table via the EFI test module - Relax DT and kernel placement restrictions on ARM with a few followup fixes: - fix the build breakage on IA64 caused by recent capsule loader changes - suppress a type mismatch build warning in the expansion of EFI_PHYS_ALIGN on ARM" * tag 'efi_updates_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: arm: force use of unsigned type for EFI_PHYS_ALIGN efi: ia64: disable the capsule loader efi: stub: get rid of efi_get_max_fdt_addr() efi/efi_test: read RuntimeServicesSupported efi: arm: reduce minimum alignment of uncompressed kernel efi: capsule: clean scatter-gather entries from the D-cache efi: capsule: use atomic kmap for transient sglist mappings efi: x86/xen: switch to efi_get_secureboot_mode helper arm64/ima: add ima_arch support ima: generalize x86/EFI arch glue for other EFI architectures efi: generalize efi_get_secureboot efi/libstub: EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER should not default to yes efi/x86: Only copy the compressed kernel image in efi_relocate_kernel() efi/libstub/x86: simplify efi_is_native()
2020-12-19Merge tag 'for-linus-5.11-rc1b-tag' of ↵Linus Torvalds2-27/+23
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull more xen updates from Juergen Gross: "Some minor cleanup patches and a small series disentangling some Xen related Kconfig options" * tag 'for-linus-5.11-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Kconfig: remove X86_64 depends from XEN_512GB xen/manage: Fix fall-through warnings for Clang xen-blkfront: Fix fall-through warnings for Clang xen: remove trailing semicolon in macro definition xen: Kconfig: nest Xen guest options xen: Remove Xen PVH/PVHVM dependency on PCI x86/xen: Convert to DEFINE_SHOW_ATTRIBUTE
2020-12-19xen: Kconfig: remove X86_64 depends from XEN_512GBJason Andryuk1-1/+1
commit bfda93aee0ec ("xen: Kconfig: nest Xen guest options") accidentally re-added X86_64 as a dependency to XEN_512GB. It was originally removed in commit a13f2ef168cb ("x86/xen: remove 32-bit Xen PV guest support"). Remove it again. Fixes: bfda93aee0ec ("xen: Kconfig: nest Xen guest options") Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20201216140838.16085-1-jandryuk@gmail.com Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-16xen: Kconfig: nest Xen guest optionsJason Andryuk1-13/+13
Moving XEN_512GB allows it to nest under XEN_PV. That also allows XEN_PVH to nest under XEN as a sibling to XEN_PV and XEN_PVHVM giving: [*] Xen guest support [*] Xen PV guest support [*] Limit Xen pv-domain memory to 512GB [*] Xen PV Dom0 support [*] Xen PVHVM guest support [*] Xen PVH guest support Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20201014175342.152712-3-jandryuk@gmail.com Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-16xen: Remove Xen PVH/PVHVM dependency on PCIJason Andryuk1-6/+12
A Xen PVH domain doesn't have a PCI bus or devices, so it doesn't need PCI support built in. Currently, XEN_PVH depends on XEN_PVHVM which depends on PCI. Introduce XEN_PVHVM_GUEST as a toplevel item and change XEN_PVHVM to a hidden variable. This allows XEN_PVH to depend on XEN_PVHVM without PCI while XEN_PVHVM_GUEST depends on PCI. In drivers/xen, compile platform-pci depending on XEN_PVHVM_GUEST since that pulls in the PCI dependency for linking. Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20201014175342.152712-2-jandryuk@gmail.com Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-16x86/xen: Convert to DEFINE_SHOW_ATTRIBUTEQinglang Miao1-11/+1
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20200917125547.104472-1-miaoqinglang@huawei.com Signed-off-by: Juergen Gross <jgross@suse.com>
2020-12-15Merge tag 'x86-apic-2020-12-14' of ↵Linus Torvalds1-5/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic updates from Thomas Gleixner: "Yet another large set of x86 interrupt management updates: - Simplification and distangling of the MSI related functionality - Let IO/APIC construct the RTE entries from an MSI message instead of having IO/APIC specific code in the interrupt remapping drivers - Make the retrieval of the parent interrupt domain (vector or remap unit) less hardcoded and use the relevant irqdomain callbacks for selection. - Allow the handling of more than 255 CPUs without a virtualized IOMMU when the hypervisor supports it. This has made been possible by the above modifications and also simplifies the existing workaround in the HyperV specific virtual IOMMU. - Cleanup of the historical timer_works() irq flags related inconsistencies" * tag 'x86-apic-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits) x86/ioapic: Cleanup the timer_works() irqflags mess iommu/hyper-v: Remove I/O-APIC ID check from hyperv_irq_remapping_select() iommu/amd: Fix IOMMU interrupt generation in X2APIC mode iommu/amd: Don't register interrupt remapping irqdomain when IR is disabled iommu/amd: Fix union of bitfields in intcapxt support x86/ioapic: Correct the PCI/ISA trigger type selection x86/ioapic: Use I/O-APIC ID for finding irqdomain, not index x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it x86/kvm: Enable 15-bit extension when KVM_FEATURE_MSI_EXT_DEST_ID detected iommu/hyper-v: Disable IRQ pseudo-remapping if 15 bit APIC IDs are available x86/apic: Support 15 bits of APIC ID in MSI where available x86/ioapic: Handle Extended Destination ID field in RTE iommu/vt-d: Simplify intel_irq_remapping_select() x86: Kill all traces of irq_remapping_get_irq_domain() x86/ioapic: Use irq_find_matching_fwspec() to find remapping irqdomain x86/hpet: Use irq_find_matching_fwspec() to find remapping irqdomain iommu/hyper-v: Implement select() method on remapping irqdomain iommu/vt-d: Implement select() method on remapping irqdomain iommu/amd: Implement select() method on remapping irqdomain x86/apic: Add select() method on vector irqdomain ...
2020-11-20Merge tag 'for-linus-5.10b-rc5-tag' of ↵Linus Torvalds1-1/+11
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fix from Juergen Gross: "A single fix for avoiding WARN splats when booting a Xen guest with nosmt" * tag 'for-linus-5.10b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: don't unbind uninitialized lock_kicker_irq
2020-11-17efi: x86/xen: switch to efi_get_secureboot_mode helperArd Biesheuvel1-28/+9
Now that we have a static inline helper to discover the platform's secure boot mode that can be shared between the EFI stub and the kernel proper, switch to it, and drop some comments about keeping them in sync manually. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>