summaryrefslogtreecommitdiff
path: root/arch/x86/xen
AgeCommit message (Collapse)AuthorFilesLines
2021-10-05xen/x86: adjust data placementJan Beulich2-3/+5
Both xen_pvh and xen_start_flags get written just once early during init. Using the respective annotation then allows the open-coded placing in .data to go away. Additionally the former, like the latter, wants exporting, or else xen_pvh_domain() can't be used from modules. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/8155ed26-5a1d-c06f-42d8-596d26e75849@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-10-05xen/x86: hook up xen_banner() also for PVHJan Beulich4-11/+16
This was effectively lost while dropping PVHv1 code. Move the function and arrange for it to be called the same way as done in PV mode. Clearly this then needs re-introducing the XENFEAT_mmu_pt_update_preserve_ad check that was recently removed, as that's a PV-only feature. Since the string pointed at by pv_info.name describes the mode, drop "paravirtualized" from the log message while moving the code. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/de03054d-a20d-2114-bb86-eec28e17b3b8@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-10-05xen/x86: generalize preferred console model from PV to PVH Dom0Jan Beulich4-7/+18
Without announcing hvc0 as preferred it won't get used as long as tty0 gets registered earlier. This is particularly problematic with there not being any screen output for PVH Dom0 when the screen is in graphics mode, as the necessary information doesn't get conveyed yet from the hypervisor. Follow PV's model, but be conservative and do this for Dom0 only for now. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/582328b6-c86c-37f3-d802-5539b7a86736@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-10-05xen/x86: allow "earlyprintk=xen" to work for PV Dom0Jan Beulich1-1/+1
With preferred consoles "tty" and "hvc" announced as preferred, registering "xenboot" early won't result in use of the console: It also needs to be registered as preferred. Generalize this from being DomU- only so far. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/d4a34540-a476-df2c-bca6-732d0d58c5f0@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-10-05xen/x86: allow PVH Dom0 without XEN_PV=yJan Beulich5-25/+31
Decouple XEN_DOM0 from XEN_PV, converting some existing uses of XEN_DOM0 to a new XEN_PV_DOM0. (I'm not convinced all are really / should really be PV-specific, but for starters I've tried to be conservative.) For PVH Dom0 the hypervisor populates MADT with only x2APIC entries, so without x2APIC support enabled in the kernel things aren't going to work very well. (As opposed, DomU-s would only ever see LAPIC entries in MADT as of now.) Note that this then requires PVH Dom0 to be 64-bit, as X86_X2APIC depends on X86_64. In the course of this xen_running_on_version_or_later() needs to be available more broadly. Move it from a PV-specific to a generic file, considering that what it does isn't really PV-specific at all anyway. Note that xen/interface/version.h cannot be included on its own; in enlighten.c, which uses SCHEDOP_* anyway, include xen/interface/sched.h first to resolve the apparently sole missing type (xen_ulong_t). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/983bb72f-53df-b6af-14bd-5e088bd06a08@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-10-05xen/x86: prevent PVH type from getting clobberedJan Beulich1-5/+4
Like xen_start_flags, xen_domain_type gets set before .bss gets cleared. Hence this variable also needs to be prevented from getting put in .bss, which is possible because XEN_NATIVE is an enumerator evaluating to zero. Any use prior to init_hvm_pv_info() setting the variable again would lead to wrong decisions; one such case is xenboot_console_setup() when called as a result of "earlyprintk=xen". Use __ro_after_init as more applicable than either __section(".data") or __read_mostly. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/d301677b-6f22-5ae6-bd36-458e1f323d0b@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-10-05xen/privcmd: drop "pages" parameter from xen_remap_pfn()Jan Beulich1-1/+1
The function doesn't use it and all of its callers say in a comment that their respective arguments are to be non-NULL only in auto-translated mode. Since xen_remap_domain_mfn_array() isn't supposed to be used by non-PV, drop the parameter there as well. It was bogusly passed as non- NULL (PRIV_VMA_LOCKED) by its only caller anyway. For xen_remap_domain_gfn_range(), otoh, it's not clear at all why this wouldn't want / might not need to gain auto-translated support down the road, so the parameter is retained there despite now remaining unused (and the only caller passing NULL); correct a respective comment as well. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/036ad8a2-46f9-ac3d-6219-bdc93ab9e10b@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-21xen/x86: fix PV trap handling on secondary processorsJan Beulich1-6/+9
The initial observation was that in PV mode under Xen 32-bit user space didn't work anymore. Attempts of system calls ended in #GP(0x402). All of the sudden the vector 0x80 handler was not in place anymore. As it turns out up to 5.13 redundant initialization did occur: Once from cpu_initialize_context() (through its VCPUOP_initialise hypercall) and a 2nd time while each CPU was brought fully up. This 2nd initialization is now gone, uncovering that the 1st one was flawed: Unlike for the set_trap_table hypercall, a full virtual IDT needs to be specified here; the "vector" fields of the individual entries are of no interest. With many (kernel) IDT entries still(?) (i.e. at that point at least) empty, the syscall vector 0x80 ended up in slot 0x20 of the virtual IDT, thus becoming the domain's handler for vector 0x20. Make xen_convert_trap_info() fit for either purpose, leveraging the fact that on the xen_copy_trap_info() path the table starts out zero-filled. This includes moving out the writing of the sentinel, which would also have lead to a buffer overrun in the xen_copy_trap_info() case if all (kernel) IDT entries were populated. Convert the writing of the sentinel to clearing of the entire table entry rather than just the address field. (I didn't bother trying to identify the commit which uncovered the issue in 5.14; the commit named below is the one which actually introduced the bad code.) Fixes: f87e4cac4f4e ("xen: SMP guest support") Cc: stable@vger.kernel.org Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/7a266932-092e-b68f-f2bb-1473b61adc6e@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-20xen/pci-swiotlb: reduce visibility of symbolsJan Beulich1-2/+2
xen_swiotlb and pci_xen_swiotlb_init() are only used within the file defining them, so make them static and remove the stubs. Otoh pci_xen_swiotlb_detect() has a use (as function pointer) from the main pci-swiotlb.c file - convert its stub to a #define to NULL. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/aef5fc33-9c02-4df0-906a-5c813142e13c@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-20xen/x86: drop redundant zeroing from cpu_initialize_context()Jan Beulich1-4/+0
Just after having obtained the pointer from kzalloc() there's no reason at all to set part of the area to all zero yet another time. Similarly there's no point explicitly clearing "ldt_ents". Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrvsky@oracle.com> Link: https://lore.kernel.org/r/14881835-a48e-29fa-0870-e177b10fcf65@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-15xen: fix usage of pmd_populate in mremap for pv guestsJuergen Gross1-2/+5
Commit 0881ace292b662 ("mm/mremap: use pmd/pud_poplulate to update page table entries") introduced a regression when running as Xen PV guest. Today pmd_populate() for Xen PV assumes that the PFN inserted is referencing a not yet used page table. In case of move_normal_pmd() this is not true, resulting in WARN splats like: [34321.304270] ------------[ cut here ]------------ [34321.304277] WARNING: CPU: 0 PID: 23628 at arch/x86/xen/multicalls.c:102 xen_mc_flush+0x176/0x1a0 [34321.304288] Modules linked in: [34321.304291] CPU: 0 PID: 23628 Comm: apt-get Not tainted 5.14.1-20210906-doflr-mac80211debug+ #1 [34321.304294] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010 [34321.304296] RIP: e030:xen_mc_flush+0x176/0x1a0 [34321.304300] Code: 89 45 18 48 c1 e9 3f 48 89 ce e9 20 ff ff ff e8 60 03 00 00 66 90 5b 5d 41 5c 41 5d c3 48 c7 45 18 ea ff ff ff be 01 00 00 00 <0f> 0b 8b 55 00 48 c7 c7 10 97 aa 82 31 db 49 c7 c5 38 97 aa 82 65 [34321.304303] RSP: e02b:ffffc90000a97c90 EFLAGS: 00010002 [34321.304305] RAX: ffff88807d416398 RBX: ffff88807d416350 RCX: ffff88807d416398 [34321.304306] RDX: 0000000000000001 RSI: 0000000000000001 RDI: deadbeefdeadf00d [34321.304308] RBP: ffff88807d416300 R08: aaaaaaaaaaaaaaaa R09: ffff888006160cc0 [34321.304309] R10: deadbeefdeadf00d R11: ffffea000026a600 R12: 0000000000000000 [34321.304310] R13: ffff888012f6b000 R14: 0000000012f6b000 R15: 0000000000000001 [34321.304320] FS: 00007f5071177800(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000 [34321.304322] CS: 10000e030 DS: 0000 ES: 0000 CR0: 0000000080050033 [34321.304323] CR2: 00007f506f542000 CR3: 00000000160cc000 CR4: 0000000000000660 [34321.304326] Call Trace: [34321.304331] xen_alloc_pte+0x294/0x320 [34321.304334] move_pgt_entry+0x165/0x4b0 [34321.304339] move_page_tables+0x6fa/0x8d0 [34321.304342] move_vma.isra.44+0x138/0x500 [34321.304345] __x64_sys_mremap+0x296/0x410 [34321.304348] do_syscall_64+0x3a/0x80 [34321.304352] entry_SYSCALL_64_after_hwframe+0x44/0xae [34321.304355] RIP: 0033:0x7f507196301a [34321.304358] Code: 73 01 c3 48 8b 0d 76 0e 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 49 89 ca b8 19 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 46 0e 0c 00 f7 d8 64 89 01 48 [34321.304360] RSP: 002b:00007ffda1eecd38 EFLAGS: 00000246 ORIG_RAX: 0000000000000019 [34321.304362] RAX: ffffffffffffffda RBX: 000056205f950f30 RCX: 00007f507196301a [34321.304363] RDX: 0000000001a00000 RSI: 0000000001900000 RDI: 00007f506dc56000 [34321.304364] RBP: 0000000001a00000 R08: 0000000000000010 R09: 0000000000000004 [34321.304365] R10: 0000000000000001 R11: 0000000000000246 R12: 00007f506dc56060 [34321.304367] R13: 00007f506dc56000 R14: 00007f506dc56060 R15: 000056205f950f30 [34321.304368] ---[ end trace a19885b78fe8f33e ]--- [34321.304370] 1 of 2 multicall(s) failed: cpu 0 [34321.304371] call 2: op=12297829382473034410 arg=[aaaaaaaaaaaaaaaa] result=-22 Fix that by modifying xen_alloc_ptpage() to only pin the page table in case it wasn't pinned already. Fixes: 0881ace292b662 ("mm/mremap: use pmd/pud_poplulate to update page table entries") Cc: <stable@vger.kernel.org> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20210908073640.11299-1-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-15xen: reset legacy rtc flag for PV domUJuergen Gross1-0/+7
A Xen PV guest doesn't have a legacy RTC device, so reset the legacy RTC flag. Otherwise the following WARN splat will occur at boot: [ 1.333404] WARNING: CPU: 1 PID: 1 at /home/gross/linux/head/drivers/rtc/rtc-mc146818-lib.c:25 mc146818_get_time+0x1be/0x210 [ 1.333404] Modules linked in: [ 1.333404] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G W 5.14.0-rc7-default+ #282 [ 1.333404] RIP: e030:mc146818_get_time+0x1be/0x210 [ 1.333404] Code: c0 64 01 c5 83 fd 45 89 6b 14 7f 06 83 c5 64 89 6b 14 41 83 ec 01 b8 02 00 00 00 44 89 63 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <0f> 0b 48 c7 c7 30 0e ef 82 4c 89 e6 e8 71 2a 24 00 48 c7 c0 ff ff [ 1.333404] RSP: e02b:ffffc90040093df8 EFLAGS: 00010002 [ 1.333404] RAX: 00000000000000ff RBX: ffffc90040093e34 RCX: 0000000000000000 [ 1.333404] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000000000000000d [ 1.333404] RBP: ffffffff82ef0e30 R08: ffff888005013e60 R09: 0000000000000000 [ 1.333404] R10: ffffffff82373e9b R11: 0000000000033080 R12: 0000000000000200 [ 1.333404] R13: 0000000000000000 R14: 0000000000000002 R15: ffffffff82cdc6d4 [ 1.333404] FS: 0000000000000000(0000) GS:ffff88807d440000(0000) knlGS:0000000000000000 [ 1.333404] CS: 10000e030 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.333404] CR2: 0000000000000000 CR3: 000000000260a000 CR4: 0000000000050660 [ 1.333404] Call Trace: [ 1.333404] ? wakeup_sources_sysfs_init+0x30/0x30 [ 1.333404] ? rdinit_setup+0x2b/0x2b [ 1.333404] early_resume_init+0x23/0xa4 [ 1.333404] ? cn_proc_init+0x36/0x36 [ 1.333404] do_one_initcall+0x3e/0x200 [ 1.333404] kernel_init_freeable+0x232/0x28e [ 1.333404] ? rest_init+0xd0/0xd0 [ 1.333404] kernel_init+0x16/0x120 [ 1.333404] ret_from_fork+0x1f/0x30 Cc: <stable@vger.kernel.org> Fixes: 8d152e7a5c7537 ("x86/rtc: Replace paravirt rtc check with platform legacy quirk") Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210903084937.19392-3-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-01xen: remove stray preempt_disable() from PV AP startup codeJuergen Gross1-1/+0
In cpu_bringup() there is a call of preempt_disable() without a paired preempt_enable(). This is not needed as interrupts are off initially. Additionally this will result in early boot messages like: BUG: scheduling while atomic: swapper/1/0/0x00000002 Signed-off-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20210825113158.11716-1-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-08-30x86: xen: platform-pci-unplug: use pr_err() and pr_warn() instead of raw ↵zhaoxiao1-7/+9
printk() Since we have the nice helpers pr_err() and pr_warn(), use them instead of raw printk(). [jgross@suse.com] Move the "#define pr_fmt" above the #includes in order to avoid build warnings due to redefinition. Signed-off-by: zhaoxiao <zhaoxiao@uniontech.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20210825114111.29009-1-zhaoxiao@uniontech.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-08-30xen: assume XENFEAT_mmu_pt_update_preserve_ad being set for pv guestsJuergen Gross2-12/+4
XENFEAT_mmu_pt_update_preserve_ad is always set in Xen 4.0 and newer. Remove coding assuming it might be zero. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210730071804.4302-3-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-08-30xen: fix setting of max_pfn in shared_infoJuergen Gross1-2/+2
Xen PV guests are specifying the highest used PFN via the max_pfn field in shared_info. This value is used by the Xen tools when saving or migrating the guest. Unfortunately this field is misnamed, as in reality it is specifying the number of pages (including any memory holes) of the guest, so it is the highest used PFN + 1. Renaming isn't possible, as this is a public Xen hypervisor interface which needs to be kept stable. The kernel will set the value correctly initially at boot time, but when adding more pages (e.g. due to memory hotplug or ballooning) a real PFN number is stored in max_pfn. This is done when expanding the p2m array, and the PFN stored there is even possibly wrong, as it should be the last possible PFN of the just added P2M frame, and not one which led to the P2M expansion. Fix that by setting shared_info->max_pfn to the last possible PFN + 1. Fixes: 98dd166ea3a3c3 ("x86/xen/p2m: hint at the last populated P2M entry") Cc: stable@vger.kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Link: https://lore.kernel.org/r/20210730092622.9973-2-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-07-01kernel.h: split out panic and oops helpersAndy Shevchenko1-0/+1
kernel.h is being used as a dump for all kinds of stuff for a long time. Here is the attempt to start cleaning it up by splitting out panic and oops helpers. There are several purposes of doing this: - dropping dependency in bug.h - dropping a loop by moving out panic_notifier.h - unload kernel.h from something which has its own domain At the same time convert users tree-wide to use new headers, although for the time being include new header back to kernel.h to avoid twisted indirected includes for existing users. [akpm@linux-foundation.org: thread_info.h needs limits.h] [andriy.shevchenko@linux.intel.com: ia64 fix] Link: https://lkml.kernel.org/r/20210520130557.55277-1-andriy.shevchenko@linux.intel.com Link: https://lkml.kernel.org/r/20210511074137.33666-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Co-developed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Corey Minyard <cminyard@mvista.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Sebastian Reichel <sre@kernel.org> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Stephen Boyd <sboyd@kernel.org> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Acked-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-22x86/xen: Fix noinstr fail in exc_xen_unknown_trap()Peter Zijlstra1-0/+2
Fix: vmlinux.o: warning: objtool: exc_xen_unknown_trap()+0x7: call to printk() leaves .noinstr.text section Fixes: 2e92493637a0 ("x86/xen: avoid warning in Xen pv guest with CONFIG_AMD_MEM_ENCRYPT enabled") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210621120120.606560778@infradead.org
2021-05-21x86/Xen: swap NX determination and GDT setup on BSPJan Beulich1-4/+4
xen_setup_gdt(), via xen_load_gdt_boot(), wants to adjust page tables. For this to work when NX is not available, x86_configure_nx() needs to be called first. [jgross] Note that this is a revert of 36104cb9012a82e73 ("x86/xen: Delay get_cpu_cap until stack canary is established"), which is possible now that we no longer support running as PV guest in 32-bit mode. Cc: <stable.vger.kernel.org> # 5.9 Fixes: 36104cb9012a82e73 ("x86/xen: Delay get_cpu_cap until stack canary is established") Reported-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/12a866b0-9e89-59f7-ebeb-a2a6cec0987a@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-05-04Merge branch 'stable/for-linus-5.13' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb Pull swiotlb updates from Konrad Rzeszutek Wilk: "Christoph Hellwig has taken a cleaver and trimmed off the not-needed code and nicely folded duplicate code in the generic framework. This lays the groundwork for more work to add extra DMA-backend-ish in the future. Along with that some bug-fixes to make this a nice working package" * 'stable/for-linus-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: swiotlb: don't override user specified size in swiotlb_adjust_size swiotlb: Fix the type of index swiotlb: Make SWIOTLB_NO_FORCE perform no allocation ARM: Qualify enabling of swiotlb_init() swiotlb: remove swiotlb_nr_tbl swiotlb: dynamically allocate io_tlb_default_mem swiotlb: move global variables into a new io_tlb_mem structure xen-swiotlb: remove the unused size argument from xen_swiotlb_fixup xen-swiotlb: split xen_swiotlb_init swiotlb: lift the double initialization protection from xen-swiotlb xen-swiotlb: remove xen_io_tlb_start and xen_io_tlb_nslabs xen-swiotlb: remove xen_set_nslabs xen-swiotlb: use io_tlb_end in xen_swiotlb_dma_supported xen-swiotlb: use is_swiotlb_buffer in is_xen_swiotlb_buffer swiotlb: split swiotlb_tbl_sync_single swiotlb: move orig addr and size validation into swiotlb_bounce swiotlb: remove the alloc_size parameter to swiotlb_tbl_unmap_single powerpc/svm: stop using io_tlb_start
2021-04-29Merge tag 'x86-mm-2021-04-29' of ↵Linus Torvalds1-6/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 tlb updates from Ingo Molnar: "The x86 MM changes in this cycle were: - Implement concurrent TLB flushes, which overlaps the local TLB flush with the remote TLB flush. In testing this improved sysbench performance measurably by a couple of percentage points, especially if TLB-heavy security mitigations are active. - Further micro-optimizations to improve the performance of TLB flushes" * tag 'x86-mm-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: smp: Micro-optimize smp_call_function_many_cond() smp: Inline on_each_cpu_cond() and on_each_cpu() x86/mm/tlb: Remove unnecessary uses of the inline keyword cpumask: Mark functions as pure x86/mm/tlb: Do not make is_lazy dirty for no reason x86/mm/tlb: Privatize cpu_tlbstate x86/mm/tlb: Flush remote and local TLBs concurrently x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy() x86/mm/tlb: Unify flush_tlb_func_local() and flush_tlb_func_remote() smp: Run functions concurrently in smp_call_function_many_cond()
2021-04-28Merge tag 'x86_core_for_v5.13' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 updates from Borislav Petkov: - Turn the stack canary into a normal __percpu variable on 32-bit which gets rid of the LAZY_GS stuff and a lot of code. - Add an insn_decode() API which all users of the instruction decoder should preferrably use. Its goal is to keep the details of the instruction decoder away from its users and simplify and streamline how one decodes insns in the kernel. Convert its users to it. - kprobes improvements and fixes - Set the maximum DIE per package variable on Hygon - Rip out the dynamic NOP selection and simplify all the machinery around selecting NOPs. Use the simplified NOPs in objtool now too. - Add Xeon Sapphire Rapids to list of CPUs that support PPIN - Simplify the retpolines by folding the entire thing into an alternative now that objtool can handle alternatives with stack ops. Then, have objtool rewrite the call to the retpoline with the alternative which then will get patched at boot time. - Document Intel uarch per models in intel-family.h - Make Sub-NUMA Clustering topology the default and Cluster-on-Die the exception on Intel. * tag 'x86_core_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits) x86, sched: Treat Intel SNC topology as default, COD as exception x86/cpu: Comment Skylake server stepping too x86/cpu: Resort and comment Intel models objtool/x86: Rewrite retpoline thunk calls objtool: Skip magical retpoline .altinstr_replacement objtool: Cache instruction relocs objtool: Keep track of retpoline call sites objtool: Add elf_create_undef_symbol() objtool: Extract elf_symbol_add() objtool: Extract elf_strtab_concat() objtool: Create reloc sections implicitly objtool: Add elf_create_reloc() helper objtool: Rework the elf_rebuild_reloc_section() logic objtool: Fix static_call list generation objtool: Handle per arch retpoline naming objtool: Correctly handle retpoline thunk calls x86/retpoline: Simplify retpolines x86/alternatives: Optimize optimize_nops() x86: Add insn_decode_kernel() x86/kprobes: Move 'inline' to the beginning of the kprobe_is_ss() declaration ...
2021-04-26Merge tag 'x86_cleanups_for_v5.13' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 cleanups from Borislav Petkov: "Trivial cleanups and fixes all over the place" * tag 'x86_cleanups_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Remove me from IDE/ATAPI section x86/pat: Do not compile stubbed functions when X86_PAT is off x86/asm: Ensure asm/proto.h can be included stand-alone x86/platform/intel/quark: Fix incorrect kernel-doc comment syntax in files x86/msr: Make locally used functions static x86/cacheinfo: Remove unneeded dead-store initialization x86/process/64: Move cpu_current_top_of_stack out of TSS tools/turbostat: Unmark non-kernel-doc comment x86/syscalls: Fix -Wmissing-prototypes warnings from COND_SYSCALL() x86/fpu/math-emu: Fix function cast warning x86/msr: Fix wr/rdmsr_safe_regs_on_cpu() prototypes x86: Fix various typos in comments, take #2 x86: Remove unusual Unicode characters from comments x86/kaslr: Return boolean values from a function returning bool x86: Fix various typos in comments x86/setup: Remove unused RESERVE_BRK_ARRAY() stacktrace: Move documentation for arch_stack_walk_reliable() to header x86: Remove duplicate TSC DEADLINE MSR definitions
2021-04-26Merge tag 'x86_alternatives_for_v5.13' of ↵Linus Torvalds2-16/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 alternatives/paravirt updates from Borislav Petkov: "First big cleanup to the paravirt infra to use alternatives and thus eliminate custom code patching. For that, the alternatives infrastructure is extended to accomodate paravirt's needs and, as a result, a lot of paravirt patching code goes away, leading to a sizeable cleanup and simplification. Work by Juergen Gross" * tag 'x86_alternatives_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/paravirt: Have only one paravirt patch function x86/paravirt: Switch functions with custom code to ALTERNATIVE x86/paravirt: Add new PVOP_ALT* macros to support pvops in ALTERNATIVEs x86/paravirt: Switch iret pvops to ALTERNATIVE x86/paravirt: Simplify paravirt macros x86/paravirt: Remove no longer needed 32-bit pvops cruft x86/paravirt: Add new features for paravirt patching x86/alternative: Use ALTERNATIVE_TERNARY() in _static_cpu_has() x86/alternative: Support ALTERNATIVE_TERNARY x86/alternative: Support not-feature x86/paravirt: Switch time pvops functions to use static_call() static_call: Add function to query current function static_call: Move struct static_call_key definition to static_call_types.h x86/alternative: Merge include files x86/alternative: Drop unused feature parameter from ALTINSTR_REPLACEMENT()
2021-04-02Merge tag 'v5.12-rc5' into WIP.x86/core, to pick up recent NOP related changesIngo Molnar2-7/+16
In particular we want to have this upstream commit: b90829704780: ("bpf: Use NOP_ATOMIC5 instead of emit_nops(&prog, 5) for BPF_TRAMP_F_CALL_ORIG") ... before merging in x86/cpu changes and the removal of the NOP optimizations, and applying PeterZ's !retpoline objtool series. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-03-31Merge 'x86/alternatives'Borislav Petkov2-16/+14
Pick up dependent changes. Signed-off-by: Borislav Petkov <bp@suse.de>
2021-03-26Merge tag 'for-linus-5.12b-rc5-tag' of ↵Linus Torvalds2-7/+16
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "This contains a small series with a more elegant fix of a problem which was originally fixed in rc2" * tag 'for-linus-5.12b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: Revert "xen: fix p2m size in dom0 for disabled memory hotplug case" xen/x86: make XEN_BALLOON_MEMORY_HOTPLUG_LIMIT depend on MEMORY_HOTPLUG
2021-03-25Revert "xen: fix p2m size in dom0 for disabled memory hotplug case"Roger Pau Monne2-5/+14
This partially reverts commit 882213990d32 ("xen: fix p2m size in dom0 for disabled memory hotplug case") There's no need to special case XEN_UNPOPULATED_ALLOC anymore in order to correctly size the p2m. The generic memory hotplug option has already been tied together with the Xen hotplug limit, so enabling memory hotplug should already trigger a properly sized p2m on Xen PV. Note that XEN_UNPOPULATED_ALLOC depends on ZONE_DEVICE which pulls in MEMORY_HOTPLUG. Leave the check added to __set_phys_to_machine and the adjusted comment about EXTRA_MEM_RATIO. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210324122424.58685-3-roger.pau@citrix.com [boris: fixed formatting issues] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-03-25xen/x86: make XEN_BALLOON_MEMORY_HOTPLUG_LIMIT depend on MEMORY_HOTPLUGRoger Pau Monne1-2/+2
The Xen memory hotplug limit should depend on the memory hotplug generic option, rather than the Xen balloon configuration. It's possible to have a kernel with generic memory hotplug enabled, but without Xen balloon enabled, at which point memory hotplug won't work correctly due to the size limitation of the p2m. Rename the option to XEN_MEMORY_HOTPLUG_LIMIT since it's no longer tied to ballooning. Fixes: 9e2369c06c8a18 ("xen: add helpers to allocate unpopulated memory") Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210324122424.58685-2-roger.pau@citrix.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-03-22x86: Fix various typos in comments, take #2Ingo Molnar1-1/+1
Fix another ~42 single-word typos in arch/x86/ code comments, missed a few in the first pass, in particular in .S files. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-kernel@vger.kernel.org
2021-03-17xen-swiotlb: split xen_swiotlb_initChristoph Hellwig1-2/+2
Split xen_swiotlb_init into a normal an an early case. That makes both much simpler and more readable, and also allows marking the early code as __init and x86-only. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2021-03-15Merge tag 'v5.12-rc3' into x86/coreBorislav Petkov1-4/+2
Pick up dependent SEV-ES urgent changes to base new work ontop. Signed-off-by: Borislav Petkov <bp@suse.de>
2021-03-12Merge tag 'for-linus-5.12b-rc3-tag' of ↵Linus Torvalds1-4/+2
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Two fix series and a single cleanup: - a small cleanup patch to remove unneeded symbol exports - a series to cleanup Xen grant handling (avoiding allocations in some cases, and using common defines for "invalid" values) - a series to address a race issue in Xen event channel handling" * tag 'for-linus-5.12b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: Xen/gntdev: don't needlessly use kvcalloc() Xen/gnttab: introduce common INVALID_GRANT_{HANDLE,REF} Xen/gntdev: don't needlessly allocate k{,un}map_ops[] Xen: drop exports of {set,clear}_foreign_p2m_mapping() xen/events: avoid handling the same event on two cpus at the same time xen/events: don't unmask an event channel when an eoi is pending xen/events: reset affinity of 2-level event when tearing it down
2021-03-11x86/paravirt: Have only one paravirt patch functionJuergen Gross1-1/+0
There is no need any longer to have different paravirt patch functions for native and Xen. Eliminate native_patch() and rename paravirt_patch_default() to paravirt_patch(). Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-15-jgross@suse.com
2021-03-11x86/paravirt: Switch iret pvops to ALTERNATIVEJuergen Gross1-2/+1
The iret paravirt op is rather special as it is using a jmp instead of a call instruction. Switch it to ALTERNATIVE. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-12-jgross@suse.com
2021-03-11x86/paravirt: Switch time pvops functions to use static_call()Juergen Gross1-13/+13
The time pvops functions are the only ones left which might be used in 32-bit mode and which return a 64-bit value. Switch them to use the static_call() mechanism instead of pvops, as this allows quite some simplification of the pvops implementation. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210311142319.4723-5-jgross@suse.com
2021-03-11Xen/gnttab: introduce common INVALID_GRANT_{HANDLE,REF}Jan Beulich1-2/+2
It's not helpful if every driver has to cook its own. Generalize xenbus'es INVALID_GRANT_HANDLE and pcifront's INVALID_GRANT_REF (which shouldn't have expanded to zero to begin with). Use the constants in p2m.c and gntdev.c right away, and update field types where necessary so they would match with the constants' types (albeit without touching struct ioctl_gntdev_grant_ref's ref field, as that's part of the public interface of the kernel and would require introducing a dependency on Xen's grant_table.h public header). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/db7c38a5-0d75-d5d1-19de-e5fe9f0b9c48@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-03-11Xen: drop exports of {set,clear}_foreign_p2m_mapping()Jan Beulich1-2/+0
They're only used internally, and the layering violation they contain (x86) or imply (Arm) of calling HYPERVISOR_grant_table_op() strongly advise against any (uncontrolled) use from a module. The functions also never had users except the ones from drivers/xen/grant-table.c forever since their introduction in 3.15. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Link: https://lore.kernel.org/r/746a5cd6-1446-eda4-8b23-03c1cac30b8d@suse.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-03-08x86/stackprotector/32: Make the canary into a regular percpu variableAndy Lutomirski1-1/+0
On 32-bit kernels, the stackprotector canary is quite nasty -- it is stored at %gs:(20), which is nasty because 32-bit kernels use %fs for percpu storage. It's even nastier because it means that whether %gs contains userspace state or kernel state while running kernel code depends on whether stackprotector is enabled (this is CONFIG_X86_32_LAZY_GS), and this setting radically changes the way that segment selectors work. Supporting both variants is a maintenance and testing mess. Merely rearranging so that percpu and the stack canary share the same segment would be messy as the 32-bit percpu address layout isn't currently compatible with putting a variable at a fixed offset. Fortunately, GCC 8.1 added options that allow the stack canary to be accessed as %fs:__stack_chk_guard, effectively turning it into an ordinary percpu variable. This lets us get rid of all of the code to manage the stack canary GDT descriptor and the CONFIG_X86_32_LAZY_GS mess. (That name is special. We could use any symbol we want for the %fs-relative mode, but for CONFIG_SMP=n, gcc refuses to let us use any name other than __stack_chk_guard.) Forcibly disable stackprotector on older compilers that don't support the new options and turn the stack canary into a percpu variable. The "lazy GS" approach is now used for all 32-bit configurations. Also makes load_gs_index() work on 32-bit kernels. On 64-bit kernels, it loads the GS selector and updates the user GSBASE accordingly. (This is unchanged.) On 32-bit kernels, it loads the GS selector and updates GSBASE, which is now always the user base. This means that the overall effect is the same on 32-bit and 64-bit, which avoids some ifdeffery. [ bp: Massage commit message. ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/c0ff7dba14041c7e5d1cae5d4df052f03759bef3.1613243844.git.luto@kernel.org
2021-03-06x86/mm/tlb: Flush remote and local TLBs concurrentlyNadav Amit1-6/+5
To improve TLB shootdown performance, flush the remote and local TLBs concurrently. Introduce flush_tlb_multi() that does so. Introduce paravirtual versions of flush_tlb_multi() for KVM, Xen and hyper-v (Xen and hyper-v are only compile-tested). While the updated smp infrastructure is capable of running a function on a single local core, it is not optimized for this case. The multiple function calls and the indirect branch introduce some overhead, and might make local TLB flushes slower than they were before the recent changes. Before calling the SMP infrastructure, check if only a local TLB flush is needed to restore the lost performance in this common case. This requires to check mm_cpumask() one more time, but unless this mask is updated very frequently, this should impact performance negatively. Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com> # Hyper-v parts Reviewed-by: Juergen Gross <jgross@suse.com> # Xen and paravirt parts Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20210220231712.2475218-5-namit@vmware.com
2021-03-04Merge tag 'for-linus-5.12b-rc2-tag' of ↵Linus Torvalds2-29/+50
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Two security issues (XSA-367 and XSA-369)" * tag 'for-linus-5.12b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: fix p2m size in dom0 for disabled memory hotplug case xen-netback: respect gnttab_map_refs()'s return value Xen/gnttab: handle p2m update errors on a per-slot basis
2021-03-03xen: fix p2m size in dom0 for disabled memory hotplug caseJuergen Gross2-26/+9
Since commit 9e2369c06c8a18 ("xen: add helpers to allocate unpopulated memory") foreign mappings are using guest physical addresses allocated via ZONE_DEVICE functionality. This will result in problems for the case of no balloon memory hotplug being configured, as the p2m list will only cover the initial memory size of the domain. Any ZONE_DEVICE allocated address will be outside the p2m range and thus a mapping can't be established with that memory address. Fix that by extending the p2m size for that case. At the same time add a check for a to be created mapping to be within the p2m limits in order to detect errors early. While changing a comment, remove some 32-bit leftovers. This is XSA-369. Fixes: 9e2369c06c8a18 ("xen: add helpers to allocate unpopulated memory") Cc: <stable@vger.kernel.org> # 5.9 Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2021-03-03Xen/gnttab: handle p2m update errors on a per-slot basisJan Beulich1-3/+41
Bailing immediately from set_foreign_p2m_mapping() upon a p2m updating error leaves the full batch in an ambiguous state as far as the caller is concerned. Instead flags respective slots as bad, unmapping what was mapped there right away. HYPERVISOR_grant_table_op()'s return value and the individual unmap slots' status fields get used only for a one-time - there's not much we can do in case of a failure. Note that there's no GNTST_enomem or alike, so GNTST_general_error gets used. The map ops' handle fields get overwritten just to be on the safe side. This is part of XSA-367. Cc: <stable@vger.kernel.org> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/96cccf5d-e756-5f53-b91a-ea269bfb9be0@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
2021-02-23Merge tag 'objtool-core-2021-02-23' of ↵Linus Torvalds3-13/+21
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Thomas Gleixner: - Make objtool work for big-endian cross compiles - Make stack tracking via stack pointer memory operations match push/pop semantics to prepare for architectures w/o PUSH/POP instructions. - Add support for analyzing alternatives - Improve retpoline detection and handling - Improve assembly code coverage on x86 - Provide support for inlined stack switching * tag 'objtool-core-2021-02-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) objtool: Support stack-swizzle objtool,x86: Additionally decode: mov %rsp, (%reg) x86/unwind/orc: Change REG_SP_INDIRECT x86/power: Support objtool validation in hibernate_asm_64.S x86/power: Move restore_registers() to top of the file x86/power: Annotate indirect branches as safe x86/acpi: Support objtool validation in wakeup_64.S x86/acpi: Annotate indirect branch as safe x86/ftrace: Support objtool vmlinux.o validation in ftrace_64.S x86/xen/pvh: Annotate indirect branch as safe x86/xen: Support objtool vmlinux.o validation in xen-head.S x86/xen: Support objtool validation in xen-asm.S objtool: Add xen_start_kernel() to noreturn list objtool: Combine UNWIND_HINT_RET_OFFSET and UNWIND_HINT_FUNC objtool: Add asm version of STACK_FRAME_NON_STANDARD objtool: Assume only ELF functions do sibling calls x86/ftrace: Add UNWIND_HINT_FUNC annotation for ftrace_stub objtool: Support retpoline jump detection for vmlinux.o objtool: Fix ".cold" section suffix check for newer versions of GCC objtool: Fix retpoline detection in asm code ...
2021-02-22Merge tag 'for-linus-5.12-rc1-tag' of ↵Linus Torvalds1-8/+7
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: "A series of Xen related security fixes, all related to limited error handling in Xen backend drivers" * tag 'for-linus-5.12-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen-blkback: fix error handling in xen_blkbk_map() xen-scsiback: don't "handle" error by BUG() xen-netback: don't "handle" error by BUG() xen-blkback: don't "handle" error by BUG() xen/arm: don't ignore return errors from set_phys_to_machine Xen/gntdev: correct error checking in gntdev_map_grant_pages() Xen/gntdev: correct dev_bus_addr handling in gntdev_map_grant_pages() Xen/x86: also check kernel mapping in set_foreign_p2m_mapping() Xen/x86: don't bail early from clear_foreign_p2m_mapping()
2021-02-15Xen/x86: also check kernel mapping in set_foreign_p2m_mapping()Jan Beulich1-1/+2
We should not set up further state if either mapping failed; paying attention to just the user mapping's status isn't enough. Also use GNTST_okay instead of implying its value (zero). This is part of XSA-361. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2021-02-15Xen/x86: don't bail early from clear_foreign_p2m_mapping()Jan Beulich1-7/+5
Its sibling (set_foreign_p2m_mapping()) as well as the sibling of its only caller (gnttab_map_refs()) don't clean up after themselves in case of error. Higher level callers are expected to do so. However, in order for that to really clean up any partially set up state, the operation should not terminate upon encountering an entry in unexpected state. It is particularly relevant to notice here that set_foreign_p2m_mapping() would skip setting up a p2m entry if its grant mapping failed, but it would continue to set up further p2m entries as long as their mappings succeeded. Arguably down the road set_foreign_p2m_mapping() may want its page state related WARN_ON() also converted to an error return. This is part of XSA-361. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2021-02-10x86/pv: Rework arch_local_irq_restore() to not use popfJuergen Gross4-54/+0
POPF is a rather expensive operation, so don't use it for restoring irq flags. Instead, test whether interrupts are enabled in the flags parameter and enable interrupts via STI in that case. This results in the restore_fl paravirt op to be no longer needed. Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210120135555.32594-7-jgross@suse.com
2021-02-10x86/xen: Drop USERGS_SYSRET64 paravirt callJuergen Gross3-23/+0
USERGS_SYSRET64 is used to return from a syscall via SYSRET, but a Xen PV guest will nevertheless use the IRET hypercall, as there is no sysret PV hypercall defined. So instead of testing all the prerequisites for doing a sysret and then mangling the stack for Xen PV again for doing an iret just use the iret exit from the beginning. This can easily be done via an ALTERNATIVE like it is done for the sysenter compat case already. It should be noted that this drops the optimization in Xen for not restoring a few registers when returning to user mode, but it seems as if the saved instructions in the kernel more than compensate for this drop (a kernel build in a Xen PV guest was slightly faster with this patch applied). While at it remove the stale sysret32 remnants. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210120135555.32594-6-jgross@suse.com
2021-02-10x86/pv: Switch SWAPGS to ALTERNATIVEJuergen Gross1-3/+0
SWAPGS is used only for interrupts coming from user mode or for returning to user mode. So there is no reason to use the PARAVIRT framework, as it can easily be replaced by an ALTERNATIVE depending on X86_FEATURE_XENPV. There are several instances using the PV-aware SWAPGS macro in paths which are never executed in a Xen PV guest. Replace those with the plain swapgs instruction. For SWAPGS_UNSAFE_STACK the same applies. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andy Lutomirski <luto@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210120135555.32594-5-jgross@suse.com