summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu/mshyperv.c
AgeCommit message (Collapse)AuthorFilesLines
2024-03-21Merge tag 'hyperv-next-signed-20240320' of ↵Linus Torvalds1-48/+45
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Use Hyper-V entropy to seed guest random number generator (Michael Kelley) - Convert to platform remove callback returning void for vmbus (Uwe Kleine-König) - Introduce hv_get_hypervisor_version function (Nuno Das Neves) - Rename some HV_REGISTER_* defines for consistency (Nuno Das Neves) - Change prefix of generic HV_REGISTER_* MSRs to HV_MSR_* (Nuno Das Neves) - Cosmetic changes for hv_spinlock.c (Purna Pavan Chandra Aekkaladevi) - Use per cpu initial stack for vtl context (Saurabh Sengar) * tag 'hyperv-next-signed-20240320' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Use Hyper-V entropy to seed guest random number generator x86/hyperv: Cosmetic changes for hv_spinlock.c hyperv-tlfs: Rename some HV_REGISTER_* defines for consistency hv: vmbus: Convert to platform remove callback returning void mshyperv: Introduce hv_get_hypervisor_version function x86/hyperv: Use per cpu initial stack for vtl context hyperv-tlfs: Change prefix of generic HV_REGISTER_* MSRs to HV_MSR_*
2024-03-19x86/hyperv: Use Hyper-V entropy to seed guest random number generatorMichael Kelley1-0/+1
A Hyper-V host provides its guest VMs with entropy in a custom ACPI table named "OEM0". The entropy bits are updated each time Hyper-V boots the VM, and are suitable for seeding the Linux guest random number generator (rng). See a brief description of OEM0 in [1]. Generation 2 VMs on Hyper-V use UEFI to boot. Existing EFI code in Linux seeds the rng with entropy bits from the EFI_RNG_PROTOCOL. Via this path, the rng is seeded very early during boot with good entropy. The ACPI OEM0 table provided in such VMs is an additional source of entropy. Generation 1 VMs on Hyper-V boot from BIOS. For these VMs, Linux doesn't currently get any entropy from the Hyper-V host. While this is not fundamentally broken because Linux can generate its own entropy, using the Hyper-V host provided entropy would get the rng off to a better start and would do so earlier in the boot process. Improve the rng seeding for Generation 1 VMs by having Hyper-V specific code in Linux take advantage of the OEM0 table to seed the rng. For Generation 2 VMs, use the OEM0 table to provide additional entropy beyond the EFI_RNG_PROTOCOL. Because the OEM0 table is custom to Hyper-V, parse it directly in the Hyper-V code in the Linux kernel and use add_bootloader_randomness() to add it to the rng. Once the entropy bits are read from OEM0, zero them out in the table so they don't appear in /sys/firmware/acpi/tables/OEM0 in the running VM. The zero'ing is done out of an abundance of caution to avoid potential security risks to the rng. Also set the OEM0 data length to zero so a kexec or other subsequent use of the table won't try to use the zero'ed bits. [1] https://download.microsoft.com/download/1/c/9/1c9813b8-089c-4fef-b2ad-ad80e79403ba/Whitepaper%20-%20The%20Windows%2010%20random%20number%20generation%20infrastructure.pdf Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Link: https://lore.kernel.org/r/20240318155408.216851-1-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240318155408.216851-1-mhklinux@outlook.com>
2024-03-18hyperv-tlfs: Rename some HV_REGISTER_* defines for consistencyNuno Das Neves1-1/+1
Rename HV_REGISTER_GUEST_OSID to HV_REGISTER_GUEST_OS_ID. This matches the existing HV_X64_MSR_GUEST_OS_ID. Rename HV_REGISTER_CRASH_* to HV_REGISTER_GUEST_CRASH_*. Including GUEST_ is consistent with other #defines such as HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE. The new names also match the TLFS document more accurately, i.e. HvRegisterGuestCrash*. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Link: https://lore.kernel.org/r/1710285687-9160-1-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1710285687-9160-1-git-send-email-nunodasneves@linux.microsoft.com>
2024-03-15Merge tag 'mm-stable-2024-03-13-20-04' of ↵Linus Torvalds1-2/+8
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Sumanth Korikkar has taught s390 to allocate hotplug-time page frames from hotplugged memory rather than only from main memory. Series "implement "memmap on memory" feature on s390". - More folio conversions from Matthew Wilcox in the series "Convert memcontrol charge moving to use folios" "mm: convert mm counter to take a folio" - Chengming Zhou has optimized zswap's rbtree locking, providing significant reductions in system time and modest but measurable reductions in overall runtimes. The series is "mm/zswap: optimize the scalability of zswap rb-tree". - Chengming Zhou has also provided the series "mm/zswap: optimize zswap lru list" which provides measurable runtime benefits in some swap-intensive situations. - And Chengming Zhou further optimizes zswap in the series "mm/zswap: optimize for dynamic zswap_pools". Measured improvements are modest. - zswap cleanups and simplifications from Yosry Ahmed in the series "mm: zswap: simplify zswap_swapoff()". - In the series "Add DAX ABI for memmap_on_memory", Vishal Verma has contributed several DAX cleanups as well as adding a sysfs tunable to control the memmap_on_memory setting when the dax device is hotplugged as system memory. - Johannes Weiner has added the large series "mm: zswap: cleanups", which does that. - More DAMON work from SeongJae Park in the series "mm/damon: make DAMON debugfs interface deprecation unignorable" "selftests/damon: add more tests for core functionalities and corner cases" "Docs/mm/damon: misc readability improvements" "mm/damon: let DAMOS feeds and tame/auto-tune itself" - In the series "mm/mempolicy: weighted interleave mempolicy and sysfs extension" Rakie Kim has developed a new mempolicy interleaving policy wherein we allocate memory across nodes in a weighted fashion rather than uniformly. This is beneficial in heterogeneous memory environments appearing with CXL. - Christophe Leroy has contributed some cleanup and consolidation work against the ARM pagetable dumping code in the series "mm: ptdump: Refactor CONFIG_DEBUG_WX and check_wx_pages debugfs attribute". - Luis Chamberlain has added some additional xarray selftesting in the series "test_xarray: advanced API multi-index tests". - Muhammad Usama Anjum has reworked the selftest code to make its human-readable output conform to the TAP ("Test Anything Protocol") format. Amongst other things, this opens up the use of third-party tools to parse and process out selftesting results. - Ryan Roberts has added fork()-time PTE batching of THP ptes in the series "mm/memory: optimize fork() with PTE-mapped THP". Mainly targeted at arm64, this significantly speeds up fork() when the process has a large number of pte-mapped folios. - David Hildenbrand also gets in on the THP pte batching game in his series "mm/memory: optimize unmap/zap with PTE-mapped THP". It implements batching during munmap() and other pte teardown situations. The microbenchmark improvements are nice. - And in the series "Transparent Contiguous PTEs for User Mappings" Ryan Roberts further utilizes arm's pte's contiguous bit ("contpte mappings"). Kernel build times on arm64 improved nicely. Ryan's series "Address some contpte nits" provides some followup work. - In the series "mm/hugetlb: Restore the reservation" Breno Leitao has fixed an obscure hugetlb race which was causing unnecessary page faults. He has also added a reproducer under the selftest code. - In the series "selftests/mm: Output cleanups for the compaction test", Mark Brown did what the title claims. - Kinsey Ho has added the series "mm/mglru: code cleanup and refactoring". - Even more zswap material from Nhat Pham. The series "fix and extend zswap kselftests" does as claimed. - In the series "Introduce cpu_dcache_is_aliasing() to fix DAX regression" Mathieu Desnoyers has cleaned up and fixed rather a mess in our handling of DAX on archiecctures which have virtually aliasing data caches. The arm architecture is the main beneficiary. - Lokesh Gidra's series "per-vma locks in userfaultfd" provides dramatic improvements in worst-case mmap_lock hold times during certain userfaultfd operations. - Some page_owner enhancements and maintenance work from Oscar Salvador in his series "page_owner: print stacks and their outstanding allocations" "page_owner: Fixup and cleanup" - Uladzislau Rezki has contributed some vmalloc scalability improvements in his series "Mitigate a vmap lock contention". It realizes a 12x improvement for a certain microbenchmark. - Some kexec/crash cleanup work from Baoquan He in the series "Split crash out from kexec and clean up related config items". - Some zsmalloc maintenance work from Chengming Zhou in the series "mm/zsmalloc: fix and optimize objects/page migration" "mm/zsmalloc: some cleanup for get/set_zspage_mapping()" - Zi Yan has taught the MM to perform compaction on folios larger than order=0. This a step along the path to implementaton of the merging of large anonymous folios. The series is named "Enable >0 order folio memory compaction". - Christoph Hellwig has done quite a lot of cleanup work in the pagecache writeback code in his series "convert write_cache_pages() to an iterator". - Some modest hugetlb cleanups and speedups in Vishal Moola's series "Handle hugetlb faults under the VMA lock". - Zi Yan has changed the page splitting code so we can split huge pages into sizes other than order-0 to better utilize large folios. The series is named "Split a folio to any lower order folios". - David Hildenbrand has contributed the series "mm: remove total_mapcount()", a cleanup. - Matthew Wilcox has sought to improve the performance of bulk memory freeing in his series "Rearrange batched folio freeing". - Gang Li's series "hugetlb: parallelize hugetlb page init on boot" provides large improvements in bootup times on large machines which are configured to use large numbers of hugetlb pages. - Matthew Wilcox's series "PageFlags cleanups" does that. - Qi Zheng's series "minor fixes and supplement for ptdesc" does that also. S390 is affected. - Cleanups to our pagemap utility functions from Peter Xu in his series "mm/treewide: Replace pXd_large() with pXd_leaf()". - Nico Pache has fixed a few things with our hugepage selftests in his series "selftests/mm: Improve Hugepage Test Handling in MM Selftests". - Also, of course, many singleton patches to many things. Please see the individual changelogs for details. * tag 'mm-stable-2024-03-13-20-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (435 commits) mm/zswap: remove the memcpy if acomp is not sleepable crypto: introduce: acomp_is_async to expose if comp drivers might sleep memtest: use {READ,WRITE}_ONCE in memory scanning mm: prohibit the last subpage from reusing the entire large folio mm: recover pud_leaf() definitions in nopmd case selftests/mm: skip the hugetlb-madvise tests on unmet hugepage requirements selftests/mm: skip uffd hugetlb tests with insufficient hugepages selftests/mm: dont fail testsuite due to a lack of hugepages mm/huge_memory: skip invalid debugfs new_order input for folio split mm/huge_memory: check new folio order when split a folio mm, vmscan: retry kswapd's priority loop with cache_trim_mode off on failure mm: add an explicit smp_wmb() to UFFDIO_CONTINUE mm: fix list corruption in put_pages_list mm: remove folio from deferred split list before uncharging it filemap: avoid unnecessary major faults in filemap_fault() mm,page_owner: drop unnecessary check mm,page_owner: check for null stack_record before bumping its refcount mm: swap: fix race between free_swap_and_cache() and swapoff() mm/treewide: align up pXd_leaf() retval across archs mm/treewide: drop pXd_large() ...
2024-03-12mshyperv: Introduce hv_get_hypervisor_version functionNuno Das Neves1-19/+15
Introduce x86_64 and arm64 functions to get the hypervisor version information and store it in a structure for simpler parsing. Use the new function to get and parse the version at boot time. While at it, move the printing code to hv_common_init() so it is not duplicated. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Acked-by: Wei Liu <wei.liu@kernel.org> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1709852618-29110-1-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1709852618-29110-1-git-send-email-nunodasneves@linux.microsoft.com>
2024-03-04hyperv-tlfs: Change prefix of generic HV_REGISTER_* MSRs to HV_MSR_*Nuno Das Neves1-28/+28
The HV_REGISTER_ are used as arguments to hv_set/get_register(), which delegate to arch-specific mechanisms for getting/setting synthetic Hyper-V MSRs. On arm64, HV_REGISTER_ defines are synthetic VP registers accessed via the get/set vp registers hypercalls. The naming matches the TLFS document, although these register names are not specific to arm64. However, on x86 the prefix HV_REGISTER_ indicates Hyper-V MSRs accessed via rdmsrl()/wrmsrl(). This is not consistent with the TLFS doc, where HV_REGISTER_ is *only* used for used for VP register names used by the get/set register hypercalls. To fix this inconsistency and prevent future confusion, change the arch-generic aliases used by callers of hv_set/get_register() to have the prefix HV_MSR_ instead of HV_REGISTER_. Use the prefix HV_X64_MSR_ for the x86-only Hyper-V MSRs. On x86, the generic HV_MSR_'s point to the corresponding HV_X64_MSR_. Move the arm64 HV_REGISTER_* defines to the asm-generic hyperv-tlfs.h, since these are not specific to arm64. On arm64, the generic HV_MSR_'s point to the corresponding HV_REGISTER_. While at it, rename hv_get/set_registers() and related functions to hv_get/set_msr(), hv_get/set_nested_msr(), etc. These are only used for Hyper-V MSRs and this naming makes that clear. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Wei Liu <wei.liu@kernel.org> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1708440933-27125-1-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1708440933-27125-1-git-send-email-nunodasneves@linux.microsoft.com>
2024-02-24x86, crash: wrap crash dumping code into crash related ifdefsBaoquan He1-2/+8
Now crash codes under kernel/ folder has been split out from kexec code, crash dumping can be separated from kexec reboot in config items on x86 with some adjustments. Here, also change some ifdefs or IS_ENABLED() check to more appropriate ones, e,g - #ifdef CONFIG_KEXEC_CORE -> #ifdef CONFIG_CRASH_DUMP - (!IS_ENABLED(CONFIG_KEXEC_CORE)) - > (!IS_ENABLED(CONFIG_CRASH_RESERVE)) [bhe@redhat.com: don't nest CONFIG_CRASH_DUMP ifdef inside CONFIG_KEXEC_CODE ifdef scope] Link: https://lore.kernel.org/all/SN6PR02MB4157931105FA68D72E3D3DB8D47B2@SN6PR02MB4157.namprd02.prod.outlook.com/T/#u Link: https://lkml.kernel.org/r/20240124051254.67105-7-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Pingfan Liu <piliu@redhat.com> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Michael Kelley <mhklinux@outlook.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-01x86/traps: Add sysvec_install() to install a system interrupt handlerXin Li1-8/+7
Add sysvec_install() to install a system interrupt handler into the IDT or the FRED system interrupt handler table. Signed-off-by: Xin Li <xin3.li@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Shan Kang <shan.kang@intel.com> Link: https://lore.kernel.org/r/20231205105030.8698-28-xin3.li@intel.com
2023-11-22x86/hyperv: Use atomic_try_cmpxchg() to micro-optimize hv_nmi_unknown()Uros Bizjak1-1/+4
Use atomic_try_cmpxchg() instead of atomic_cmpxchg(*ptr, old, new) == old in hv_nmi_unknown(). On x86 the CMPXCHG instruction returns success in the ZF flag, so this change saves a compare after CMPXCHG. The generated asm code improves from: 3e: 65 8b 15 00 00 00 00 mov %gs:0x0(%rip),%edx 45: b8 ff ff ff ff mov $0xffffffff,%eax 4a: f0 0f b1 15 00 00 00 lock cmpxchg %edx,0x0(%rip) 51: 00 52: 83 f8 ff cmp $0xffffffff,%eax 55: 0f 95 c0 setne %al to: 3e: 65 8b 15 00 00 00 00 mov %gs:0x0(%rip),%edx 45: b8 ff ff ff ff mov $0xffffffff,%eax 4a: f0 0f b1 15 00 00 00 lock cmpxchg %edx,0x0(%rip) 51: 00 52: 0f 95 c0 setne %al No functional change intended. Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20231114170038.381634-1-ubizjak@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20231114170038.381634-1-ubizjak@gmail.com>
2023-09-04Merge tag 'hyperv-next-signed-20230902' of ↵Linus Torvalds1-7/+84
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Support for SEV-SNP guests on Hyper-V (Tianyu Lan) - Support for TDX guests on Hyper-V (Dexuan Cui) - Use SBRM API in Hyper-V balloon driver (Mitchell Levy) - Avoid dereferencing ACPI root object handle in VMBus driver (Maciej Szmigiero) - A few misecllaneous fixes (Jiapeng Chong, Nathan Chancellor, Saurabh Sengar) * tag 'hyperv-next-signed-20230902' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (24 commits) x86/hyperv: Remove duplicate include x86/hyperv: Move the code in ivm.c around to avoid unnecessary ifdef's x86/hyperv: Remove hv_isolation_type_en_snp x86/hyperv: Use TDX GHCI to access some MSRs in a TDX VM with the paravisor Drivers: hv: vmbus: Bring the post_msg_page back for TDX VMs with the paravisor x86/hyperv: Introduce a global variable hyperv_paravisor_present Drivers: hv: vmbus: Support >64 VPs for a fully enlightened TDX/SNP VM x86/hyperv: Fix serial console interrupts for fully enlightened TDX guests Drivers: hv: vmbus: Support fully enlightened TDX guests x86/hyperv: Support hypercalls for fully enlightened TDX guests x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests x86/hyperv: Fix undefined reference to isolation_type_en_snp without CONFIG_HYPERV x86/hyperv: Add missing 'inline' to hv_snp_boot_ap() stub hv: hyperv.h: Replace one-element array with flexible-array member Drivers: hv: vmbus: Don't dereference ACPI root object handle x86/hyperv: Add hyperv-specific handling for VMMCALL under SEV-ES x86/hyperv: Add smp support for SEV-SNP guest clocksource: hyper-v: Mark hyperv tsc page unencrypted in sev-snp enlightened guest x86/hyperv: Use vmmcall to implement Hyper-V hypercall in sev-snp enlightened guest drivers: hv: Mark percpu hvcall input arg page unencrypted in SEV-SNP enlightened guest ...
2023-08-25x86/hyperv: Remove hv_isolation_type_en_snpDexuan Cui1-6/+4
In ms_hyperv_init_platform(), do not distinguish between a SNP VM with the paravisor and a SNP VM without the paravisor. Replace hv_isolation_type_en_snp() with !ms_hyperv.paravisor_present && hv_isolation_type_snp(). The hv_isolation_type_en_snp() in drivers/hv/hv.c and drivers/hv/hv_common.c can be changed to hv_isolation_type_snp() since we know !ms_hyperv.paravisor_present is true there. Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080712.30327-10-decui@microsoft.com
2023-08-25x86/hyperv: Use TDX GHCI to access some MSRs in a TDX VM with the paravisorDexuan Cui1-4/+4
When the paravisor is present, a SNP VM must use GHCB to access some special MSRs, including HV_X64_MSR_GUEST_OS_ID and some SynIC MSRs. Similarly, when the paravisor is present, a TDX VM must use TDX GHCI to access the same MSRs. Implement hv_tdx_msr_write() and hv_tdx_msr_read(), and use the helper functions hv_ivm_msr_read() and hv_ivm_msr_write() to access the MSRs in a unified way for SNP/TDX VMs with the paravisor. Do not export hv_tdx_msr_write() and hv_tdx_msr_read(), because we never really used hv_ghcb_msr_write() and hv_ghcb_msr_read() in any module. Update arch/x86/include/asm/mshyperv.h so that the kernel can still build if CONFIG_AMD_MEM_ENCRYPT or CONFIG_INTEL_TDX_GUEST is not set, or neither is set. Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080712.30327-9-decui@microsoft.com
2023-08-25x86/hyperv: Introduce a global variable hyperv_paravisor_presentDexuan Cui1-2/+7
The new variable hyperv_paravisor_present is set only when the VM is a SNP/TDX VM with the paravisor running: see ms_hyperv_init_platform(). We introduce hyperv_paravisor_present because we can not use ms_hyperv.paravisor_present in arch/x86/include/asm/mshyperv.h: struct ms_hyperv_info is defined in include/asm-generic/mshyperv.h, which is included at the end of arch/x86/include/asm/mshyperv.h, but at the beginning of arch/x86/include/asm/mshyperv.h, we would already need to use struct ms_hyperv_info in hv_do_hypercall(). We use hyperv_paravisor_present only in include/asm-generic/mshyperv.h, and use ms_hyperv.paravisor_present elsewhere. In the future, we'll introduce a hypercall function structure for different VM types, and at boot time, the right function pointers would be written into the structure so that runtime testing of TDX vs. SNP vs. normal will be avoided and hyperv_paravisor_present will no longer be needed. Call hv_vtom_init() when it's a VBS VM or when ms_hyperv.paravisor_present is true, i.e. the VM is a SNP VM or TDX VM with the paravisor. Enhance hv_vtom_init() for a TDX VM with the paravisor. In hv_common_cpu_init(), don't decrypt the hyperv_pcpu_input_arg for a TDX VM with the paravisor, just like we don't decrypt the page for a SNP VM with the paravisor. Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080712.30327-7-decui@microsoft.com
2023-08-25x86/hyperv: Fix serial console interrupts for fully enlightened TDX guestsDexuan Cui1-0/+22
When a fully enlightened TDX guest runs on Hyper-V, the UEFI firmware sets the HW_REDUCED flag and consequently ttyS0 interrupts can't work. Fix the issue by overriding x86_init.acpi.reduced_hw_early_init(). Reviewed-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080712.30327-5-decui@microsoft.com
2023-08-25Drivers: hv: vmbus: Support fully enlightened TDX guestsDexuan Cui1-0/+14
Add Hyper-V specific code so that a fully enlightened TDX guest (i.e. without the paravisor) can run on Hyper-V: Don't use hv_vp_assist_page. Use GHCI instead. Don't try to use the unsupported HV_REGISTER_CRASH_CTL. Don't trust (use) Hyper-V's TLB-flushing hypercalls. Don't use lazy EOI. Share the SynIC Event/Message pages with the hypervisor. Don't use the Hyper-V TSC page for now, because non-trivial work is required to share the page with the hypervisor. Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080712.30327-4-decui@microsoft.com
2023-08-25x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guestsDexuan Cui1-0/+2
No logic change to SNP/VBS guests. hv_isolation_type_tdx() will be used to instruct a TDX guest on Hyper-V to do some TDX-specific operations, e.g. for a fully enlightened TDX guest (i.e. without the paravisor), hv_do_hypercall() should use __tdx_hypercall() and such a guest on Hyper-V should handle the Hyper-V Event/Message/Monitor pages specially. Reviewed-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230824080712.30327-2-decui@microsoft.com
2023-08-23x86/hyperv: Fix undefined reference to isolation_type_en_snp without ↵Dexuan Cui1-4/+5
CONFIG_HYPERV When CONFIG_HYPERV is not set, arch/x86/hyperv/ivm.c is not built (see arch/x86/Kbuild), so 'isolation_type_en_snp' in the ivm.c is not defined, and this failure happens: ld: arch/x86/kernel/cpu/mshyperv.o: in function `ms_hyperv_init_platform': arch/x86/kernel/cpu/mshyperv.c:417: undefined reference to `isolation_type_en_snp' Fix the failure by testing hv_get_isolation_type() and ms_hyperv.paravisor_present for a fully enlightened SNP VM: when CONFIG_HYPERV is not set, hv_get_isolation_type() is defined as a static inline function that always returns HV_ISOLATION_TYPE_NONE (see include/asm-generic/mshyperv.h), so the compiler won't generate any code for the ms_hyperv.paravisor_present and static_branch_enable(). Reported-by: Tom Lendacky <thomas.lendacky@amd.com> Closes: https://lore.kernel.org/lkml/b4979997-23b9-0c43-574e-e4a3506500ff@amd.com/ Fixes: d6e2d6524437 ("x86/hyperv: Add sev-snp enlightened guest static key") Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230823032008.18186-1-decui@microsoft.com
2023-08-22x86/hyperv: Add hyperv-specific handling for VMMCALL under SEV-ESTianyu Lan1-0/+21
Add Hyperv-specific handling for faults caused by VMMCALL instructions. Reviewed-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Tianyu Lan <tiala@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230818102919.1318039-9-ltykernel@gmail.com
2023-08-22x86/hyperv: Add smp support for SEV-SNP guestTianyu Lan1-1/+10
In the AMD SEV-SNP guest, AP needs to be started up via sev es save area and Hyper-V requires to call HVCALL_START_VP hypercall to pass the gpa of sev es save area with AP's vp index and VTL(Virtual trust level) parameters. Override wakeup_secondary_cpu_64 callback with hv_snp_boot_ap. Reviewed-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Tianyu Lan <tiala@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230818102919.1318039-8-ltykernel@gmail.com
2023-08-22x86/hyperv: Add sev-snp enlightened guest static keyTianyu Lan1-2/+7
Introduce static key isolation_type_en_snp for enlightened sev-snp guest check. Reviewed-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Tianyu Lan <tiala@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20230818102919.1318039-2-ltykernel@gmail.com
2023-08-09x86/apic: Nuke ack_APIC_irq()Dave Hansen1-2/+2
Yet another wrapper of a wrapper gone along with the outdated comment that this compiles to a single instruction. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Wei Liu <wei.liu@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
2023-04-18x86/hyperv: VTL support for Hyper-VSaurabh Sengar1-0/+1
Virtual Trust Levels (VTL) helps enable Hyper-V Virtual Secure Mode (VSM) feature. VSM is a set of hypervisor capabilities and enlightenments offered to host and guest partitions which enable the creation and management of new security boundaries within operating system software. VSM achieves and maintains isolation through VTLs. Add early initialization for Virtual Trust Levels (VTL). This includes initializing the x86 platform for VTL and enabling boot support for secondary CPUs to start in targeted VTL context. For now, only enable the code for targeted VTL level as 2. When starting an AP at a VTL other than VTL0, the AP must start directly in 64-bit mode, bypassing the usual 16-bit -> 32-bit -> 64-bit mode transition sequence that occurs after waking up an AP with SIPI whose vector points to the 16-bit AP startup trampoline code. Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Stanislav Kinsburskii <stanislav.kinsburskii@gmail.com> Link: https://lore.kernel.org/r/1681192532-15460-6-git-send-email-ssengar@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2023-04-18x86/hyperv: Make hv_get_nmi_reason publicSaurabh Sengar1-5/+0
Move hv_get_nmi_reason to .h file so it can be used in other modules as well. Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1681192532-15460-4-git-send-email-ssengar@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2023-04-17swiotlb: Remove bounce buffer remapping for Hyper-VMichael Kelley1-6/+1
With changes to how Hyper-V guest VMs flip memory between private (encrypted) and shared (decrypted), creating a second kernel virtual mapping for shared memory is no longer necessary. Everything needed for the transition to shared is handled by set_memory_decrypted(). As such, remove swiotlb_unencrypted_base and the associated code. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/1679838727-87310-8-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2023-04-17Merge remote-tracking branch 'tip/x86/sev' into hyperv-nextWei Liu1-8/+7
Merge the following 6 patches from tip/x86/sev, which are taken from Michael Kelley's series [0]. The rest of Michael's series depend on them. x86/hyperv: Change vTOM handling to use standard coco mechanisms init: Call mem_encrypt_init() after Hyper-V hypercall init is done x86/mm: Handle decryption/re-encryption of bss_decrypted consistently Drivers: hv: Explicitly request decrypted in vmap_pfn() calls x86/hyperv: Reorder code to facilitate future work x86/ioremap: Add hypervisor callback for private MMIO mapping in coco VM 0: https://lore.kernel.org/linux-hyperv/1679838727-87310-1-git-send-email-mikelley@microsoft.com/
2023-03-27x86/hyperv: Change vTOM handling to use standard coco mechanismsMichael Kelley1-8/+7
Hyper-V guests on AMD SEV-SNP hardware have the option of using the "virtual Top Of Memory" (vTOM) feature specified by the SEV-SNP architecture. With vTOM, shared vs. private memory accesses are controlled by splitting the guest physical address space into two halves. vTOM is the dividing line where the uppermost bit of the physical address space is set; e.g., with 47 bits of guest physical address space, vTOM is 0x400000000000 (bit 46 is set). Guest physical memory is accessible at two parallel physical addresses -- one below vTOM and one above vTOM. Accesses below vTOM are private (encrypted) while accesses above vTOM are shared (decrypted). In this sense, vTOM is like the GPA.SHARED bit in Intel TDX. Support for Hyper-V guests using vTOM was added to the Linux kernel in two patch sets[1][2]. This support treats the vTOM bit as part of the physical address. For accessing shared (decrypted) memory, these patch sets create a second kernel virtual mapping that maps to physical addresses above vTOM. A better approach is to treat the vTOM bit as a protection flag, not as part of the physical address. This new approach is like the approach for the GPA.SHARED bit in Intel TDX. Rather than creating a second kernel virtual mapping, the existing mapping is updated using recently added coco mechanisms. When memory is changed between private and shared using set_memory_decrypted() and set_memory_encrypted(), the PTEs for the existing kernel mapping are changed to add or remove the vTOM bit in the guest physical address, just as with TDX. The hypercalls to change the memory status on the host side are made using the existing callback mechanism. Everything just works, with a minor tweak to map the IO-APIC to use private accesses. To accomplish the switch in approach, the following must be done: * Update Hyper-V initialization to set the cc_mask based on vTOM and do other coco initialization. * Update physical_mask so the vTOM bit is no longer treated as part of the physical address * Remove CC_VENDOR_HYPERV and merge the associated vTOM functionality under CC_VENDOR_AMD. Update cc_mkenc() and cc_mkdec() to set/clear the vTOM bit as a protection flag. * Code already exists to make hypercalls to inform Hyper-V about pages changing between shared and private. Update this code to run as a callback from __set_memory_enc_pgtable(). * Remove the Hyper-V special case from __set_memory_enc_dec() * Remove the Hyper-V specific call to swiotlb_update_mem_attributes() since mem_encrypt_init() will now do it. * Add a Hyper-V specific implementation of the is_private_mmio() callback that returns true for the IO-APIC and vTPM MMIO addresses [1] https://lore.kernel.org/all/20211025122116.264793-1-ltykernel@gmail.com/ [2] https://lore.kernel.org/all/20211213071407.314309-1-ltykernel@gmail.com/ [ bp: Touchups. ] Signed-off-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/1679838727-87310-7-git-send-email-mikelley@microsoft.com
2023-03-17x86/hyperv: Block root partition functionality in a Confidential VMMichael Kelley1-4/+8
Hyper-V should never specify a VM that is a Confidential VM and also running in the root partition. Nonetheless, explicitly block such a combination to guard against a compromised Hyper-V maliciously trying to exploit root partition functionality in a Confidential VM to expose Confidential VM secrets. No known bug is being fixed, but the attack surface for Confidential VMs on Hyper-V is reduced. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1678894453-95392-1-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2023-02-25Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-1/+1
Pull kvm updates from Paolo Bonzini: "ARM: - Provide a virtual cache topology to the guest to avoid inconsistencies with migration on heterogenous systems. Non secure software has no practical need to traverse the caches by set/way in the first place - Add support for taking stage-2 access faults in parallel. This was an accidental omission in the original parallel faults implementation, but should provide a marginal improvement to machines w/o FEAT_HAFDBS (such as hardware from the fruit company) - A preamble to adding support for nested virtualization to KVM, including vEL2 register state, rudimentary nested exception handling and masking unsupported features for nested guests - Fixes to the PSCI relay that avoid an unexpected host SVE trap when resuming a CPU when running pKVM - VGIC maintenance interrupt support for the AIC - Improvements to the arch timer emulation, primarily aimed at reducing the trap overhead of running nested - Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the interest of CI systems - Avoid VM-wide stop-the-world operations when a vCPU accesses its own redistributor - Serialize when toggling CPACR_EL1.SMEN to avoid unexpected exceptions in the host - Aesthetic and comment/kerneldoc fixes - Drop the vestiges of the old Columbia mailing list and add [Oliver] as co-maintainer RISC-V: - Fix wrong usage of PGDIR_SIZE instead of PUD_SIZE - Correctly place the guest in S-mode after redirecting a trap to the guest - Redirect illegal instruction traps to guest - SBI PMU support for guest s390: - Sort out confusion between virtual and physical addresses, which currently are the same on s390 - A new ioctl that performs cmpxchg on guest memory - A few fixes x86: - Change tdp_mmu to a read-only parameter - Separate TDP and shadow MMU page fault paths - Enable Hyper-V invariant TSC control - Fix a variety of APICv and AVIC bugs, some of them real-world, some of them affecting architecurally legal but unlikely to happen in practice - Mark APIC timer as expired if its in one-shot mode and the count underflows while the vCPU task was being migrated - Advertise support for Intel's new fast REP string features - Fix a double-shootdown issue in the emergency reboot code - Ensure GIF=1 and disable SVM during an emergency reboot, i.e. give SVM similar treatment to VMX - Update Xen's TSC info CPUID sub-leaves as appropriate - Add support for Hyper-V's extended hypercalls, where "support" at this point is just forwarding the hypercalls to userspace - Clean up the kvm->lock vs. kvm->srcu sequences when updating the PMU and MSR filters - One-off fixes and cleanups - Fix and cleanup the range-based TLB flushing code, used when KVM is running on Hyper-V - Add support for filtering PMU events using a mask. If userspace wants to restrict heavily what events the guest can use, it can now do so without needing an absurd number of filter entries - Clean up KVM's handling of "PMU MSRs to save", especially when vPMU support is disabled - Add PEBS support for Intel Sapphire Rapids - Fix a mostly benign overflow bug in SEV's send|receive_update_data() - Move several SVM-specific flags into vcpu_svm x86 Intel: - Handle NMI VM-Exits before leaving the noinstr region - A few trivial cleanups in the VM-Enter flows - Stop enabling VMFUNC for L1 purely to document that KVM doesn't support EPTP switching (or any other VM function) for L1 - Fix a crash when using eVMCS's enlighted MSR bitmaps Generic: - Clean up the hardware enable and initialization flow, which was scattered around multiple arch-specific hooks. Instead, just let the arch code call into generic code. Both x86 and ARM should benefit from not having to fight common KVM code's notion of how to do initialization - Account allocations in generic kvm_arch_alloc_vm() - Fix a memory leak if coalesced MMIO unregistration fails selftests: - On x86, cache the CPU vendor (AMD vs. Intel) and use the info to emit the correct hypercall instruction instead of relying on KVM to patch in VMMCALL - Use TAP interface for kvm_binary_stats_test and tsc_msrs_test" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (325 commits) KVM: SVM: hyper-v: placate modpost section mismatch error KVM: x86/mmu: Make tdp_mmu_allowed static KVM: arm64: nv: Use reg_to_encoding() to get sysreg ID KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes KVM: arm64: nv: Filter out unsupported features from ID regs KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2 KVM: arm64: nv: Allow a sysreg to be hidden from userspace only KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2 KVM: arm64: nv: Handle SMCs taken from virtual EL2 KVM: arm64: nv: Handle trapped ERET from virtual EL2 KVM: arm64: nv: Inject HVC exceptions to the virtual EL2 KVM: arm64: nv: Support virtual EL2 exceptions KVM: arm64: nv: Handle HCR_EL2.NV system register traps KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state KVM: arm64: nv: Add EL2 system registers to vcpu context KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set KVM: arm64: nv: Introduce nested virtualization VCPU feature KVM: arm64: Use the S2 MMU context to iterate over S2 table ...
2023-02-16x86/hyperv: Fix hv_get/set_register for nested bringupNuno Das Neves1-4/+4
hv_get_nested_reg only translates SINT0, resulting in the wrong sint being registered by nested vmbus. Fix the issue with new utility function hv_is_sint_reg. While at it, improve clarity of hv_set_non_nested_register and hv_is_synic_reg. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Jinank Jain <jinankjain@linux.microsoft.com> Link: https://lore.kernel.org/r/1675980172-6851-1-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2023-01-17Drivers: hv: Setup synic registers in case of nested root partitionJinank Jain1-0/+65
Child partitions are free to allocate SynIC message and event page but in case of root partition it must use the pages allocated by Microsoft Hypervisor (MSHV). Base address for these pages can be found using synthetic MSRs exposed by MSHV. There is a slight difference in those MSRs for nested vs non-nested root partition. Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/cb951fb1ad6814996fc54f4a255c5841a20a151f.1672639707.git.jinankjain@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2023-01-12x86/hyperv: Add support for detecting nested hypervisorJinank Jain1-0/+7
Detect if Linux is running as a nested hypervisor in the root partition for Microsoft Hypervisor, using flags provided by MSHV. Expose a new variable hv_nested that is used later for decisions specific to the nested use case. Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/8e3e7112806e81d2292a66a56fe547162754ecea.1672639707.git.jinankjain@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-12-29Merge branch 'kvm-late-6.1' into HEADPaolo Bonzini1-1/+1
x86: * Change tdp_mmu to a read-only parameter * Separate TDP and shadow MMU page fault paths * Enable Hyper-V invariant TSC control selftests: * Use TAP interface for kvm_binary_stats_test and tsc_msrs_test Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-29x86/hyperv: Add HV_EXPOSE_INVARIANT_TSC defineVitaly Kuznetsov1-1/+1
Avoid open coding BIT(0) of HV_X64_MSR_TSC_INVARIANT_CONTROL by adding a dedicated define. While there's only one user at this moment, the upcoming KVM implementation of Hyper-V Invariant TSC feature will need to use it as well. Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221013095849.705943-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-28iommu/hyper-v: Allow hyperv irq remapping without x2apicNuno Das Neves1-0/+6
If x2apic is not available, hyperv-iommu skips remapping irqs. This breaks root partition which always needs irqs remapped. Fix this by allowing irq remapping regardless of x2apic, and change hyperv_enable_irq_remapping() to return IRQ_REMAP_XAPIC_MODE in case x2apic is missing. Tested with root and non-root hyperv partitions. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1668715899-8971-1-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-05-28Merge tag 'hyperv-next-signed-20220528' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Harden hv_sock driver (Andrea Parri) - Harden Hyper-V PCI driver (Andrea Parri) - Fix multi-MSI for Hyper-V PCI driver (Jeffrey Hugo) - Fix Hyper-V PCI to reduce boot time (Dexuan Cui) - Remove code for long EOL'ed Hyper-V versions (Michael Kelley, Saurabh Sengar) - Fix balloon driver error handling (Shradha Gupta) - Fix a typo in vmbus driver (Julia Lawall) - Ignore vmbus IMC device (Michael Kelley) - Add a new error message to Hyper-V DRM driver (Saurabh Sengar) * tag 'hyperv-next-signed-20220528' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (28 commits) hv_balloon: Fix balloon_probe() and balloon_remove() error handling scsi: storvsc: Removing Pre Win8 related logic Drivers: hv: vmbus: fix typo in comment PCI: hv: Fix synchronization between channel callback and hv_pci_bus_exit() PCI: hv: Add validation for untrusted Hyper-V values PCI: hv: Fix interrupt mapping for multi-MSI PCI: hv: Reuse existing IRTE allocation in compose_msi_msg() drm/hyperv: Remove support for Hyper-V 2008 and 2008R2/Win7 video: hyperv_fb: Remove support for Hyper-V 2008 and 2008R2/Win7 scsi: storvsc: Remove support for Hyper-V 2008 and 2008R2/Win7 Drivers: hv: vmbus: Remove support for Hyper-V 2008 and Hyper-V 2008R2/Win7 x86/hyperv: Disable hardlockup detector by default in Hyper-V guests drm/hyperv: Add error message for fb size greater than allocated PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI Drivers: hv: vmbus: Refactor the ring-buffer iterator functions Drivers: hv: vmbus: Accept hv_sock offers in isolated guests hv_sock: Add validation for untrusted Hyper-V values hv_sock: Copy packets sent by Hyper-V out of the ring buffer hv_sock: Check hv_pkt_iter_first_raw()'s return value ...
2022-05-11x86/hyperv: Disable hardlockup detector by default in Hyper-V guestsMichael Kelley1-0/+2
In newer versions of Hyper-V, the x86/x64 PMU can be virtualized into guest VMs by explicitly enabling it. Linux kernels are typically built to automatically enable the hardlockup detector if the PMU is found. To prevent the possibility of false positives due to the vagaries of VM scheduling, disable the PMU-based hardlockup detector by default in a VM on Hyper-V. The hardlockup detector can still be enabled by overriding the default with the nmi_watchdog=1 option on the kernel boot line or via sysctl at runtime. This change mimics the approach taken with KVM guests in commit 692297d8f968 ("watchdog: introduce the hardlockup_detector_disable() function"). Linux on ARM64 does not provide a PMU-based hardlockup detector, so there's no corresponding disable in the Hyper-V init code on ARM64. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1652111063-6535-1-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-04-18x86: centralize setting SWIOTLB_FORCE when guest memory encryption is enabledChristoph Hellwig1-8/+0
Move enabling SWIOTLB_FORCE for guest memory encryption into common code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-03-24Merge tag 'hyperv-next-signed-20220322' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: "Minor patches from various people" * tag 'hyperv-next-signed-20220322' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Output host build info as normal Windows version number hv_balloon: rate-limit "Unhandled message" warning drivers: hv: log when enabling crash_kexec_post_notifiers hv_utils: Add comment about max VMbus packet size in VSS driver Drivers: hv: Compare cpumasks and not their weights in init_vp_index() Drivers: hv: Rename 'alloced' to 'allocated' Drivers: hv: vmbus: Use struct_size() helper in kmalloc()
2022-03-08x86/hyperv: Output host build info as normal Windows version numberMichael Kelley1-4/+4
Hyper-V provides host version number information that is output in text form by a Linux guest when it boots. For whatever reason, the formatting has historically been non-standard. Change it to output in normal Windows version format for better readability. Similar code for ARM64 guests already outputs in normal Windows version format. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1646767364-2234-1-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-02-23x86/coco: Explicitly declare type of confidential computing platformKirill A. Shutemov1-0/+6
The kernel derives the confidential computing platform type it is running as from sme_me_mask on AMD or by using hv_is_isolation_supported() on HyperV isolation VMs. This detection process will be more complicated as more platforms get added. Declare a confidential computing vendor variable explicitly and set it via cc_set_vendor() on the respective platform. [ bp: Massage commit message, fixup HyperV check. ] Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20220222185740.26228-4-kirill.shutemov@linux.intel.com
2022-01-16Merge tag 'hyperv-next-signed-20220114' of ↵Linus Torvalds1-1/+14
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - More patches for Hyper-V isolation VM support (Tianyu Lan) - Bug fixes and clean-up patches from various people * tag 'hyperv-next-signed-20220114' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: scsi: storvsc: Fix storvsc_queuecommand() memory leak x86/hyperv: Properly deal with empty cpumasks in hyperv_flush_tlb_multi() Drivers: hv: vmbus: Initialize request offers message for Isolation VM scsi: storvsc: Fix unsigned comparison to zero swiotlb: Add CONFIG_HAS_IOMEM check around swiotlb_mem_remap() x86/hyperv: Fix definition of hv_ghcb_pg variable Drivers: hv: Fix definition of hypercall input & output arg variables net: netvsc: Add Isolation VM support for netvsc driver scsi: storvsc: Add Isolation VM support for storvsc driver hyper-v: Enable swiotlb bounce buffer for Isolation VM x86/hyper-v: Add hyperv Isolation VM check in the cc_platform_has() swiotlb: Add swiotlb bounce buffer remap function for HV IVM
2022-01-07random: remove unused irq_flags argument from add_interrupt_randomness()Sebastian Andrzej Siewior1-1/+1
Since commit ee3e00e9e7101 ("random: use registers from interrupted code for CPU's w/o a cycle counter") the irq_flags argument is no longer used. Remove unused irq_flags. Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Liu <wei.liu@kernel.org> Cc: linux-hyperv@vger.kernel.org Cc: x86@kernel.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-12-20hyper-v: Enable swiotlb bounce buffer for Isolation VMTianyu Lan1-1/+14
hyperv Isolation VM requires bounce buffer support to copy data from/to encrypted memory and so enable swiotlb force mode to use swiotlb bounce buffer for DMA transaction. In Isolation VM with AMD SEV, the bounce buffer needs to be accessed via extra address space which is above shared_gpa_boundary (E.G 39 bit address line) reported by Hyper-V CPUID ISOLATION_CONFIG. The access physical address will be original physical address + shared_gpa_boundary. The shared_gpa_boundary in the AMD SEV SNP spec is called virtual top of memory(vTOM). Memory addresses below vTOM are automatically treated as private while memory above vTOM is treated as shared. Swiotlb bounce buffer code calls set_memory_decrypted() to mark bounce buffer visible to host and map it in extra address space via memremap. Populate the shared_gpa_boundary (vTOM) via swiotlb_unencrypted_base variable. The map function memremap() can't work in the early place (e.g ms_hyperv_init_platform()) and so call swiotlb_update_mem_ attributes() in the hyperv_init(). Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20211213071407.314309-4-ltykernel@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-11-15x86/hyperv: Move required MSRs check to initial platform probingSean Christopherson1-5/+15
Explicitly check for MSR_HYPERCALL and MSR_VP_INDEX support when probing for running as a Hyper-V guest instead of waiting until hyperv_init() to detect the bogus configuration. Add messages to give the admin a heads up that they are likely running on a broken virtual machine setup. At best, silently disabling Hyper-V is confusing and difficult to debug, e.g. the kernel _says_ it's using all these fancy Hyper-V features, but always falls back to the native versions. At worst, the half baked setup will crash/hang the kernel. Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20211104182239.1302956-3-seanjc@google.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-10-28x86/hyperv: Initialize shared memory boundary in the Isolation VM.Tianyu Lan1-0/+2
Hyper-V exposes shared memory boundary via cpuid HYPERV_CPUID_ISOLATION_CONFIG and store it in the shared_gpa_boundary of ms_hyperv struct. This prepares to share memory with host for SNP guest. Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Link: https://lore.kernel.org/r/20211025122116.264793-3-ltykernel@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-10-28x86/hyperv: Initialize GHCB page in Isolation VMTianyu Lan1-0/+3
Hyperv exposes GHCB page via SEV ES GHCB MSR for SNP guest to communicate with hypervisor. Map GHCB page for all cpus to read/write MSR register and submit hvcall request via ghcb page. Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Link: https://lore.kernel.org/r/20211025122116.264793-2-ltykernel@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-09-02Merge tag 'hyperv-next-signed-20210831' of ↵Linus Torvalds1-22/+16
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - make Hyper-V code arch-agnostic (Michael Kelley) - fix sched_clock behaviour on Hyper-V (Ani Sinha) - fix a fault when Linux runs as the root partition on MSHV (Praveen Kumar) - fix VSS driver (Vitaly Kuznetsov) - cleanup (Sonia Sharma) * tag 'hyperv-next-signed-20210831' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: hv_utils: Set the maximum packet size for VSS driver to the length of the receive buffer Drivers: hv: Enable Hyper-V code to be built on ARM64 arm64: efi: Export screen_info arm64: hyperv: Initialize hypervisor on boot arm64: hyperv: Add panic handler arm64: hyperv: Add Hyper-V hypercall and register access utilities x86/hyperv: fix root partition faults when writing to VP assist page MSR hv: hyperv.h: Remove unused inline functions drivers: hv: Decouple Hyper-V clock/timer code from VMbus drivers x86/hyperv: add comment describing TSC_INVARIANT_CONTROL MSR setting bit 0 Drivers: hv: Move Hyper-V misc functionality to arch-neutral code Drivers: hv: Add arch independent default functions for some Hyper-V handlers Drivers: hv: Make portions of Hyper-V init code be arch neutral x86/hyperv: fix for unwanted manipulation of sched_clock when TSC marked unstable asm-generic/hyperv: Add missing #include of nmi.h
2021-07-21Revert "x86/hyperv: fix logical processor creation"Wei Liu1-1/+1
This reverts commit 450605c28d571eddca39a65fdbc1338add44c6d9. Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-07-16x86/hyperv: add comment describing TSC_INVARIANT_CONTROL MSR setting bit 0Ani Sinha1-0/+9
Commit dce7cd62754b5 ("x86/hyperv: Allow guests to enable InvariantTSC") added the support for HV_X64_MSR_TSC_INVARIANT_CONTROL. Setting bit 0 of this synthetic MSR will allow hyper-v guests to report invariant TSC CPU feature through CPUID. This comment adds this explanation to the code and mentions where the Intel's generic platform init code reads this feature bit from CPUID. The comment will help developers understand how the two parts of the initialization (hyperV specific and non-hyperV specific generic hw init) are related. Signed-off-by: Ani Sinha <ani@anisinha.ca> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20210716133245.3272672-1-ani@anisinha.ca Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-07-15Drivers: hv: Move Hyper-V misc functionality to arch-neutral codeMichael Kelley1-11/+0
The check for whether hibernation is possible, and the enabling of Hyper-V panic notification during kexec, are both architecture neutral. Move the code from under arch/x86 and into drivers/hv/hv_common.c where it can also be used for ARM64. No functional change. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1626287687-2045-4-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>