summaryrefslogtreecommitdiff
path: root/arch/x86/hyperv
AgeCommit message (Collapse)AuthorFilesLines
2025-03-26Merge tag 'hyperv-next-signed-20250324' of ↵Linus Torvalds8-232/+54
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Add support for running as the root partition in Hyper-V (Microsoft Hypervisor) by exposing /dev/mshv (Nuno and various people) - Add support for CPU offlining in Hyper-V (Hamza Mahfooz) - Misc fixes and cleanups (Roman Kisel, Tianyu Lan, Wei Liu, Michael Kelley, Thorsten Blum) * tag 'hyperv-next-signed-20250324' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (24 commits) x86/hyperv: fix an indentation issue in mshyperv.h x86/hyperv: Add comments about hv_vpset and var size hypercall input args Drivers: hv: Introduce mshv_root module to expose /dev/mshv to VMMs hyperv: Add definitions for root partition driver to hv headers x86: hyperv: Add mshv_handler() irq handler and setup function Drivers: hv: Introduce per-cpu event ring tail Drivers: hv: Export some functions for use by root partition module acpi: numa: Export node_to_pxm() hyperv: Introduce hv_recommend_using_aeoi() arm64/hyperv: Add some missing functions to arm64 x86/mshyperv: Add support for extended Hyper-V features hyperv: Log hypercall status codes as strings x86/hyperv: Fix check of return value from snp_set_vmsa() x86/hyperv: Add VTL mode callback for restarting the system x86/hyperv: Add VTL mode emergency restart callback hyperv: Remove unused union and structs hyperv: Add CONFIG_MSHV_ROOT to gate root partition support hyperv: Change hv_root_partition into a function hyperv: Convert hypercall statuses to linux error codes drivers/hv: add CPU offlining support ...
2025-03-21x86/hyperv: Add comments about hv_vpset and var size hypercall input argsMichael Kelley2-0/+9
Current code varies in how the size of the variable size input header for hypercalls is calculated when the input contains struct hv_vpset. Surprisingly, this variation is correct, as different hypercalls make different choices for what portion of struct hv_vpset is treated as part of the variable size input header. The Hyper-V TLFS is silent on these details, but the behavior has been confirmed with Hyper-V developers. To avoid future confusion about these differences, add comments to struct hv_vpset, and to hypercall call sites with input that contains a struct hv_vpset. The comments describe the overall situation and the calculation that should be used at each particular call site. No functional change as only comments are updated. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20250318214919.958953-1-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250318214919.958953-1-mhklinux@outlook.com>
2025-03-21hyperv: Log hypercall status codes as stringsNuno Das Neves1-3/+3
Introduce hv_status_printk() macros as a convenience to log hypercall errors, formatting them with the status code (HV_STATUS_*) as a raw hex value and also as a string, which saves some time while debugging. Create a table of HV_STATUS_ codes with strings and mapped errnos, and use it for hv_result_to_string() and hv_result_to_errno(). Use the new hv_status_printk()s in hv_proc.c, hyperv-iommu.c, and irqdomain.c hypercalls to aid debugging in the root partition. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Link: https://lore.kernel.org/r/1741980536-3865-2-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1741980536-3865-2-git-send-email-nunodasneves@linux.microsoft.com>
2025-03-21x86/hyperv: Fix check of return value from snp_set_vmsa()Tianyu Lan1-1/+1
snp_set_vmsa() returns 0 as success result and so fix it. Cc: stable@vger.kernel.org Fixes: 44676bb9d566 ("x86/hyperv: Add smp support for SEV-SNP guest") Signed-off-by: Tianyu Lan <tiala@microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20250313085217.45483-1-ltykernel@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250313085217.45483-1-ltykernel@gmail.com>
2025-03-21x86/hyperv: Add VTL mode callback for restarting the systemRoman Kisel1-0/+11
The kernel runs as a firmware in the VTL mode, and the only way to restart in the VTL mode on x86 is to triple fault. Thus, one has to always supply "reboot=t" on the kernel command line in the VTL mode, and missing that renders rebooting not working. Define the machine restart callback to always use the triple fault to provide the robust configuration by default. Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Link: https://lore.kernel.org/r/20250227214728.15672-3-romank@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250227214728.15672-3-romank@linux.microsoft.com>
2025-03-21x86/hyperv: Add VTL mode emergency restart callbackRoman Kisel1-0/+23
By default, X86(-64) systems use the emergecy restart routine in the course of which the code unconditionally writes to the physical address of 0x472 to indicate the boot mode to the firmware (BIOS or UEFI). When the kernel itself runs as a firmware in the VTL mode, that write corrupts the memory of the guest upon emergency restarting. Preserving the state intact in that situation is important for debugging, at least. Define the specialized machine callback to avoid that write and use the triple fault to perform emergency restart. Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Link: https://lore.kernel.org/r/20250227214728.15672-2-romank@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250227214728.15672-2-romank@linux.microsoft.com>
2025-03-19Merge tag 'v6.14-rc7' into x86/core, to pick up fixesIngo Molnar2-2/+2
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-10x86/hyperv: Fix output argument to hypercall that changes page visibilityMichael Kelley1-2/+1
The hypercall in hv_mark_gpa_visibility() is invoked with an input argument and an output argument. The output argument ostensibly returns the number of pages that were processed. But in fact, the hypercall does not provide any output, so the output argument is spurious. The spurious argument is harmless because Hyper-V ignores it, but in the interest of correctness and to avoid the potential for future problems, remove it. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Link: https://lore.kernel.org/r/20250226200612.2062-2-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250226200612.2062-2-mhklinux@outlook.com>
2025-02-22hyperv: Change hv_root_partition into a functionNuno Das Neves1-5/+5
Introduce hv_curr_partition_type to store the partition type as an enum. Right now this is limited to guest or root partition, but there will be other kinds in future and the enum is easily extensible. Set up hv_curr_partition_type early in Hyper-V initialization with hv_identify_partition_type(). hv_root_partition() just queries this value, and shouldn't be called before that. Making this check into a function sets the stage for adding a config option to gate the compilation of root partition code. In particular, hv_root_partition() can be stubbed out always be false if root partition support isn't desired. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1740167795-13296-3-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1740167795-13296-3-git-send-email-nunodasneves@linux.microsoft.com>
2025-02-21x86/mm: Remove pv_ops.mmu.tlb_remove_table callRik van Riel1-1/+0
Every pv_ops.mmu.tlb_remove_table call ends up calling tlb_remove_table. Get rid of the indirection by simply calling tlb_remove_table directly, and not going through the paravirt function pointers. Suggested-by: Qi Zheng <zhengqi.arch@bytedance.com> Signed-off-by: Rik van Riel <riel@surriel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Manali Shukla <Manali.Shukla@amd.com> Tested-by: Brendan Jackman <jackmanb@google.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20250213161423.449435-3-riel@surriel.com
2025-02-14hyperv: Move arch/x86/hyperv/hv_proc.c to drivers/hvNuno Das Neves2-199/+1
These helpers are not specific to x86_64 and will be needed by common code. Remove some unnecessary #includes. Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Link: https://lore.kernel.org/r/1738955002-20821-3-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1738955002-20821-3-git-send-email-nunodasneves@linux.microsoft.com>
2025-02-14hyperv: Move hv_current_partition_id to arch-generic codeNuno Das Neves1-24/+1
Move hv_current_partition_id and hv_get_partition_id() to hv_common.c, and call hv_get_partition_id() on arm64 in hyperv_init(). These aren't specific to x86_64 and will be needed by common code. Set hv_current_partition_id to HV_PARTITION_ID_SELF by default. Rename struct hv_get_partition_id to hv_output_get_partition_id, to make it distinct from the function hv_get_partition_id(), and match the original Hyper-V struct name. Remove the BUG()s. Failing to get the id need not crash the machine. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1738955002-20821-2-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1738955002-20821-2-git-send-email-nunodasneves@linux.microsoft.com>
2025-02-12x86/hyperv/vtl: Stop kernel from probing VTL0 low memoryNaman Jain1-0/+1
For Linux, running in Hyper-V VTL (Virtual Trust Level), kernel in VTL2 tries to access VTL0 low memory in probe_roms. This memory is not described in the e820 map. Initialize probe_roms call to no-ops during boot for VTL2 kernel to avoid this. The issue got identified in OpenVMM which detects invalid accesses initiated from kernel running in VTL2. Co-developed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Signed-off-by: Naman Jain <namjain@linux.microsoft.com> Tested-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Roman Kisel <romank@linux.microsoft.com> Link: https://lore.kernel.org/r/20250116061224.1701-1-namjain@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250116061224.1701-1-namjain@linux.microsoft.com>
2025-01-25Merge tag 'hyperv-next-signed-20250123' of ↵Linus Torvalds7-19/+14
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Introduce a new set of Hyper-V headers in include/hyperv and replace the old hyperv-tlfs.h with the new headers (Nuno Das Neves) - Fixes for the Hyper-V VTL mode (Roman Kisel) - Fixes for cpu mask usage in Hyper-V code (Michael Kelley) - Document the guest VM hibernation behaviour (Michael Kelley) - Miscellaneous fixes and cleanups (Jacob Pan, John Starks, Naman Jain) * tag 'hyperv-next-signed-20250123' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Documentation: hyperv: Add overview of guest VM hibernation hyperv: Do not overlap the hvcall IO areas in hv_vtl_apicid_to_vp_id() hyperv: Do not overlap the hvcall IO areas in get_vtl() hyperv: Enable the hypercall output page for the VTL mode hv_balloon: Fallback to generic_online_page() for non-HV hot added mem Drivers: hv: vmbus: Log on missing offers if any Drivers: hv: vmbus: Wait for boot-time offers during boot and resume uio_hv_generic: Add a check for HV_NIC for send, receive buffers setup iommu/hyper-v: Don't assume cpu_possible_mask is dense Drivers: hv: Don't assume cpu_possible_mask is dense x86/hyperv: Don't assume cpu_possible_mask is dense hyperv: Remove the now unused hyperv-tlfs.h files hyperv: Switch from hyperv-tlfs.h to hyperv/hvhdk.h hyperv: Add new Hyper-V headers in include/hyperv hyperv: Clean up unnecessary #includes hyperv: Move hv_connection_id to hyperv-tlfs.h
2025-01-22Merge tag 'irq-core-2025-01-21' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull interrupt subsystem updates from Thomas Gleixner: - Consolidate the machine_kexec_mask_interrupts() by providing a generic implementation and replacing the copy & pasta orgy in the relevant architectures. - Prevent unconditional operations on interrupt chips during kexec shutdown, which can trigger warnings in certain cases when the underlying interrupt has been shut down before. - Make the enforcement of interrupt handling in interrupt context unconditionally available, so that it actually works for non x86 related interrupt chips. The earlier enablement for ARM GIC chips set the required chip flag, but did not notice that the check was hidden behind a config switch which is not selected by ARM[64]. - Decrapify the handling of deferred interrupt affinity setting. Some interrupt chips require that affinity changes are made from the context of handling an interrupt to avoid certain race conditions. For x86 this was the default, but with interrupt remapping this requirement was lifted and a flag was introduced which tells the core code that affinity changes can be done in any context. Unrestricted affinity changes are the default for the majority of interrupt chips. RISCV has the requirement to add the deferred mode to one of it's interrupt controllers, but with the original implementation this would require to add the any context flag to all other RISC-V interrupt chips. That's backwards, so reverse the logic and require that chips, which need the deferred mode have to be marked accordingly. That avoids chasing the 'sane' chips and marking them. - Add multi-node support to the Loongarch AVEC interrupt controller driver. - The usual tiny cleanups, fixes and improvements all over the place. * tag 'irq-core-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/generic_chip: Export irq_gc_mask_disable_and_ack_set() genirq/timings: Add kernel-doc for a function parameter genirq: Remove IRQ_MOVE_PCNTXT and related code x86/apic: Convert to IRQCHIP_MOVE_DEFERRED genirq: Provide IRQCHIP_MOVE_DEFERRED hexagon: Remove GENERIC_PENDING_IRQ leftover ARC: Remove GENERIC_PENDING_IRQ genirq: Remove handle_enforce_irqctx() wrapper genirq: Make handle_enforce_irqctx() unconditionally available irqchip/loongarch-avec: Add multi-nodes topology support irqchip/ts4800: Replace seq_printf() by seq_puts() irqchip/ti-sci-inta : Add module build support irqchip/ti-sci-intr: Add module build support irqchip/irq-brcmstb-l2: Replace brcmstb_l2_mask_and_ack() by generic function irqchip: keystone: Use syscon_regmap_lookup_by_phandle_args genirq/kexec: Prevent redundant IRQ masking by checking state before shutdown kexec: Consolidate machine_kexec_mask_interrupts() implementation genirq: Reuse irq_thread_fn() for forced thread case genirq: Move irq_thread_fn() further up in the code
2025-01-15x86/apic: Convert to IRQCHIP_MOVE_DEFERREDThomas Gleixner1-1/+1
Instead of marking individual interrupts as safe to be migrated in arbitrary contexts, mark the interrupt chips, which require the interrupt to be moved in actual interrupt context, with the new IRQCHIP_MOVE_DEFERRED flag. This makes more sense because this is a per interrupt chip property and not restricted to individual interrupts. That flips the logic from the historical opt-out to a opt-in model. This is simpler to handle for other architectures, which default to unrestricted affinity setting. It also allows to cleanup the redundant core logic significantly. All interrupt chips, which belong to a top-level domain sitting directly on top of the x86 vector domain are marked accordingly, unless the related setup code marks the interrupts with IRQ_MOVE_PCNTXT, i.e. XEN. No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Acked-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/all/20241210103335.563277044@linutronix.de
2025-01-10hyperv: Do not overlap the hvcall IO areas in hv_vtl_apicid_to_vp_id()Roman Kisel1-1/+1
The Top-Level Functional Specification for Hyper-V, Section 3.6 [1, 2], disallows overlapping of the input and output hypercall areas, and hv_vtl_apicid_to_vp_id() overlaps them. Use the output hypercall page of the current vCPU for the hypercall. [1] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/hypercall-interface [2] https://github.com/MicrosoftDocs/Virtualization-Documentation/tree/main/tlfs Reported-by: Michael Kelley <mhklinux@outlook.com> Closes: https://lore.kernel.org/lkml/SN6PR02MB4157B98CD34781CC87A9D921D40D2@SN6PR02MB4157.namprd02.prod.outlook.com/ Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Link: https://lore.kernel.org/r/20250108222138.1623703-6-romank@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250108222138.1623703-6-romank@linux.microsoft.com>
2025-01-10hyperv: Do not overlap the hvcall IO areas in get_vtl()Roman Kisel1-1/+1
The Top-Level Functional Specification for Hyper-V, Section 3.6 [1, 2], disallows overlapping of the input and output hypercall areas, and get_vtl(void) does overlap them. Use the output hypercall page of the current vCPU for the hypercall. [1] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/hypercall-interface [2] https://github.com/MicrosoftDocs/Virtualization-Documentation/tree/main/tlfs Fixes: 8387ce06d70b ("x86/hyperv: Set Virtual Trust Level in VMBus init message") Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Link: https://lore.kernel.org/r/20250108222138.1623703-5-romank@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250108222138.1623703-5-romank@linux.microsoft.com>
2025-01-10x86/hyperv: Don't assume cpu_possible_mask is denseMichael Kelley1-1/+1
Current code allocates the hv_vp_assist_page array with size num_possible_cpus(). This code assumes cpu_possible_mask is dense, which is not true in the general case per [1]. If cpu_possible_mask is sparse, the array might be indexed by a value beyond the size of the array. However, the configurations that Hyper-V provides to guest VMs on x86 hardware, in combination with how x86 code assigns Linux CPU numbers, *does* always produce a dense cpu_possible_mask. So the dense assumption is not currently causing failures. But for robustness against future changes in how cpu_possible_mask is populated, update the code to no longer assume dense. The correct approach is to allocate the array with size "nr_cpu_ids". While this leaves unused array entries corresponding to holes in cpu_possible_mask, the holes are assumed to be minimal and hence the amount of memory wasted by unused entries is minimal. [1] https://lore.kernel.org/lkml/SN6PR02MB4157210CC36B2593F8572E5ED4692@SN6PR02MB4157.namprd02.prod.outlook.com/ Signed-off-by: Michael Kelley <mhklinux@outlook.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20241003035333.49261-2-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20241003035333.49261-2-mhklinux@outlook.com>
2025-01-10hyperv: Switch from hyperv-tlfs.h to hyperv/hvhdk.hNuno Das Neves3-12/+12
Switch to using hvhdk.h everywhere in the kernel. This header includes all the new Hyper-V headers in include/hyperv, which form a superset of the definitions found in hyperv-tlfs.h. This makes it easier to add new Hyper-V interfaces without being restricted to those in the TLFS doc (reflected in hyperv-tlfs.h). To be more consistent with the original Hyper-V code, the names of some definitions are changed slightly. Update those where needed. Update comments in mshyperv.h files to point to include/hyperv for adding new definitions. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Link: https://lore.kernel.org/r/1732577084-2122-5-git-send-email-nunodasneves@linux.microsoft.com Link: https://lore.kernel.org/r/20250108222138.1623703-3-romank@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-01-08hyperv: Clean up unnecessary #includesNuno Das Neves5-5/+0
Remove includes of linux/hyperv.h, mshyperv.h, and hyperv-tlfs.h where they are not used. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Acked-by: Wei Liu <wei.liu@kernel.org> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Link: https://lore.kernel.org/r/1732577084-2122-3-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1732577084-2122-3-git-send-email-nunodasneves@linux.microsoft.com>
2024-12-04x86/mtrr: Rename mtrr_overwrite_state() to guest_force_mtrr_state()Kirill A. Shutemov1-1/+1
Rename the helper to better reflect its function. Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Link: https://lore.kernel.org/all/20241202073139.448208-1-kirill.shutemov%40linux.intel.com
2024-09-19Merge tag 'hyperv-next-signed-20240916' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull Hyper-V updates from Wei Liu: - Optimize boot time by concurrent execution of hv_synic_init() (Saurabh Sengar) - Use helpers to read control registers in hv_snp_boot_ap() (Yosry Ahmed) - Add memory allocation check in hv_fcopy_start (Zhu Jun) * tag 'hyperv-next-signed-20240916' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: tools/hv: Add memory allocation check in hv_fcopy_start x86/hyperv: use helpers to read control registers in hv_snp_boot_ap() Drivers: hv: vmbus: Optimize boot time by concurrent execution of hv_synic_init()
2024-09-09Merge tag 'hyperv-fixes-signed-20240908' of ↵Linus Torvalds1-4/+1
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Add a documentation overview of Confidential Computing VM support (Michael Kelley) - Use lapic timer in a TDX VM without paravisor (Dexuan Cui) - Set X86_FEATURE_TSC_KNOWN_FREQ when Hyper-V provides frequency (Michael Kelley) - Fix a kexec crash due to VP assist page corruption (Anirudh Rayabharam) - Python3 compatibility fix for lsvmbus (Anthony Nandaa) - Misc fixes (Rachel Menge, Roman Kisel, zhang jiao, Hongbo Li) * tag 'hyperv-fixes-signed-20240908' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: hv: vmbus: Constify struct kobj_type and struct attribute_group tools: hv: rm .*.cmd when make clean x86/hyperv: fix kexec crash due to VP assist page corruption Drivers: hv: vmbus: Fix the misplaced function description tools: hv: lsvmbus: change shebang to use python3 x86/hyperv: Set X86_FEATURE_TSC_KNOWN_FREQ when Hyper-V provides frequency Documentation: hyperv: Add overview of Confidential Computing VM support clocksource: hyper-v: Use lapic timer in a TDX VM without paravisor Drivers: hv: Remove deprecated hv_fcopy declarations
2024-09-05x86/hyperv: fix kexec crash due to VP assist page corruptionAnirudh Rayabharam (Microsoft)1-4/+1
commit 9636be85cc5b ("x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline") introduces a new cpuhp state for hyperv initialization. cpuhp_setup_state() returns the state number if state is CPUHP_AP_ONLINE_DYN or CPUHP_BP_PREPARE_DYN and 0 for all other states. For the hyperv case, since a new cpuhp state was introduced it would return 0. However, in hv_machine_shutdown(), the cpuhp_remove_state() call is conditioned upon "hyperv_init_cpuhp > 0". This will never be true and so hv_cpu_die() won't be called on all CPUs. This means the VP assist page won't be reset. When the kexec kernel tries to setup the VP assist page again, the hypervisor corrupts the memory region of the old VP assist page causing a panic in case the kexec kernel is using that memory elsewhere. This was originally fixed in commit dfe94d4086e4 ("x86/hyperv: Fix kexec panic/hang issues"). Get rid of hyperv_init_cpuhp entirely since we are no longer using a dynamic cpuhp state and use CPUHP_AP_HYPERV_ONLINE directly with cpuhp_remove_state(). Cc: stable@vger.kernel.org Fixes: 9636be85cc5b ("x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline") Signed-off-by: Anirudh Rayabharam (Microsoft) <anirudh@anirudhrb.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20240828112158.3538342-1-anirudh@anirudhrb.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240828112158.3538342-1-anirudh@anirudhrb.com>
2024-08-24x86/hyperv: use helpers to read control registers in hv_snp_boot_ap()Yosry Ahmed1-3/+3
Use native_read_cr*() helpers to read control registers into vmsa->cr* instead of open-coded assembly. No functional change intended, unless there was a purpose to specifying rax. Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Link: https://lore.kernel.org/r/20240805201247.427982-1-yosryahmed@google.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240805201247.427982-1-yosryahmed@google.com>
2024-06-17x86/mm: Make x86_platform.guest.enc_status_change_*() return an errorKirill A. Shutemov1-12/+10
TDX is going to have more than one reason to fail enc_status_change_prepare(). Change the callback to return errno instead of assuming -EIO. Change enc_status_change_finish() too to keep the interface symmetric. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Tao Liu <ltao@redhat.com> Link: https://lore.kernel.org/r/20240614095904.1345461-8-kirill.shutemov@linux.intel.com
2024-05-14Merge tag 'x86-platform-2024-05-13' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform updates from Ingo Molnar: - Improve the DeviceTree (OF) NUMA enumeration code to address kernel warnings & mis-mappings on DeviceTree platforms - Migrate x86 platform drivers to the .remove_new callback API - Misc cleanups & fixes * tag 'x86-platform-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/olpc-xo1-sci: Convert to platform remove callback returning void x86/platform/olpc-x01-pm: Convert to platform remove callback returning void x86/platform/iris: Convert to platform remove callback returning void x86/of: Change x86_dtb_parse_smp_config() to static x86/of: Map NUMA node to CPUs as per DeviceTree x86/of: Set the parse_smp_cfg for all the DeviceTree platforms by default x86/hyperv/vtl: Correct x86_init.mpparse.parse_smp_cfg assignment
2024-04-12Merge tag 'hyperv-fixes-signed-20240411' of ↵Linus Torvalds2-26/+12
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Some cosmetic changes (Erni Sri Satya Vennela, Li Zhijian) - Introduce hv_numa_node_to_pxm_info() (Nuno Das Neves) - Fix KVP daemon to handle IPv4 and IPv6 combination for keyfile format (Shradha Gupta) - Avoid freeing decrypted memory in a confidential VM (Rick Edgecombe and Michael Kelley) * tag 'hyperv-fixes-signed-20240411' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: vmbus: Don't free ring buffers that couldn't be re-encrypted uio_hv_generic: Don't free decrypted memory hv_netvsc: Don't free decrypted memory Drivers: hv: vmbus: Track decrypted status in vmbus_gpadl Drivers: hv: vmbus: Leak pages if set_memory_encrypted() fails hv/hv_kvp_daemon: Handle IPv4 and Ipv6 combination for keyfile format hv: vmbus: Convert sprintf() family to sysfs_emit() family mshyperv: Introduce hv_numa_node_to_pxm_info() x86/hyperv: Cosmetic changes for hv_apic.c
2024-04-03x86/of: Set the parse_smp_cfg for all the DeviceTree platforms by defaultSaurabh Sengar1-1/+0
x86_dtb_parse_smp_config() must be set by DeviceTree platform for parsing SMP configuration. Set the parse_smp_cfg pointer to x86_dtb_parse_smp_config() by default so that all the dtb platforms need not to assign it explicitly. Today there are only two platforms using DeviceTree in x86, ce4100 and hv_vtl. Remove the explicit assignment of x86_dtb_parse_smp_config() function from these. Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/1712068830-4513-3-git-send-email-ssengar@linux.microsoft.com
2024-04-03x86/hyperv/vtl: Correct x86_init.mpparse.parse_smp_cfg assignmentSaurabh Sengar1-1/+1
VTL platform uses DeviceTree for fetching SMP configuration, assign the correct parsing function x86_dtb_parse_smp_config() for it to parse_smp_cfg. Fixes: c22e19cd2c8a ("x86/hyperv/vtl: Prepare for separate mpparse callbacks") Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/1712068830-4513-2-git-send-email-ssengar@linux.microsoft.com
2024-03-23mshyperv: Introduce hv_numa_node_to_pxm_info()Nuno Das Neves1-18/+4
Factor out logic for converting numa node to hv_proximity_domain_info into a helper function. Change hv_proximity_domain_info to a struct to improve readability. While at it, rename hv_add_logical_processor_* structs to the correct hv_input_/hv_output_ prefix, and remove the flags field which is not present in the ABI. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/1711141826-9458-1-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1711141826-9458-1-git-send-email-nunodasneves@linux.microsoft.com>
2024-03-23x86/hyperv: Cosmetic changes for hv_apic.cErni Sri Satya Vennela1-8/+8
Fix issues reported by checkpatch.pl script for hv_apic.c file - Alignment should match open parenthesis - Remove unnecessary parenthesis No functional changes intended. Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Link: https://lore.kernel.org/r/1711009325-21894-1-git-send-email-ernis@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1711009325-21894-1-git-send-email-ernis@linux.microsoft.com>
2024-03-21Merge tag 'hyperv-next-signed-20240320' of ↵Linus Torvalds3-9/+21
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Use Hyper-V entropy to seed guest random number generator (Michael Kelley) - Convert to platform remove callback returning void for vmbus (Uwe Kleine-König) - Introduce hv_get_hypervisor_version function (Nuno Das Neves) - Rename some HV_REGISTER_* defines for consistency (Nuno Das Neves) - Change prefix of generic HV_REGISTER_* MSRs to HV_MSR_* (Nuno Das Neves) - Cosmetic changes for hv_spinlock.c (Purna Pavan Chandra Aekkaladevi) - Use per cpu initial stack for vtl context (Saurabh Sengar) * tag 'hyperv-next-signed-20240320' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Use Hyper-V entropy to seed guest random number generator x86/hyperv: Cosmetic changes for hv_spinlock.c hyperv-tlfs: Rename some HV_REGISTER_* defines for consistency hv: vmbus: Convert to platform remove callback returning void mshyperv: Introduce hv_get_hypervisor_version function x86/hyperv: Use per cpu initial stack for vtl context hyperv-tlfs: Change prefix of generic HV_REGISTER_* MSRs to HV_MSR_*
2024-03-19x86/hyperv: Cosmetic changes for hv_spinlock.cPurna Pavan Chandra Aekkaladevi1-1/+2
Fix issues reported by checkpatch.pl script for hv_spinlock.c file. - Place __initdata after variable name - Add missing blank line after enum declaration No functional changes intended. Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Link: https://lore.kernel.org/r/1710763751-14137-1-git-send-email-paekkaladevi@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1710763751-14137-1-git-send-email-paekkaladevi@linux.microsoft.com>
2024-03-12Merge tag 'x86-apic-2024-03-10' of ↵Linus Torvalds1-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 APIC updates from Thomas Gleixner: "Rework of APIC enumeration and topology evaluation. The current implementation has a couple of shortcomings: - It fails to handle hybrid systems correctly. - The APIC registration code which handles CPU number assignents is in the middle of the APIC code and detached from the topology evaluation. - The various mechanisms which enumerate APICs, ACPI, MPPARSE and guest specific ones, tweak global variables as they see fit or in case of XENPV just hack around the generic mechanisms completely. - The CPUID topology evaluation code is sprinkled all over the vendor code and reevaluates global variables on every hotplug operation. - There is no way to analyze topology on the boot CPU before bringing up the APs. This causes problems for infrastructure like PERF which needs to size certain aspects upfront or could be simplified if that would be possible. - The APIC admission and CPU number association logic is incomprehensible and overly complex and needs to be kept around after boot instead of completing this right after the APIC enumeration. This update addresses these shortcomings with the following changes: - Rework the CPUID evaluation code so it is common for all vendors and provides information about the APIC ID segments in a uniform way independent of the number of segments (Thread, Core, Module, ..., Die, Package) so that this information can be computed instead of rewriting global variables of dubious value over and over. - A few cleanups and simplifcations of the APIC, IO/APIC and related interfaces to prepare for the topology evaluation changes. - Seperation of the parser stages so the early evaluation which tries to find the APIC address can be seperately overridden from the late evaluation which enumerates and registers the local APIC as further preparation for sanitizing the topology evaluation. - A new registration and admission logic which - encapsulates the inner workings so that parsers and guest logic cannot longer fiddle in it - uses the APIC ID segments to build topology bitmaps at registration time - provides a sane admission logic - allows to detect the crash kernel case, where CPU0 does not run on the real BSP, automatically. This is required to prevent sending INIT/SIPI sequences to the real BSP which would reset the whole machine. This was so far handled by a tedious command line parameter, which does not even work in nested crash scenarios. - Associates CPU number after the enumeration completed and prevents the late registration of APICs, which was somehow tolerated before. - Converting all parsers and guest enumeration mechanisms over to the new interfaces. This allows to get rid of all global variable tweaking from the parsers and enumeration mechanisms and sanitizes the XEN[PV] handling so it can use CPUID evaluation for the first time. - Mopping up existing sins by taking the information from the APIC ID segment bitmaps. This evaluates hybrid systems correctly on the boot CPU and allows for cleanups and fixes in the related drivers, e.g. PERF. The series has been extensively tested and the minimal late fallout due to a broken ACPI/MADT table has been addressed by tightening the admission logic further" * tag 'x86-apic-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (76 commits) x86/topology: Ignore non-present APIC IDs in a present package x86/apic: Build the x86 topology enumeration functions on UP APIC builds too smp: Provide 'setup_max_cpus' definition on UP too smp: Avoid 'setup_max_cpus' namespace collision/shadowing x86/bugs: Use fixed addressing for VERW operand x86/cpu/topology: Get rid of cpuinfo::x86_max_cores x86/cpu/topology: Provide __num_[cores|threads]_per_package x86/cpu/topology: Rename topology_max_die_per_package() x86/cpu/topology: Rename smp_num_siblings x86/cpu/topology: Retrieve cores per package from topology bitmaps x86/cpu/topology: Use topology logical mapping mechanism x86/cpu/topology: Provide logical pkg/die mapping x86/cpu/topology: Simplify cpu_mark_primary_thread() x86/cpu/topology: Mop up primary thread mask handling x86/cpu/topology: Use topology bitmaps for sizing x86/cpu/topology: Let XEN/PV use topology from CPUID/MADT x86/xen/smp_pv: Count number of vCPUs early x86/cpu/topology: Assign hotpluggable CPUIDs during init x86/cpu/topology: Reject unknown APIC IDs on ACPI hotplug x86/topology: Add a mechanism to track topology via APIC IDs ...
2024-03-09x86/hyperv: Use per cpu initial stack for vtl contextSaurabh Sengar1-4/+15
Currently, the secondary CPUs in Hyper-V VTL context lack support for parallel startup. Therefore, relying on the single initial_stack fetched from the current task structure suffices for all vCPUs. However, common initial_stack risks stack corruption when parallel startup is enabled. In order to facilitate parallel startup, use the initial_stack from the per CPU idle thread instead of the current task. Fixes: 3be1bc2fe9d2 ("x86/hyperv: VTL support for Hyper-V") Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1709452896-13342-1-git-send-email-ssengar@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1709452896-13342-1-git-send-email-ssengar@linux.microsoft.com>
2024-03-04hyperv-tlfs: Change prefix of generic HV_REGISTER_* MSRs to HV_MSR_*Nuno Das Neves1-4/+4
The HV_REGISTER_ are used as arguments to hv_set/get_register(), which delegate to arch-specific mechanisms for getting/setting synthetic Hyper-V MSRs. On arm64, HV_REGISTER_ defines are synthetic VP registers accessed via the get/set vp registers hypercalls. The naming matches the TLFS document, although these register names are not specific to arm64. However, on x86 the prefix HV_REGISTER_ indicates Hyper-V MSRs accessed via rdmsrl()/wrmsrl(). This is not consistent with the TLFS doc, where HV_REGISTER_ is *only* used for used for VP register names used by the get/set register hypercalls. To fix this inconsistency and prevent future confusion, change the arch-generic aliases used by callers of hv_set/get_register() to have the prefix HV_MSR_ instead of HV_REGISTER_. Use the prefix HV_X64_MSR_ for the x86-only Hyper-V MSRs. On x86, the generic HV_MSR_'s point to the corresponding HV_X64_MSR_. Move the arm64 HV_REGISTER_* defines to the asm-generic hyperv-tlfs.h, since these are not specific to arm64. On arm64, the generic HV_MSR_'s point to the corresponding HV_REGISTER_. While at it, rename hv_get/set_registers() and related functions to hv_get/set_msr(), hv_get/set_nested_msr(), etc. These are only used for Hyper-V MSRs and this naming makes that clear. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Wei Liu <wei.liu@kernel.org> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1708440933-27125-1-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1708440933-27125-1-git-send-email-nunodasneves@linux.microsoft.com>
2024-03-01x86/hyperv: Allow 15-bit APIC IDs for VTL platformsSaurabh Sengar1-0/+7
The current method for signaling the compatibility of a Hyper-V host with MSIs featuring 15-bit APIC IDs relies on a synthetic cpuid leaf. However, for higher VTLs, this leaf is not reported, due to the absence of an IO-APIC. As an alternative, assume that when running at a high VTL, the host supports 15-bit APIC IDs. This assumption is safe, as Hyper-V does not employ any architectural MSIs at higher VTLs This unblocks startup of VTL2 environments with more than 256 CPUs. Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1705341460-18394-1-git-send-email-ssengar@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1705341460-18394-1-git-send-email-ssengar@linux.microsoft.com>
2024-03-01x86/hyperv: Make encrypted/decrypted changes safe for load_unaligned_zeropad()Michael Kelley1-4/+49
In a CoCo VM, when transitioning memory from encrypted to decrypted, or vice versa, the caller of set_memory_encrypted() or set_memory_decrypted() is responsible for ensuring the memory isn't in use and isn't referenced while the transition is in progress. The transition has multiple steps, and the memory is in an inconsistent state until all steps are complete. A reference while the state is inconsistent could result in an exception that can't be cleanly fixed up. However, the kernel load_unaligned_zeropad() mechanism could cause a stray reference that can't be prevented by the caller of set_memory_encrypted() or set_memory_decrypted(), so there's specific code to handle this case. But a CoCo VM running on Hyper-V may be configured to run with a paravisor, with the #VC or #VE exception routed to the paravisor. There's no architectural way to forward the exceptions back to the guest kernel, and in such a case, the load_unaligned_zeropad() specific code doesn't work. To avoid this problem, mark pages as "not present" while a transition is in progress. If load_unaligned_zeropad() causes a stray reference, a normal page fault is generated instead of #VC or #VE, and the page-fault-based fixup handlers for load_unaligned_zeropad() resolve the reference. When the encrypted/decrypted transition is complete, mark the pages as "present" again. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://lore.kernel.org/r/20240116022008.1023398-4-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240116022008.1023398-4-mhklinux@outlook.com>
2024-03-01x86/hyperv: Use slow_virt_to_phys() in page transition hypervisor callbackMichael Kelley1-1/+11
In preparation for temporarily marking pages not present during a transition between encrypted and decrypted, use slow_virt_to_phys() in the hypervisor callback. As long as the PFN is correct, slow_virt_to_phys() works even if the leaf PTE is not present. The existing functions that depend on vmalloc_to_page() all require that the leaf PTE be marked present, so they don't work. Update the comments for slow_virt_to_phys() to note this broader usage and the requirement to work even if the PTE is not marked present. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Link: https://lore.kernel.org/r/20240116022008.1023398-2-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240116022008.1023398-2-mhklinux@outlook.com>
2024-02-16x86/mpparse: Switch to new init callbacksThomas Gleixner1-1/+0
Now that all platforms have the new split SMP configuration callbacks set up, flip the switch and remove the old callback pointer and mop up the platform code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Link: https://lore.kernel.org/r/20240212154639.870883080@linutronix.de
2024-02-16x86/hyperv/vtl: Prepare for separate mpparse callbacksThomas Gleixner1-1/+3
Initialize the new callbacks in preparation for switching the core code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Link: https://lore.kernel.org/r/20240212154639.808238769@linutronix.de
2024-01-09Merge tag 'x86-cleanups-2024-01-08' of ↵Linus Torvalds3-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: - Change global variables to local - Add missing kernel-doc function parameter descriptions - Remove unused parameter from a macro - Remove obsolete Kconfig entry - Fix comments - Fix typos, mostly scripted, manually reviewed and a micro-optimization got misplaced as a cleanup: - Micro-optimize the asm code in secondary_startup_64_no_verify() * tag 'x86-cleanups-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arch/x86: Fix typos x86/head_64: Use TESTB instead of TESTL in secondary_startup_64_no_verify() x86/docs: Remove reference to syscall trampoline in PTI x86/Kconfig: Remove obsolete config X86_32_SMP x86/io: Remove the unused 'bw' parameter from the BUILDIO() macro x86/mtrr: Document missing function parameters in kernel-doc x86/setup: Make relocated_ramdisk a local variable of relocate_initrd()
2024-01-03arch/x86: Fix typosBjorn Helgaas3-3/+3
Fix typos, most reported by "codespell arch/x86". Only touches comments, no code changes. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20240103004011.1758650-1-helgaas@kernel.org
2023-11-22Merge tag 'hyperv-fixes-signed-20231121' of ↵Linus Torvalds1-4/+21
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - One fix for the KVP daemon (Ani Sinha) - Fix for the detection of E820_TYPE_PRAM in a Gen2 VM (Saurabh Sengar) - Micro-optimization for hv_nmi_unknown() (Uros Bizjak) * tag 'hyperv-fixes-signed-20231121' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Use atomic_try_cmpxchg() to micro-optimize hv_nmi_unknown() x86/hyperv: Fix the detection of E820_TYPE_PRAM in a Gen2 VM hv/hv_kvp_daemon: Some small fixes for handling NM keyfiles
2023-11-13x86/hyperv: Fix the detection of E820_TYPE_PRAM in a Gen2 VMSaurabh Sengar1-4/+21
A Gen2 VM doesn't support legacy PCI/PCIe, so both raw_pci_ops and raw_pci_ext_ops are NULL, and pci_subsys_init() -> pcibios_init() doesn't call pcibios_resource_survey() -> e820__reserve_resources_late(); as a result, any emulated persistent memory of E820_TYPE_PRAM (12) via the kernel parameter memmap=nn[KMG]!ss is not added into iomem_resource and hence can't be detected by register_e820_pmem(). Fix this by directly calling e820__reserve_resources_late() in hv_pci_init(), which is called from arch_initcall(pci_arch_init). It's ok to move a Gen2 VM's e820__reserve_resources_late() from subsys_initcall(pci_subsys_init) to arch_initcall(pci_arch_init) because the code in-between doesn't depend on the E820 resources. e820__reserve_resources_late() depends on e820__reserve_resources(), which has been called earlier from setup_arch(). For a Gen-2 VM, the new hv_pci_init() also adds any memory of E820_TYPE_PMEM (7) into iomem_resource, and acpi_nfit_register_region() -> acpi_nfit_insert_resource() -> region_intersects() returns REGION_INTERSECTS, so the memory of E820_TYPE_PMEM won't get added twice. Changed the local variable "int gen2vm" to "bool gen2vm". Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1699691867-9827-1-git-send-email-ssengar@linux.microsoft.com>
2023-11-01Merge tag 'x86_tdx_for_6.7' of ↵Linus Torvalds1-5/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 TDX updates from Dave Hansen: "The majority of this is a rework of the assembly and C wrappers that are used to talk to the TDX module and VMM. This is a nice cleanup in general but is also clearing the way for using this code when Linux is the TDX VMM. There are also some tidbits to make TDX guests play nicer with Hyper-V and to take advantage the hardware TSC. Summary: - Refactor and clean up TDX hypercall/module call infrastructure - Handle retrying/resuming page conversion hypercalls - Make sure to use the (shockingly) reliable TSC in TDX guests" [ TLA reminder: TDX is "Trust Domain Extensions", Intel's guest VM confidentiality technology ] * tag 'x86_tdx_for_6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tdx: Mark TSC reliable x86/tdx: Fix __noreturn build warning around __tdx_hypercall_failed() x86/virt/tdx: Make TDX_MODULE_CALL handle SEAMCALL #UD and #GP x86/virt/tdx: Wire up basic SEAMCALL functions x86/tdx: Remove 'struct tdx_hypercall_args' x86/tdx: Reimplement __tdx_hypercall() using TDX_MODULE_CALL asm x86/tdx: Make TDX_HYPERCALL asm similar to TDX_MODULE_CALL x86/tdx: Extend TDX_MODULE_CALL to support more TDCALL/SEAMCALL leafs x86/tdx: Pass TDCALL/SEAMCALL input/output registers via a structure x86/tdx: Rename __tdx_module_call() to __tdcall() x86/tdx: Make macros of TDCALLs consistent with the spec x86/tdx: Skip saving output regs when SEAMCALL fails with VMFailInvalid x86/tdx: Zero out the missing RSI in TDX_HYPERCALL macro x86/tdx: Retry partially-completed page conversion hypercalls
2023-10-31Merge tag 'x86-core-2023-10-29-v2' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core updates from Thomas Gleixner: - Limit the hardcoded topology quirk for Hygon CPUs to those which have a model ID less than 4. The newer models have the topology CPUID leaf 0xB correctly implemented and are not affected. - Make SMT control more robust against enumeration failures SMT control was added to allow controlling SMT at boottime or runtime. The primary purpose was to provide a simple mechanism to disable SMT in the light of speculation attack vectors. It turned out that the code is sensible to enumeration failures and worked only by chance for XEN/PV. XEN/PV has no real APIC enumeration which means the primary thread mask is not set up correctly. By chance a XEN/PV boot ends up with smp_num_siblings == 2, which makes the hotplug control stay at its default value "enabled". So the mask is never evaluated. The ongoing rework of the topology evaluation caused XEN/PV to end up with smp_num_siblings == 1, which sets the SMT control to "not supported" and the empty primary thread mask causes the hotplug core to deny the bringup of the APS. Make the decision logic more robust and take 'not supported' and 'not implemented' into account for the decision whether a CPU should be booted or not. - Fake primary thread mask for XEN/PV Pretend that all XEN/PV vCPUs are primary threads, which makes the usage of the primary thread mask valid on XEN/PV. That is consistent with because all of the topology information on XEN/PV is fake or even non-existent. - Encapsulate topology information in cpuinfo_x86 Move the randomly scattered topology data into a separate data structure for readability and as a preparatory step for the topology evaluation overhaul. - Consolidate APIC ID data type to u32 It's fixed width hardware data and not randomly u16, int, unsigned long or whatever developers decided to use. - Cure the abuse of cpuinfo for persisting logical IDs. Per CPU cpuinfo is used to persist the logical package and die IDs. That's really not the right place simply because cpuinfo is subject to be reinitialized when a CPU goes through an offline/online cycle. Use separate per CPU data for the persisting to enable the further topology management rework. It will be removed once the new topology management is in place. - Provide a debug interface for inspecting topology information Useful in general and extremly helpful for validating the topology management rework in terms of correctness or "bug" compatibility. * tag 'x86-core-2023-10-29-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/apic, x86/hyperv: Use u32 in hv_snp_boot_ap() too x86/cpu: Provide debug interface x86/cpu/topology: Cure the abuse of cpuinfo for persisting logical ids x86/apic: Use u32 for wakeup_secondary_cpu[_64]() x86/apic: Use u32 for [gs]et_apic_id() x86/apic: Use u32 for phys_pkg_id() x86/apic: Use u32 for cpu_present_to_apicid() x86/apic: Use u32 for check_apicid_used() x86/apic: Use u32 for APIC IDs in global data x86/apic: Use BAD_APICID consistently x86/cpu: Move cpu_l[l2]c_id into topology info x86/cpu: Move logical package and die IDs into topology info x86/cpu: Remove pointless evaluation of x86_coreid_bits x86/cpu: Move cu_id into topology info x86/cpu: Move cpu_core_id into topology info hwmon: (fam15h_power) Use topology_core_id() scsi: lpfc: Use topology_core_id() x86/cpu: Move cpu_die_id into topology info x86/cpu: Move phys_proc_id into topology info x86/cpu: Encapsulate topology information in cpuinfo_x86 ...
2023-10-13x86/apic, x86/hyperv: Use u32 in hv_snp_boot_ap() tooIngo Molnar1-1/+1
The data type for APIC IDs was standardized to 'u32' in the following recent commit: db4a4086a223 ("x86/apic: Use u32 for wakeup_secondary_cpu[_64]()") Which changed the function arguments type signature of the apic->wakeup_secondary_cpu() APIC driver function. Propagate this to hv_snp_boot_ap() as well, which also addresses a 'assignment from incompatible pointer type' build warning that triggers under the -Werror=incompatible-pointer-types GCC warning. Fixes: db4a4086a223 ("x86/apic: Use u32 for wakeup_secondary_cpu[_64]()") Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230814085113.233274223@linutronix.de