summaryrefslogtreecommitdiff
path: root/arch/x86/include
AgeCommit message (Collapse)AuthorFilesLines
2022-01-23Merge tag 'bitmap-5.17-rc1' of git://github.com/norov/linuxLinus Torvalds1-2/+0
Pull bitmap updates from Yury Norov: - introduce for_each_set_bitrange() - use find_first_*_bit() instead of find_next_*_bit() where possible - unify for_each_bit() macros * tag 'bitmap-5.17-rc1' of git://github.com/norov/linux: vsprintf: rework bitmap_list_string lib: bitmap: add performance test for bitmap_print_to_pagebuf bitmap: unify find_bit operations mm/percpu: micro-optimize pcpu_is_populated() Replace for_each_*_bit_from() with for_each_*_bit() where appropriate find: micro-optimize for_each_{set,clear}_bit() include/linux: move for_each_bit() macros from bitops.h to find.h cpumask: replace cpumask_next_* with cpumask_first_* where appropriate tools: sync tools/bitmap with mother linux all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate cpumask: use find_first_and_bit() lib: add find_first_and_bit() arch: remove GENERIC_FIND_FIRST_BIT entirely include: move find.h from asm_generic to linux bitops: move find_bit_*_le functions from le.h to find.h bitops: protect find_first_{,zero}_bit properly
2022-01-22Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2-14/+2
Pull more kvm updates from Paolo Bonzini: "Generic: - selftest compilation fix for non-x86 - KVM: avoid warning on s390 in mark_page_dirty x86: - fix page write-protection bug and improve comments - use binary search to lookup the PMU event filter, add test - enable_pmu module parameter support for Intel CPUs - switch blocked_vcpu_on_cpu_lock to raw spinlock - cleanups of blocked vCPU logic - partially allow KVM_SET_CPUID{,2} after KVM_RUN (5.16 regression) - various small fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (46 commits) docs: kvm: fix WARNINGs from api.rst selftests: kvm/x86: Fix the warning in lib/x86_64/processor.c selftests: kvm/x86: Fix the warning in pmu_event_filter_test.c kvm: selftests: Do not indent with spaces kvm: selftests: sync uapi/linux/kvm.h with Linux header selftests: kvm: add amx_test to .gitignore KVM: SVM: Nullify vcpu_(un)blocking() hooks if AVIC is disabled KVM: SVM: Move svm_hardware_setup() and its helpers below svm_x86_ops KVM: SVM: Drop AVIC's intermediate avic_set_running() helper KVM: VMX: Don't do full kick when handling posted interrupt wakeup KVM: VMX: Fold fallback path into triggering posted IRQ helper KVM: VMX: Pass desired vector instead of bool for triggering posted IRQ KVM: VMX: Don't do full kick when triggering posted interrupt "fails" KVM: SVM: Skip AVIC and IRTE updates when loading blocking vCPU KVM: SVM: Use kvm_vcpu_is_blocking() in AVIC load to handle preemption KVM: SVM: Remove unnecessary APICv/AVIC update in vCPU unblocking path KVM: SVM: Don't bother checking for "running" AVIC when kicking for IPIs KVM: SVM: Signal AVIC doorbell iff vCPU is in guest mode KVM: x86: Remove defunct pre_block/post_block kvm_x86_ops hooks KVM: x86: Unexport LAPIC's switch_to_{hv,sw}_timer() helpers ...
2022-01-19KVM: x86: Remove defunct pre_block/post_block kvm_x86_ops hooksSean Christopherson2-14/+0
Drop kvm_x86_ops' pre/post_block() now that all implementations are nops. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20211208015236.1616697-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-19Merge branch 'kvm-pi-raw-spinlock' into HEADPaolo Bonzini2-4/+2
Bring in fix for VT-d posted interrupts before further changing the code in 5.17. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-19KVM: VMX: Reject KVM_RUN if emulation is required with pending exceptionSean Christopherson2-0/+2
Reject KVM_RUN if emulation is required (because VMX is running without unrestricted guest) and an exception is pending, as KVM doesn't support emulating exceptions except when emulating real mode via vm86. The vCPU is hosed either way, but letting KVM_RUN proceed triggers a WARN due to the impossible condition. Alternatively, the WARN could be removed, but then userspace and/or KVM bugs would result in the vCPU silently running in a bad state, which isn't very friendly to users. Originally, the bug was hit by syzkaller with a nested guest as that doesn't require kvm_intel.unrestricted_guest=0. That particular flavor is likely fixed by commit cd0e615c49e5 ("KVM: nVMX: Synthesize TRIPLE_FAULT for L2 if emulation is required"), but it's trivial to trigger the WARN with a non-nested guest, and userspace can likely force bad state via ioctls() for a nested guest as well. Checking for the impossible condition needs to be deferred until KVM_RUN because KVM can't force specific ordering between ioctls. E.g. clearing exception.pending in KVM_SET_SREGS doesn't prevent userspace from setting it in KVM_SET_VCPU_EVENTS, and disallowing KVM_SET_VCPU_EVENTS with emulation_required would prevent userspace from queuing an exception and then stuffing sregs. Note, if KVM were to try and detect/prevent the condition prior to KVM_RUN, handle_invalid_guest_state() and/or handle_emulation_failure() would need to be modified to clear the pending exception prior to exiting to userspace. ------------[ cut here ]------------ WARNING: CPU: 6 PID: 137812 at arch/x86/kvm/vmx/vmx.c:1623 vmx_queue_exception+0x14f/0x160 [kvm_intel] CPU: 6 PID: 137812 Comm: vmx_invalid_nes Not tainted 5.15.2-7cc36c3e14ae-pop #279 Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014 RIP: 0010:vmx_queue_exception+0x14f/0x160 [kvm_intel] Code: <0f> 0b e9 fd fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 RSP: 0018:ffffa45c83577d38 EFLAGS: 00010202 RAX: 0000000000000003 RBX: 0000000080000006 RCX: 0000000000000006 RDX: 0000000000000000 RSI: 0000000000010002 RDI: ffff9916af734000 RBP: ffff9916af734000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000006 R13: 0000000000000000 R14: ffff9916af734038 R15: 0000000000000000 FS: 00007f1e1a47c740(0000) GS:ffff99188fb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1e1a6a8008 CR3: 000000026f83b005 CR4: 00000000001726e0 Call Trace: kvm_arch_vcpu_ioctl_run+0x13a2/0x1f20 [kvm] kvm_vcpu_ioctl+0x279/0x690 [kvm] __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae Reported-by: syzbot+82112403ace4cbd780d8@syzkaller.appspotmail.com Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211228232437.1875318-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds8-44/+115
Pull kvm updates from Paolo Bonzini: "RISCV: - Use common KVM implementation of MMU memory caches - SBI v0.2 support for Guest - Initial KVM selftests support - Fix to avoid spurious virtual interrupts after clearing hideleg CSR - Update email address for Anup and Atish ARM: - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update s390: - fix sigp sense/start/stop/inconsistency - cleanups x86: - Clean up some function prototypes more - improved gfn_to_pfn_cache with proper invalidation, used by Xen emulation - add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery - completely remove potential TOC/TOU races in nested SVM consistency checks - update some PMCs on emulated instructions - Intel AMX support (joint work between Thomas and Intel) - large MMU cleanups - module parameter to disable PMU virtualization - cleanup register cache - first part of halt handling cleanups - Hyper-V enlightened MSR bitmap support for nested hypervisors Generic: - clean up Makefiles - introduce CONFIG_HAVE_KVM_DIRTY_RING - optimize memslot lookup using a tree - optimize vCPU array usage by converting to xarray" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (268 commits) x86/fpu: Fix inline prefix warnings selftest: kvm: Add amx selftest selftest: kvm: Move struct kvm_x86_state to header selftest: kvm: Reorder vcpu_load_state steps for AMX kvm: x86: Disable interception for IA32_XFD on demand x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state() kvm: selftests: Add support for KVM_CAP_XSAVE2 kvm: x86: Add support for getting/setting expanded xstate buffer x86/fpu: Add uabi_size to guest_fpu kvm: x86: Add CPUID support for Intel AMX kvm: x86: Add XCR0 support for Intel AMX kvm: x86: Disable RDMSR interception of IA32_XFD_ERR kvm: x86: Emulate IA32_XFD_ERR for guest kvm: x86: Intercept #NM for saving IA32_XFD_ERR x86/fpu: Prepare xfd_err in struct fpu_guest kvm: x86: Add emulation for IA32_XFD x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulation kvm: x86: Enable dynamic xfeatures at KVM_SET_CPUID2 x86/fpu: Provide fpu_enable_guest_xfd_features() for KVM x86/fpu: Add guest support to xfd_enable_feature() ...
2022-01-16Merge tag 'hyperv-next-signed-20220114' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - More patches for Hyper-V isolation VM support (Tianyu Lan) - Bug fixes and clean-up patches from various people * tag 'hyperv-next-signed-20220114' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: scsi: storvsc: Fix storvsc_queuecommand() memory leak x86/hyperv: Properly deal with empty cpumasks in hyperv_flush_tlb_multi() Drivers: hv: vmbus: Initialize request offers message for Isolation VM scsi: storvsc: Fix unsigned comparison to zero swiotlb: Add CONFIG_HAS_IOMEM check around swiotlb_mem_remap() x86/hyperv: Fix definition of hv_ghcb_pg variable Drivers: hv: Fix definition of hypercall input & output arg variables net: netvsc: Add Isolation VM support for netvsc driver scsi: storvsc: Add Isolation VM support for storvsc driver hyper-v: Enable swiotlb bounce buffer for Isolation VM x86/hyper-v: Add hyperv Isolation VM check in the cc_platform_has() swiotlb: Add swiotlb bounce buffer remap function for HV IVM
2022-01-16Merge tag 'pci-v5.17-changes' of ↵Linus Torvalds2-7/+33
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Use pci_find_vsec_capability() instead of open-coding it (Andy Shevchenko) - Convert pci_dev_present() stub from macro to static inline to avoid 'unused variable' errors (Hans de Goede) - Convert sysfs slot attributes from default_attrs to default_groups (Greg Kroah-Hartman) - Use DWORD accesses for LTR, L1 SS to avoid BayHub OZ711LV2 erratum (Rajat Jain) - Remove unnecessary initialization of static variables (Longji Guo) Resource management: - Always write Intel I210 ROM BAR on update to work around device defect (Bjorn Helgaas) PCIe native device hotplug: - Fix pciehp lockdep errors on Thunderbolt undock (Hans de Goede) - Fix infinite loop in pciehp IRQ handler on power fault (Lukas Wunner) Power management: - Convert amd64-agp, sis-agp, via-agp from legacy PCI power management to generic power management (Vaibhav Gupta) IOMMU: - Add function 1 DMA alias quirk for Marvell 88SE9125 SATA controller so it can work with an IOMMU (Yifeng Li) Error handling: - Add PCI_ERROR_RESPONSE and related definitions for signaling and checking for transaction errors on PCI (Naveen Naidu) - Fabricate PCI_ERROR_RESPONSE data (~0) in config read wrappers, instead of in host controller drivers, when transactions fail on PCI (Naveen Naidu) - Use PCI_POSSIBLE_ERROR() to check for possible failure of config reads (Naveen Naidu) Peer-to-peer DMA: - Add Logan Gunthorpe as P2PDMA maintainer (Bjorn Helgaas) ASPM: - Calculate link L0s and L1 exit latencies when needed instead of caching them (Saheed O. Bolarinwa) - Calculate device L0s and L1 acceptable exit latencies when needed instead of caching them (Saheed O. Bolarinwa) - Remove struct aspm_latency since it's no longer needed (Saheed O. Bolarinwa) APM X-Gene PCIe controller driver: - Fix IB window setup, which was broken by the fact that IB resources are now sorted in address order instead of DT dma-ranges order (Rob Herring) Apple PCIe controller driver: - Enable clock gating to save power (Hector Martin) - Fix REFCLK1 enable/poll logic (Hector Martin) Broadcom STB PCIe controller driver: - Declare bitmap correctly for use by bitmap interfaces (Christophe JAILLET) - Clean up computation of legacy and non-legacy MSI bitmasks (Florian Fainelli) - Update suspend/resume/remove error handling to warn about errors and not fail the operation (Jim Quinlan) - Correct the "pcie" and "msi" interrupt descriptions in DT binding (Jim Quinlan) - Add DT bindings for endpoint voltage regulators (Jim Quinlan) - Split brcm_pcie_setup() into two functions (Jim Quinlan) - Add mechanism for turning on voltage regulators for connected devices (Jim Quinlan) - Turn voltage regulators for connected devices on/off when bus is added or removed (Jim Quinlan) - When suspending, don't turn off voltage regulators for wakeup devices (Jim Quinlan) Freescale i.MX6 PCIe controller driver: - Add i.MX8MM support (Richard Zhu) Freescale Layerscape PCIe controller driver: - Use DWC common ops instead of layerscape-specific link-up functions (Hou Zhiqiang) Intel VMD host bridge driver: - Honor platform ACPI _OSC feature negotiation for Root Ports below VMD (Kai-Heng Feng) - Add support for Raptor Lake SKUs (Karthik L Gopalakrishnan) - Reset everything below VMD before enumerating to work around failure to enumerate NVMe devices when guest OS reboots (Nirmal Patel) Bridge emulation (used by Marvell Aardvark and MVEBU): - Make emulated ROM BAR read-only by default (Pali Rohár) - Make some emulated legacy PCI bits read-only for PCIe devices (Pali Rohár) - Update reserved bits in emulated PCIe Capability (Pali Rohár) - Allow drivers to emulate different PCIe Capability versions (Pali Rohár) - Set emulated Capabilities List bit for all PCIe devices, since they must have at least a PCIe Capability (Pali Rohár) Marvell Aardvark PCIe controller driver: - Add bridge emulation definitions for PCIe DEVCAP2, DEVCTL2, DEVSTA2, LNKCAP2, LNKCTL2, LNKSTA2, SLTCAP2, SLTCTL2, SLTSTA2 (Pali Rohár) - Add aardvark support for DEVCAP2, DEVCTL2, LNKCAP2 and LNKCTL2 registers (Pali Rohár) - Clear all MSIs at setup to avoid spurious interrupts (Pali Rohár) - Disable bus mastering when unbinding host controller driver (Pali Rohár) - Mask all interrupts when unbinding host controller driver (Pali Rohár) - Fix memory leak in host controller unbind (Pali Rohár) - Assert PERST# when unbinding host controller driver (Pali Rohár) - Disable link training when unbinding host controller driver (Pali Rohár) - Disable common PHY when unbinding host controller driver (Pali Rohár) - Fix resource type checking to check only IORESOURCE_MEM, not IORESOURCE_MEM_64, which is a flavor of IORESOURCE_MEM (Pali Rohár) Marvell MVEBU PCIe controller driver: - Implement pci_remap_iospace() for ARM so mvebu can use devm_pci_remap_iospace() instead of the previous ARM-specific pci_ioremap_io() interface (Pali Rohár) - Use the standard pci_host_probe() instead of the device-specific mvebu_pci_host_probe() (Pali Rohár) - Replace all uses of ARM-specific pci_ioremap_io() with the ARM implementation of the standard pci_remap_iospace() interface and remove pci_ioremap_io() (Pali Rohár) - Skip initializing invalid Root Ports (Pali Rohár) - Check for errors from pci_bridge_emul_init() (Pali Rohár) - Ignore any bridges at non-zero function numbers (Pali Rohár) - Return ~0 data for invalid config read size (Pali Rohár) - Disallow mapping interrupts on emulated bridges (Pali Rohár) - Clear Root Port Memory & I/O Space Enable and Bus Master Enable at initialization (Pali Rohár) - Make type bits in Root Port I/O Base register read-only (Pali Rohár) - Disable Root Port windows when base/limit set to invalid values (Pali Rohár) - Set controller to Root Complex mode (Pali Rohár) - Set Root Port Class Code to PCI Bridge (Pali Rohár) - Update emulated Root Port secondary bus numbers to better reflect the actual topology (Pali Rohár) - Add PCI_BRIDGE_CTL_BUS_RESET support to emulated Root Ports so pci_reset_secondary_bus() can reset connected devices (Pali Rohár) - Add PCI_EXP_DEVCTL Error Reporting Enable support to emulated Root Ports (Pali Rohár) - Add PCI_EXP_RTSTA PME Status bit support to emulated Root Ports (Pali Rohár) - Add DEVCAP2, DEVCTL2 and LNKCTL2 support to emulated Root Ports on Armada XP and newer devices (Pali Rohár) - Export mvebu-mbus.c symbols to allow pci-mvebu.c to be a module (Pali Rohár) - Add support for compiling as a module (Pali Rohár) MediaTek PCIe controller driver: - Assert PERST# for 100ms to allow power and clock to stabilize (qizhong cheng) MediaTek PCIe Gen3 controller driver: - Disable Mediatek DVFSRC voltage request since lack of DVFSRC to respond to the request causes failure to exit L1 PM Substate (Jianjun Wang) MediaTek MT7621 PCIe controller driver: - Declare mt7621_pci_ops static (Sergio Paracuellos) - Give pcibios_root_bridge_prepare() access to host bridge windows (Sergio Paracuellos) - Move MIPS I/O coherency unit setup from driver to pcibios_root_bridge_prepare() (Sergio Paracuellos) - Add missing MODULE_LICENSE() (Sergio Paracuellos) - Allow COMPILE_TEST for all arches (Sergio Paracuellos) Microsoft Hyper-V host bridge driver: - Add hv-internal interfaces to encapsulate arch IRQ dependencies (Sunil Muthuswamy) - Add arm64 Hyper-V vPCI support (Sunil Muthuswamy) Qualcomm PCIe controller driver: - Undo PM setup in qcom_pcie_probe() error handling path (Christophe JAILLET) - Use __be16 type to store return value from cpu_to_be16() (Manivannan Sadhasivam) - Constify static dw_pcie_ep_ops (Rikard Falkeborn) Renesas R-Car PCIe controller driver: - Fix aarch32 abort handler so it doesn't check the wrong bus clock before accessing the host controller (Marek Vasut) TI Keystone PCIe controller driver: - Add register offset for ti,syscon-pcie-id and ti,syscon-pcie-mode DT properties (Kishon Vijay Abraham I) MicroSemi Switchtec management driver: - Add Gen4 automotive device IDs (Kelvin Cao) - Declare state_names[] as static so it's not allocated and initialized for every call (Kelvin Cao) Host controller driver cleanups: - Use of_device_get_match_data(), not of_match_device(), when we only need the device data in altera, artpec6, cadence, designware-plat, dra7xx, keystone, kirin (Fan Fei) - Drop pointless of_device_get_match_data() cast in j721e (Bjorn Helgaas) - Drop redundant struct device * from j721e since struct cdns_pcie already has one (Bjorn Helgaas) - Rename driver structs to *_pcie in intel-gw, iproc, ls-gen4, mediatek-gen3, microchip, mt7621, rcar-gen2, tegra194, uniphier, xgene, xilinx, xilinx-cpm for consistency across drivers (Fan Fei) - Fix invalid address space conversions in hisi, spear13xx (Bjorn Helgaas) Miscellaneous: - Sort Intel Device IDs by value (Andy Shevchenko) - Change Capability offsets to hex to match spec (Baruch Siach) - Correct misspellings (Krzysztof Wilczyński) - Terminate statement with semicolon in pci_endpoint_test.c (Ming Wang)" * tag 'pci-v5.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (151 commits) PCI: mt7621: Allow COMPILE_TEST for all arches PCI: mt7621: Add missing MODULE_LICENSE() PCI: mt7621: Move MIPS setup to pcibios_root_bridge_prepare() PCI: Let pcibios_root_bridge_prepare() access bridge->windows PCI: mt7621: Declare mt7621_pci_ops static PCI: brcmstb: Do not turn off WOL regulators on suspend PCI: brcmstb: Add control of subdevice voltage regulators PCI: brcmstb: Add mechanism to turn on subdev regulators PCI: brcmstb: Split brcm_pcie_setup() into two funcs dt-bindings: PCI: Add bindings for Brcmstb EP voltage regulators dt-bindings: PCI: Correct brcmstb interrupts, interrupt-map. PCI: brcmstb: Fix function return value handling PCI: brcmstb: Do not use __GENMASK PCI: brcmstb: Declare 'used' as bitmap, not unsigned long PCI: hv: Add arm64 Hyper-V vPCI support PCI: hv: Make the code arch neutral by adding arch specific interfaces PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors x86/PCI: Remove initialization of static variables to false PCI: Use DWORD accesses for LTR, L1 SS to avoid erratum misc: pci_endpoint_test: Terminate statement with semicolon ...
2022-01-15Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-3/+28
Merge misc updates from Andrew Morton: "146 patches. Subsystems affected by this patch series: kthread, ia64, scripts, ntfs, squashfs, ocfs2, vfs, and mm (slab-generic, slab, kmemleak, dax, kasan, debug, pagecache, gup, shmem, frontswap, memremap, memcg, selftests, pagemap, dma, vmalloc, memory-failure, hugetlb, userfaultfd, vmscan, mempolicy, oom-kill, hugetlbfs, migration, thp, ksm, page-poison, percpu, rmap, zswap, zram, cleanups, hmm, and damon)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (146 commits) mm/damon: hide kernel pointer from tracepoint event mm/damon/vaddr: hide kernel pointer from damon_va_three_regions() failure log mm/damon/vaddr: use pr_debug() for damon_va_three_regions() failure logging mm/damon/dbgfs: remove an unnecessary variable mm/damon: move the implementation of damon_insert_region to damon.h mm/damon: add access checking for hugetlb pages Docs/admin-guide/mm/damon/usage: update for schemes statistics mm/damon/dbgfs: support all DAMOS stats Docs/admin-guide/mm/damon/reclaim: document statistics parameters mm/damon/reclaim: provide reclamation statistics mm/damon/schemes: account how many times quota limit has exceeded mm/damon/schemes: account scheme actions that successfully applied mm/damon: remove a mistakenly added comment for a future feature Docs/admin-guide/mm/damon/usage: update for kdamond_pid and (mk|rm)_contexts Docs/admin-guide/mm/damon/usage: mention tracepoint at the beginning Docs/admin-guide/mm/damon/usage: remove redundant information Docs/admin-guide/mm/damon/usage: update for scheme quotas and watermarks mm/damon: convert macro functions to static inline functions mm/damon: modify damon_rand() macro to static inline function mm/damon: move damon_rand() definition into damon.h ...
2022-01-15include: move find.h from asm_generic to linuxYury Norov1-2/+0
find_bit API and bitmap API are closely related, but inclusion paths are different - include/asm-generic and include/linux, correspondingly. In the past it made a lot of troubles due to circular dependencies and/or undefined symbols. Fix this by moving find.h under include/linux. Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
2022-01-15x86: mm: add x86_64 support for page table checkPasha Tatashin1-2/+27
Add page table check hooks into routines that modify user page tables. Link: https://lkml.kernel.org/r/20211221154650.1047963-5-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Greg Thelen <gthelen@google.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Xu <weixugc@google.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15mm: move tlb_flush_pending inline helpers to mm_inline.hArnd Bergmann1-1/+1
linux/mm_types.h should only define structure definitions, to make it cheap to include elsewhere. The atomic_t helper function definitions are particularly large, so it's better to move the helpers using those into the existing linux/mm_inline.h and only include that where needed. As a follow-up, we may want to go through all the indirect includes in mm_types.h and reduce them as much as possible. Link: https://lkml.kernel.org/r/20211207125710.2503446-2-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Colin Cross <ccross@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Yu Zhao <yuzhao@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-14x86/fpu: Fix inline prefix warningsYang Zhong1-1/+1
Fix sparse warnings in xstate and remove inline prefix. Fixes: 980fe2fddcff ("x86/fpu: Extend fpu_xstate_prctl() with guest permissions") Signed-off-by: Yang Zhong <yang.zhong@intel.com> Reported-by: kernel test robot <lkp@intel.com> Message-Id: <20220113180825.322333-1-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-14kvm: x86: Disable interception for IA32_XFD on demandKevin Tian1-0/+1
Always intercepting IA32_XFD causes non-negligible overhead when this register is updated frequently in the guest. Disable r/w emulation after intercepting the first WRMSR(IA32_XFD) with a non-zero value. Disable WRMSR emulation implies that IA32_XFD becomes out-of-sync with the software states in fpstate and the per-cpu xfd cache. This leads to two additional changes accordingly: - Call fpu_sync_guest_vmexit_xfd_state() after vm-exit to bring software states back in-sync with the MSR, before handle_exit_irqoff() is called. - Always trap #NM once write interception is disabled for IA32_XFD. The #NM exception is rare if the guest doesn't use dynamic features. Otherwise, there is at most one exception per guest task given a dynamic feature. p.s. We have confirmed that SDM is being revised to say that when setting IA32_XFD[18] the AMX register state is not guaranteed to be preserved. This clarification avoids adding mess for a creative guest which sets IA32_XFD[18]=1 before saving active AMX state to its own storage. Signed-off-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-22-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-14x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state()Thomas Gleixner1-0/+2
KVM can disable the write emulation for the XFD MSR when the vCPU's fpstate is already correctly sized to reduce the overhead. When write emulation is disabled the XFD MSR state after a VMEXIT is unknown and therefore not in sync with the software states in fpstate and the per CPU XFD cache. Provide fpu_sync_guest_vmexit_xfd_state() which has to be invoked after a VMEXIT before enabling interrupts when write emulation is disabled for the XFD MSR. It could be invoked unconditionally even when write emulation is enabled for the price of a pointless MSR read. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-21-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-14kvm: x86: Add support for getting/setting expanded xstate bufferGuang Zeng1-1/+15
With KVM_CAP_XSAVE, userspace uses a hardcoded 4KB buffer to get/set xstate data from/to KVM. This doesn't work when dynamic xfeatures (e.g. AMX) are exposed to the guest as they require a larger buffer size. Introduce a new capability (KVM_CAP_XSAVE2). Userspace VMM gets the required xstate buffer size via KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2). KVM_SET_XSAVE is extended to work with both legacy and new capabilities by doing properly-sized memdup_user() based on the guest fpu container. KVM_GET_XSAVE is kept for backward-compatible reason. Instead, KVM_GET_XSAVE2 is introduced under KVM_CAP_XSAVE2 as the preferred interface for getting xstate buffer (4KB or larger size) from KVM (Link: https://lkml.org/lkml/2021/12/15/510) Also, update the api doc with the new KVM_GET_XSAVE2 ioctl. Signed-off-by: Guang Zeng <guang.zeng@intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-19-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-14x86/fpu: Add uabi_size to guest_fpuThomas Gleixner1-0/+5
Userspace needs to inquire KVM about the buffer size to work with the new KVM_SET_XSAVE and KVM_GET_XSAVE2. Add the size info to guest_fpu for KVM to access. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-18-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-14kvm: x86: Add CPUID support for Intel AMXJing Liu1-0/+2
Extend CPUID emulation to support XFD, AMX_TILE, AMX_INT8 and AMX_BF16. Adding those bits into kvm_cpu_caps finally activates all previous logics in this series. Hide XFD on 32bit host kernels. Otherwise it leads to a weird situation where KVM tells userspace to migrate MSR_IA32_XFD and then rejects attempts to read/write the MSR. Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-17-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-14x86/fpu: Prepare xfd_err in struct fpu_guestJing Liu1-0/+5
When XFD causes an instruction to generate #NM, IA32_XFD_ERR contains information about which disabled state components are being accessed. The #NM handler is expected to check this information and then enable the state components by clearing IA32_XFD for the faulting task (if having permission). If the XFD_ERR value generated in guest is consumed/clobbered by the host before the guest itself doing so, it may lead to non-XFD-related #NM treated as XFD #NM in host (due to non-zero value in XFD_ERR), or XFD-related #NM treated as non-XFD #NM in guest (XFD_ERR cleared by the host #NM handler). Introduce a new field in fpu_guest to save the guest xfd_err value. KVM is expected to save guest xfd_err before interrupt is enabled and restore it right before entering the guest (with interrupt disabled). Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-12-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-14x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulationKevin Tian1-0/+6
Guest XFD can be updated either in the emulation path or in the restore path. Provide a wrapper to update guest_fpu::fpstate::xfd. If the guest fpstate is currently in-use, also update the per-cpu xfd cache and the actual MSR. Signed-off-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-10-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-14x86/fpu: Provide fpu_enable_guest_xfd_features() for KVMSean Christopherson1-0/+1
Provide a wrapper for expanding the guest fpstate buffer according to requested xfeatures. KVM wants to call this wrapper to manage any dynamic xstate used by the guest. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220105123532.12586-8-yang.zhong@intel.com> [Remove unnecessary 32-bit check. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-13Merge tag 'irq-msi-2022-01-13' of ↵Linus Torvalds2-6/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull MSI irq updates from Thomas Gleixner: "Rework of the MSI interrupt infrastructure. This is a treewide cleanup and consolidation of MSI interrupt handling in preparation for further changes in this area which are necessary to: - address existing shortcomings in the VFIO area - support the upcoming Interrupt Message Store functionality which decouples the message store from the PCI config/MMIO space" * tag 'irq-msi-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (94 commits) genirq/msi: Populate sysfs entry only once PCI/MSI: Unbreak pci_irq_get_affinity() genirq/msi: Convert storage to xarray genirq/msi: Simplify sysfs handling genirq/msi: Add abuse prevention comment to msi header genirq/msi: Mop up old interfaces genirq/msi: Convert to new functions genirq/msi: Make interrupt allocation less convoluted platform-msi: Simplify platform device MSI code platform-msi: Let core code handle MSI descriptors bus: fsl-mc-msi: Simplify MSI descriptor handling soc: ti: ti_sci_inta_msi: Remove ti_sci_inta_msi_domain_free_irqs() soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation NTB/msi: Convert to msi_on_each_desc() PCI: hv: Rework MSI handling powerpc/mpic_u3msi: Use msi_for_each-desc() powerpc/fsl_msi: Use msi_for_each_desc() powerpc/pasemi/msi: Convert to msi_on_each_dec() powerpc/cell/axon_msi: Convert to msi_on_each_desc() powerpc/4xx/hsta: Rework MSI handling ...
2022-01-13Merge tag 'x86_core_for_v5.17_rc1' of ↵Linus Torvalds19-178/+214
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core updates from Borislav Petkov: - Get rid of all the .fixup sections because this generates misleading/wrong stacktraces and confuse RELIABLE_STACKTRACE and LIVEPATCH as the backtrace misses the function which is being fixed up. - Add Straight Line Speculation mitigation support which uses a new compiler switch -mharden-sls= which sticks an INT3 after a RET or an indirect branch in order to block speculation after them. Reportedly, CPUs do speculate behind such insns. - The usual set of cleanups and improvements * tag 'x86_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits) x86/entry_32: Fix segment exceptions objtool: Remove .fixup handling x86: Remove .fixup section x86/word-at-a-time: Remove .fixup usage x86/usercopy: Remove .fixup usage x86/usercopy_32: Simplify __copy_user_intel_nocache() x86/sgx: Remove .fixup usage x86/checksum_32: Remove .fixup usage x86/vmx: Remove .fixup usage x86/kvm: Remove .fixup usage x86/segment: Remove .fixup usage x86/fpu: Remove .fixup usage x86/xen: Remove .fixup usage x86/uaccess: Remove .fixup usage x86/futex: Remove .fixup usage x86/msr: Remove .fixup usage x86/extable: Extend extable functionality x86/entry_32: Remove .fixup usage x86/entry_64: Remove .fixup usage x86/copy_mc_64: Remove .fixup usage ...
2022-01-13Merge tag 'perf_core_for_v5.17_rc1' of ↵Linus Torvalds1-2/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Borislav Petkov: "Cleanup of the perf/kvm interaction." * tag 'perf_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Drop guest callback (un)register stubs KVM: arm64: Drop perf.c and fold its tiny bits of code into arm.c KVM: arm64: Hide kvm_arm_pmu_available behind CONFIG_HW_PERF_EVENTS=y KVM: arm64: Convert to the generic perf callbacks KVM: x86: Move Intel Processor Trace interrupt handler to vmx.c KVM: Move x86's perf guest info callbacks to generic KVM KVM: x86: More precisely identify NMI from guest when handling PMI KVM: x86: Drop current_vcpu for kvm_running_vcpu + kvm_arch_vcpu variable perf/core: Use static_call to optimize perf_guest_info_callbacks perf: Force architectures to opt-in to guest callbacks perf: Add wrappers for invoking guest callbacks perf/core: Rework guest callbacks to prepare for static_call support perf: Drop dead and useless guest "support" from arm, csky, nds32 and riscv perf: Stop pretending that perf can handle multiple guest callbacks KVM: x86: Register Processor Trace interrupt hook iff PT enabled in guest KVM: x86: Register perf callbacks after calling vendor's hardware_setup() perf: Protect perf_guest_cbs with RCU
2022-01-12x86/entry_32: Fix segment exceptionsPeter Zijlstra1-1/+10
The LKP robot reported that commit in Fixes: caused a failure. Turns out the ldt_gdt_32 selftest turns into an infinite loop trying to clear the segment. As discovered by Sean, what happens is that PARANOID_EXIT_TO_KERNEL_MODE in the handle_exception_return path overwrites the entry stack data with the task stack data, restoring the "bad" segment value. Instead of having the exception retry the instruction, have it emulate the full instruction. Replace EX_TYPE_POP_ZERO with EX_TYPE_POP_REG which will do the equivalent of: POP %reg; MOV $imm, %reg. In order to encode the segment registers, add them as registers 8-11 for 32-bit. By setting regs->[defg]s the (nested) RESTORE_REGS will pop this value at the end of the exception handler and by increasing regs->sp, it will have skipped the stack slot. This was debugged by Sean Christopherson <seanjc@google.com>. [ bp: Add EX_REG_GS too. ] Fixes: aa93e2ad7464 ("x86/entry_32: Remove .fixup usage") Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/Yd1l0gInc4zRcnt/@hirez.programming.kicks-ass.net
2022-01-12PCI: hv: Make the code arch neutral by adding arch specific interfacesSunil Muthuswamy2-7/+33
Encapsulate arch dependencies in Hyper-V vPCI through a set of arch-dependent interfaces. Adding these arch specific interfaces will allow for an implementation for other architectures, such as arm64. There are no functional changes expected from this patch. Link: https://lore.kernel.org/r/1641411156-31705-2-git-send-email-sunilmut@linux.microsoft.com Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
2022-01-12Merge tag 'efi-next-for-v5.17' of ↵Linus Torvalds1-4/+10
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - support taking the measurement of the initrd when loaded via the LoadFile2 protocol - kobject API cleanup from Greg - some header file whitespace fixes * tag 'efi-next-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: use default_groups in kobj_type efi/libstub: measure loaded initrd info into the TPM efi/libstub: consolidate initrd handling across architectures efi/libstub: x86/mixed: increase supported argument count efi/libstub: add prototype of efi_tcg2_protocol::hash_log_extend_event() include/linux/efi.h: Remove unneeded whitespaces before tabs
2022-01-11Merge tag 'kcsan.2022.01.09a' of ↵Linus Torvalds2-5/+6
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull KCSAN updates from Paul McKenney: "This provides KCSAN fixes and also the ability to take memory barriers into account for weakly-ordered systems. This last can increase the probability of detecting certain types of data races" * tag 'kcsan.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (29 commits) kcsan: Only test clear_bit_unlock_is_negative_byte if arch defines it kcsan: Avoid nested contexts reading inconsistent reorder_access kcsan: Turn barrier instrumentation into macros kcsan: Make barrier tests compatible with lockdep kcsan: Support WEAK_MEMORY with Clang where no objtool support exists compiler_attributes.h: Add __disable_sanitizer_instrumentation objtool, kcsan: Remove memory barrier instrumentation from noinstr objtool, kcsan: Add memory barrier instrumentation to whitelist sched, kcsan: Enable memory barrier instrumentation mm, kcsan: Enable barrier instrumentation x86/qspinlock, kcsan: Instrument barrier of pv_queued_spin_unlock() x86/barriers, kcsan: Use generic instrumentation for non-smp barriers asm-generic/bitops, kcsan: Add instrumentation for barriers locking/atomics, kcsan: Add instrumentation for barriers locking/barriers, kcsan: Support generic instrumentation locking/barriers, kcsan: Add instrumentation for barriers kcsan: selftest: Add test case to check memory barrier instrumentation kcsan: Ignore GCC 11+ warnings about TSan runtime support kcsan: test: Add test cases for memory barrier instrumentation kcsan: test: Match reordered or normal accesses ...
2022-01-11Merge tag 'pm-5.17-rc1' of ↵Linus Torvalds3-1/+19
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "The most signigicant change here is the addition of a new cpufreq 'P-state' driver for AMD processors as a better replacement for the venerable acpi-cpufreq driver. There are also other cpufreq updates (in the core, intel_pstate, ARM drivers), PM core updates (mostly related to adding new macros for declaring PM operations which should make the lives of driver developers somewhat easier), and a bunch of assorted fixes and cleanups. Summary: - Add new P-state driver for AMD processors (Huang Rui). - Fix initialization of min and max frequency QoS requests in the cpufreq core (Rafael Wysocki). - Fix EPP handling on Alder Lake in intel_pstate (Srinivas Pandruvada). - Make intel_pstate update cpuinfo.max_freq when notified of HWP capabilities changes and drop a redundant function call from that driver (Rafael Wysocki). - Improve IRQ support in the Qcom cpufreq driver (Ard Biesheuvel, Stephen Boyd, Vladimir Zapolskiy). - Fix double devm_remap() in the Mediatek cpufreq driver (Hector Yuan). - Introduce thermal pressure helpers for cpufreq CPU cooling (Lukasz Luba). - Make cpufreq use default_groups in kobj_type (Greg Kroah-Hartman). - Make cpuidle use default_groups in kobj_type (Greg Kroah-Hartman). - Fix two comments in cpuidle code (Jason Wang, Yang Li). - Allow model-specific normal EPB value to be used in the intel_epb sysfs attribute handling code (Srinivas Pandruvada). - Simplify locking in pm_runtime_put_suppliers() (Rafael Wysocki). - Add safety net to supplier device release in the runtime PM core code (Rafael Wysocki). - Capture device status before disabling runtime PM for it (Rafael Wysocki). - Add new macros for declaring PM operations to allow drivers to avoid guarding them with CONFIG_PM #ifdefs or __maybe_unused and update some drivers to use these macros (Paul Cercueil). - Allow ACPI hardware signature to be honoured during restore from hibernation (David Woodhouse). - Update outdated operating performance points (OPP) documentation (Tang Yizhou). - Reduce log severity for informative message regarding frequency transition failures in devfreq (Tzung-Bi Shih). - Add DRAM frequency controller devfreq driver for Allwinner sunXi SoCs (Samuel Holland). - Add missing COMMON_CLK dependency to sun8i devfreq driver (Arnd Bergmann). - Add support for new layout of Psys PowerLimit Register on SPR to the Intel RAPL power capping driver (Zhang Rui). - Fix typo in a comment in idle_inject.c (Jason Wang). - Remove unused function definition from the DTPM (Dynamit Thermal Power Management) power capping framework (Daniel Lezcano). - Reduce DTPM trace verbosity (Daniel Lezcano)" * tag 'pm-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (53 commits) x86, sched: Fix undefined reference to init_freq_invariance_cppc() build error cpufreq: amd-pstate: Fix Kconfig dependencies for AMD P-State cpufreq: amd-pstate: Fix struct amd_cpudata kernel-doc comment cpuidle: use default_groups in kobj_type x86: intel_epb: Allow model specific normal EPB value MAINTAINERS: Add AMD P-State driver maintainer entry Documentation: amd-pstate: Add AMD P-State driver introduction cpufreq: amd-pstate: Add AMD P-State performance attributes cpufreq: amd-pstate: Add AMD P-State frequencies attributes cpufreq: amd-pstate: Add boost mode support for AMD P-State cpufreq: amd-pstate: Add trace for AMD P-State module cpufreq: amd-pstate: Introduce the support for the processors with shared memory solution cpufreq: amd-pstate: Add fast switch function for AMD P-State cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors ACPI: CPPC: Add CPPC enable register function ACPI: CPPC: Check present CPUs for determining _CPC is valid ACPI: CPPC: Implement support for SystemIO registers x86/msr: Add AMD CPPC MSR definitions x86/cpufeatures: Add AMD Collaborative Processor Performance Control feature flag cpufreq: use default_groups in kobj_type ...
2022-01-10Merge tag 'ras_core_for_v5.17_rc1' of ↵Linus Torvalds2-21/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Borislav Petkov: "A relatively big amount of movements in RAS-land this time around: - First part of a series to move the AMD address translation code from arch/x86/ to amd64_edac as that is its only user anyway - Some MCE error injection improvements to the AMD side - Reorganization of the #MC handler code and the facilities it calls to make it noinstr-safe - Add support for new AMD MCA bank types and non-uniform banks layout - The usual set of cleanups and fixes" * tag 'ras_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/mce: Reduce number of machine checks taken during recovery x86/mce/inject: Avoid out-of-bounds write when setting flags x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types x86/mce: Check regs before accessing it x86/mce: Mark mce_start() noinstr x86/mce: Mark mce_timed_out() noinstr x86/mce: Move the tainting outside of the noinstr region x86/mce: Mark mce_read_aux() noinstr x86/mce: Mark mce_end() noinstr x86/mce: Mark mce_panic() noinstr x86/mce: Prevent severity computation from being instrumented x86/mce: Allow instrumentation during task work queueing x86/mce: Remove noinstr annotation from mce_setup() x86/mce: Use mce_rdmsrl() in severity checking code x86/mce: Remove function-local cpus variables x86/mce: Do not use memset to clear the banks bitmaps x86/mce/inject: Set the valid bit in MCA_STATUS before error injection x86/mce/inject: Check if a bank is populated before injecting x86/mce: Get rid of cpu_missing ...
2022-01-10Merge tag 'x86_cleanups_for_v5.17_rc1' of ↵Linus Torvalds2-5/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: "The mandatory set of random minor cleanups all over tip" * tag 'x86_cleanups_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/events/amd/iommu: Remove redundant assignment to variable shift x86/boot/string: Add missing function prototypes x86/fpu: Remove duplicate copy_fpstate_to_sigframe() prototype x86/uaccess: Move variable into switch case statement
2022-01-10Merge tag 'x86_misc_for_v5.17_rc1' of ↵Linus Torvalds2-7/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 updates from Borislav Petkov: "The pile which we cannot find the proper topic for so we stick it in x86/misc: - Add support for decoding instructions which do MMIO accesses in order to use it in SEV and TDX guests - An include fix and reorg to allow for removing set_fs in UML later" * tag 'x86_misc_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mtrr: Remove the mtrr_bp_init() stub x86/sev-es: Use insn_decode_mmio() for MMIO implementation x86/insn-eval: Introduce insn_decode_mmio() x86/insn-eval: Introduce insn_get_modrm_reg_ptr() x86/insn-eval: Handle insn_get_opcode() failure
2022-01-10Merge tag 'x86_mm_for_v5.17_rc1' of ↵Linus Torvalds4-0/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Borislav Petkov: - Flush *all* mappings from the TLB after switching to the trampoline pagetable to prevent any stale entries' presence - Flush global mappings from the TLB, in addition to the CR3-write, after switching off of the trampoline_pgd during boot to clear the identity mappings - Prevent instrumentation issues resulting from the above changes * tag 'x86_mm_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Prevent early boot triple-faults with instrumentation x86/mm: Include spinlock_t definition in pgtable. x86/mm: Flush global TLB when switching to trampoline page-table x86/mm/64: Flush global TLB on boot and AP bringup x86/realmode: Add comment for Global bit usage in trampoline_pgd x86/mm: Add missing <asm/cpufeatures.h> dependency to <asm/page_64.h>
2022-01-10Merge tag 'x86_sgx_for_v5.17_rc1' of ↵Linus Torvalds2-0/+12
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SGX updates from Borislav Petkov: - Add support for handling hw errors in SGX pages: poisoning, recovering from poison memory and error injection into SGX pages - A bunch of changes to the SGX selftests to simplify and allow of SGX features testing without the need of a whole SGX software stack - Add a sysfs attribute which is supposed to show the amount of SGX memory in a NUMA node, similar to what /proc/meminfo is to normal memory - The usual bunch of fixes and cleanups too * tag 'x86_sgx_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/sgx: Fix NULL pointer dereference on non-SGX systems selftests/sgx: Fix corrupted cpuid macro invocation x86/sgx: Add an attribute for the amount of SGX memory in a NUMA node x86/sgx: Fix minor documentation issues selftests/sgx: Add test for multiple TCS entry selftests/sgx: Enable multiple thread support selftests/sgx: Add page permission and exception test selftests/sgx: Rename test properties in preparation for more enclave tests selftests/sgx: Provide per-op parameter structs for the test enclave selftests/sgx: Add a new kselftest: Unclobbered_vdso_oversubscribed selftests/sgx: Move setup_test_encl() to each TEST_F() selftests/sgx: Encpsulate the test enclave creation selftests/sgx: Dump segments and /proc/self/maps only on failure selftests/sgx: Create a heap for the test enclave selftests/sgx: Make data measurement for an enclave segment optional selftests/sgx: Assign source for each segment selftests/sgx: Fix a benign linker warning x86/sgx: Add check for SGX pages to ghes_do_memory_failure() x86/sgx: Add hook to error injection address validation x86/sgx: Hook arch_memory_failure() into mainline code ...
2022-01-10Merge tag 'x86_sev_for_v5.17_rc1' of ↵Linus Torvalds2-42/+33
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SEV updates from Borislav Petkov: "The accumulated pile of x86/sev generalizations and cleanups: - Share the SEV string unrolling logic with TDX as TDX guests need it too - Cleanups and generalzation of code shared by SEV and TDX" * tag 'x86_sev_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev: Move common memory encryption code to mem_encrypt.c x86/sev: Rename mem_encrypt.c to mem_encrypt_amd.c x86/sev: Use CC_ATTR attribute to generalize string I/O unroll x86/sev: Remove do_early_exception() forward declarations x86/head64: Carve out the guest encryption postprocessing into a helper x86/sev: Get rid of excessive use of defines x86/sev: Shorten GHCB terminate macro names
2022-01-10Merge tag 'x86_paravirt_for_v5.17_rc1' of ↵Linus Torvalds2-7/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 paravirtualization fix from Borislav Petkov: "Define the INTERRUPT_RETURN macro only when CONFIG_XEN_PV is enabled as it is its only user" * tag 'x86_paravirt_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/paravirt: Fix build PARAVIRT_XXL=y without XEN_PV
2022-01-10Merge branch 'pm-cpufreq'Rafael J. Wysocki3-1/+19
Merge cpufreq updates for 5.17-rc1: - Add new P-state driver for AMD processors (Huang Rui). - Fix initialization of min and max frequency QoS requests in the cpufreq core (Rafael Wysocki). - Fix EPP handling on Alder Lake in intel_pstate (Srinivas Pandruvada). - Make intel_pstate update cpuinfo.max_freq when notified of HWP capabilities changes and drop a redundant function call from that driver (Rafael Wysocki). - Improve IRQ support in the Qcom cpufreq driver (Ard Biesheuvel, Stephen Boyd, Vladimir Zapolskiy). - Fix double devm_remap() in the Mediatek cpufreq driver (Hector Yuan). - Introduce thermal pressure helpers for cpufreq CPU cooling (Lukasz Luba). - Make cpufreq use default_groups in kobj_type (Greg Kroah-Hartman). * pm-cpufreq: (32 commits) x86, sched: Fix undefined reference to init_freq_invariance_cppc() build error cpufreq: amd-pstate: Fix Kconfig dependencies for AMD P-State cpufreq: amd-pstate: Fix struct amd_cpudata kernel-doc comment MAINTAINERS: Add AMD P-State driver maintainer entry Documentation: amd-pstate: Add AMD P-State driver introduction cpufreq: amd-pstate: Add AMD P-State performance attributes cpufreq: amd-pstate: Add AMD P-State frequencies attributes cpufreq: amd-pstate: Add boost mode support for AMD P-State cpufreq: amd-pstate: Add trace for AMD P-State module cpufreq: amd-pstate: Introduce the support for the processors with shared memory solution cpufreq: amd-pstate: Add fast switch function for AMD P-State cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors ACPI: CPPC: Add CPPC enable register function ACPI: CPPC: Check present CPUs for determining _CPC is valid ACPI: CPPC: Implement support for SystemIO registers x86/msr: Add AMD CPPC MSR definitions x86/cpufeatures: Add AMD Collaborative Processor Performance Control feature flag cpufreq: use default_groups in kobj_type cpufreq: mediatek-hw: Fix double devm_remap in hotplug case cpufreq: intel_pstate: Update cpuinfo.max_freq on HWP_CAP changes ...
2022-01-07x86/fpu: Prepare guest FPU for dynamically enabled FPU featuresThomas Gleixner1-0/+13
To support dynamically enabled FPU features for guests prepare the guest pseudo FPU container to keep track of the currently enabled xfeatures and the guest permissions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-3-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07x86/fpu: Extend fpu_xstate_prctl() with guest permissionsThomas Gleixner3-12/+25
KVM requires a clear separation of host user space and guest permissions for dynamic XSTATE components. Add a guest permissions member to struct fpu and a separate set of prctl() arguments: ARCH_GET_XCOMP_GUEST_PERM and ARCH_REQ_XCOMP_GUEST_PERM. The semantics are equivalent to the host user space permission control except for the following constraints: 1) Permissions have to be requested before the first vCPU is created 2) Permissions are frozen when the first vCPU is created to ensure consistency. Any attempt to expand permissions via the prctl() after that point is rejected. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220105123532.12586-2-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07KVM: SVM: include CR3 in initial VMSA state for SEV-ES guestsMichael Roth2-0/+2
Normally guests will set up CR3 themselves, but some guests, such as kselftests, and potentially CONFIG_PVH guests, rely on being booted with paging enabled and CR3 initialized to a pre-allocated page table. Currently CR3 updates via KVM_SET_SREGS* are not loaded into the guest VMCB until just prior to entering the guest. For SEV-ES/SEV-SNP, this is too late, since it will have switched over to using the VMSA page prior to that point, with the VMSA CR3 copied from the VMCB initial CR3 value: 0. Address this by sync'ing the CR3 value into the VMCB save area immediately when KVM_SET_SREGS* is issued so it will find it's way into the initial VMSA. Suggested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-Id: <20211216171358.61140-10-michael.roth@amd.com> [Remove vmx_post_set_cr3; add a remark about kvm_set_cr3 not calling the new hook. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07KVM: x86/xen: Add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel deliveryDavid Woodhouse1-0/+1
This adds basic support for delivering 2 level event channels to a guest. Initially, it only supports delivery via the IRQ routing table, triggered by an eventfd. In order to do so, it has a kvm_xen_set_evtchn_fast() function which will use the pre-mapped shared_info page if it already exists and is still valid, while the slow path through the irqfd_inject workqueue will remap the shared_info page if necessary. It sets the bits in the shared_info page but not the vcpu_info; that is deferred to __kvm_xen_has_interrupt() which raises the vector to the appropriate vCPU. Add a 'verbose' mode to xen_shinfo_test while adding test cases for this. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20211210163625.2886-5-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07KVM: x86/xen: Maintain valid mapping of Xen shared_info pageDavid Woodhouse1-1/+1
Use the newly reinstated gfn_to_pfn_cache to maintain a kernel mapping of the Xen shared_info page so that it can be accessed in atomic context. Note that we do not participate in dirty tracking for the shared info page and we do not explicitly mark it dirty every single tim we deliver an event channel interrupts. We wouldn't want to do that even if we *did* have a valid vCPU context with which to do so. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20211210163625.2886-4-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07KVM: x86/pmu: Add pmc->intr to refactor kvm_perf_overflow{_intr}()Like Xu1-0/+1
Depending on whether intr should be triggered or not, KVM registers two different event overflow callbacks in the perf_event context. The code skeleton of these two functions is very similar, so the pmc->intr can be stored into pmc from pmc_reprogram_counter() which provides smaller instructions footprint against the u-architecture branch predictor. The __kvm_perf_overflow() can be called in non-nmi contexts and a flag is needed to distinguish the caller context and thus avoid a check on kvm_is_in_guest(), otherwise we might get warnings from suspicious RCU or check_preemption_disabled(). Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Like Xu <likexu@tencent.com> Message-Id: <20211130074221.93635-5-likexu@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-06x86, sched: Fix undefined reference to init_freq_invariance_cppc() build errorHuang Rui1-1/+1
The init_freq_invariance_cppc function is implemented in smpboot and depends on CONFIG_SMP. MODPOST vmlinux.symvers MODINFO modules.builtin.modinfo GEN modules.builtin LD .tmp_vmlinux.kallsyms1 ld: drivers/acpi/cppc_acpi.o: in function `acpi_cppc_processor_probe': /home/ray/brahma3/linux/drivers/acpi/cppc_acpi.c:819: undefined reference to `init_freq_invariance_cppc' make: *** [Makefile:1161: vmlinux] Error 1 See https://lore.kernel.org/lkml/484af487-7511-647e-5c5b-33d4429acdec@infradead.org/. Fixes: 41ea667227ba ("x86, sched: Calculate frequency invariance for AMD systems") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Huang Rui <ray.huang@amd.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30x86/msr: Add AMD CPPC MSR definitionsHuang Rui1-0/+17
AMD CPPC (Collaborative Processor Performance Control) function uses MSR registers to manage the performance hints. So add the MSR register macro here. Signed-off-by: Huang Rui <ray.huang@amd.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30x86/cpufeatures: Add AMD Collaborative Processor Performance Control feature ↵Huang Rui1-0/+1
flag Add Collaborative Processor Performance Control feature flag for AMD processors. This feature flag will be used on the following AMD P-State driver. The AMD P-State driver has two approaches to implement the frequency control behavior. That depends on the CPU hardware implementation. One is "Full MSR Support" and another is "Shared Memory Support". The feature flag indicates the current processors with "Full MSR Support". Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-28x86/hyperv: Fix definition of hv_ghcb_pg variableMichael Kelley1-1/+1
The percpu variable hv_ghcb_pg is incorrectly defined. The __percpu qualifier should be associated with the union hv_ghcb * (i.e., a pointer), not with the target of the pointer. This distinction makes no difference to gcc and the generated code, but sparse correctly complains. Fix the definition in the interest of general correctness in addition to making sparse happy. No functional change. Fixes: 0cc4f6d9f0b9 ("x86/hyperv: Initialize GHCB page in Isolation VM") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1640662315-22260-2-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-12-27Merge tag 'efi-urgent-for-v5.16-2' of ↵Linus Torvalds1-2/+0
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fix from Ard Biesheuvel: "Another EFI fix for v5.16: - Prevent missing prototype warning from breaking the build under CONFIG_WERROR=y" * tag 'efi-urgent-for-v5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: Move efifb_setup_from_dmi() prototype from arch headers
2021-12-26Merge tag 'x86_urgent_for_v5.16_rc7' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Prevent potential undefined behavior due to shifting pkey constants into the sign bit - Move the EFI memory reservation code *after* the efi= cmdline parsing has happened - Revert two commits which turned out to be the wrong direction to chase when accommodating early memblock reservations consolidation and command line parameters parsing * tag 'x86_urgent_for_v5.16_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/pkey: Fix undefined behaviour with PKRU_WD_BIT x86/boot: Move EFI range reservation after cmdline parsing Revert "x86/boot: Pull up cmdline preparation and early param parsing" Revert "x86/boot: Mark prepare_command_line() __init"
2021-12-22x86/mtrr: Remove the mtrr_bp_init() stubChristoph Hellwig1-7/+1
Add an IS_ENABLED() check in setup_arch() and call pat_disable() directly if MTRRs are not supported. This allows to remove the <asm/memtype.h> include in <asm/mtrr.h>, which pull in lowlevel x86 headers that should not be included for UML builds and will cause build warnings with a following patch. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211215165612.554426-2-hch@lst.de