starfive-tech/linux.git - StarFive Tech Linux Kernel for VisionFive (JH7110) boards (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2020-09-29	Merge branch 'for-next/svm' of ↵	Will Deacon	4	-7/+113
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into for-joerg/arm-smmu/updates Pull in core arm64 changes required to enable Shared Virtual Memory (SVM) using SMMUv3. This brings us increasingly closer to being able to share page-tables directly between user-space tasks running on the CPU and their corresponding contexts on coherent devices performing DMA through the SMMU. Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	iommu/arm-smmu-v3: Ensure queue is read after updating prod pointer	Zhou Wang	2	-0/+2
	Reading the 'prod' MMIO register in order to determine whether or not there is valid data beyond 'cons' for a given queue does not provide sufficient dependency ordering, as the resulting access is address dependent only on 'cons' and can therefore be speculated ahead of time, potentially allowing stale data to be read by the CPU. Use readl() instead of readl_relaxed() when updating the shadow copy of the 'prod' pointer, so that all speculated memory reads from the corresponding queue can occur only from valid slots. Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Link: https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzhou1@hisilicon.com [will: Use readl() instead of explicit barrier. Update 'cons' side to match.] Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	arm64: cpufeature: Export symbol read_sanitised_ftr_reg()	Jean-Philippe Brucker	1	-0/+1
	The SMMUv3 driver would like to read the MMFR0 PARANGE field in order to share CPU page tables with devices. Allow the driver to be built as module by exporting the read_sanitized_ftr_reg() cpufeature symbol. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20200918101852.582559-7-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-29	arm64: mm: Pin down ASIDs for sharing mm with devices	Jean-Philippe Brucker	3	-7/+112
	To enable address space sharing with the IOMMU, introduce arm64_mm_context_get() and arm64_mm_context_put(), that pin down a context and ensure that it will keep its ASID after a rollover. Export the symbols to let the modular SMMUv3 driver use them. Pinning is necessary because a device constantly needs a valid ASID, unlike tasks that only require one when running. Without pinning, we would need to notify the IOMMU when we're about to use a new ASID for a task, and it would get complicated when a new task is assigned a shared ASID. Consider the following scenario with no ASID pinned: 1. Task t1 is running on CPUx with shared ASID (gen=1, asid=1) 2. Task t2 is scheduled on CPUx, gets ASID (1, 2) 3. Task tn is scheduled on CPUy, a rollover occurs, tn gets ASID (2, 1) We would now have to immediately generate a new ASID for t1, notify the IOMMU, and finally enable task tn. We are holding the lock during all that time, since we can't afford having another CPU trigger a rollover. The IOMMU issues invalidation commands that can take tens of milliseconds. It gets needlessly complicated. All we wanted to do was schedule task tn, that has no business with the IOMMU. By letting the IOMMU pin tasks when needed, we avoid stalling the slow path, and let the pinning fail when we're out of shareable ASIDs. After a rollover, the allocator expects at least one ASID to be available in addition to the reserved ones (one per CPU). So (NR_ASIDS - NR_CPUS - 1) is the maximum number of ASIDs that can be shared with the IOMMU. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20200918101852.582559-5-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	KVM: arm64: pmu: Make overflow handler NMI safe	Julien Thierry	1	-1/+25
	kvm_vcpu_kick() is not NMI safe. When the overflow handler is called from NMI context, defer waking the vcpu to an irq_work queue. A vcpu can be freed while it's not running by kvm_destroy_vm(). Prevent running the irq_work for a non-existent vcpu by calling irq_work_sync() on the PMU destroy path. [Alexandru E.: Added irq_work_sync()] Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox) Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Pouloze <suzuki.poulose@arm.com> Cc: kvm@vger.kernel.org Cc: kvmarm@lists.cs.columbia.edu Link: https://lore.kernel.org/r/20200924110706.254996-6-alexandru.elisei@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	arm64: perf: Defer irq_work to IPI_IRQ_WORK	Julien Thierry	1	-9/+5
	When handling events, armv8pmu_handle_irq() calls perf_event_overflow(), and subsequently calls irq_work_run() to handle any work queued by perf_event_overflow(). As perf_event_overflow() raises IPI_IRQ_WORK when queuing the work, this isn't strictly necessary and the work could be handled as part of the IPI_IRQ_WORK handler. In the common case the IPI handler will run immediately after the PMU IRQ handler, and where the PE is heavily loaded with interrupts other handlers may run first, widening the window where some counters are disabled. In practice this window is unlikely to be a significant issue, and removing the call to irq_work_run() would make the PMU IRQ handler NMI safe in addition to making it simpler, so let's do that. [Alexandru E.: Reworded commit message] Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20200924110706.254996-5-alexandru.elisei@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	arm64: perf: Remove PMU locking	Julien Thierry	1	-28/+0
	The PMU is disabled and enabled, and the counters are programmed from contexts where interrupts or preemption is disabled. The functions to toggle the PMU and to program the PMU counters access the registers directly and don't access data modified by the interrupt handler. That, and the fact that they're always called from non-preemptible contexts, means that we don't need to disable interrupts or use a spinlock. [Alexandru E.: Explained why locking is not needed, removed WARN_ONs] Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox) Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20200924110706.254996-4-alexandru.elisei@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	arm64: perf: Avoid PMXEV* indirection	Mark Rutland	1	-14/+85
	Currently we access the counter registers and their respective type registers indirectly. This requires us to write to PMSELR, issue an ISB, then access the relevant PMXEV* registers. This is unfortunate, because: * Under virtualization, accessing one register requires two traps to the hypervisor, even though we could access the register directly with a single trap. * We have to issue an ISB which we could otherwise avoid the cost of. * When we use NMIs, the NMI handler will have to save/restore the select register in case the code it preempted was attempting to access a counter or its type register. We can avoid these issues by directly accessing the relevant registers. This patch adds helpers to do so. In armv8pmu_enable_event() we still need the ISB to prevent the PE from reordering the write to PMINTENSET_EL1 register. If the interrupt is enabled before we disable the counter and the new event is configured, we might get an interrupt triggered by the previously programmed event overflowing, but which we wrongly attribute to the event that we are enabling. Execute an ISB after we disable the counter. In the process, remove the comment that refers to the ARMv7 PMU. [Julien T.: Don't inline read/write functions to avoid big code-size increase, remove unused read_pmevtypern function, fix counter index issue.] [Alexandru E.: Removed comment, removed trailing semicolons in macros, added ISB] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox) Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20200924110706.254996-3-alexandru.elisei@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	arm64: perf: Add missing ISB in armv8pmu_enable_counter()	Alexandru Elisei	1	-0/+5
	Writes to the PMXEVTYPER_EL0 register are not self-synchronising. In armv8pmu_enable_event(), the PE can reorder configuring the event type after we have enabled the counter and the interrupt. This can lead to an interrupt being asserted because of the previous event type that we were counting using the same counter, not the one that we've just configured. The same rationale applies to writes to the PMINTENSET_EL1 register. The PE can reorder enabling the interrupt at any point in the future after we have enabled the event. Prevent both situations from happening by adding an ISB just before we enable the event counter. Fixes: 030896885ade ("arm64: Performance counters support") Reported-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox) Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20200924110706.254996-2-alexandru.elisei@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	x86: Use tracepoint_enabled() for msr tracepoints instead of open coding it	Steven Rostedt (VMware)	1	-11/+9
	7f47d8cc039f ("x86, tracing, perf: Add trace point for MSR accesses") added tracing of msr read and write, but because of complexity in having tracepoints in headers, and even more so for a core header like msr.h, not to mention the bloat a tracepoint adds to inline functions, a helper function is needed to be called from the header. Use the new tracepoint_enabled() macro in tracepoint-defs.h to test if the tracepoint is active before calling the helper function, instead of open coding the same logic, which requires knowing the internals of a tracepoint. Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-28	PCI: MSI: Fix Kconfig dependencies for PCI_MSI_ARCH_FALLBACKS	Thomas Gleixner	5	-5/+5
	The unconditional selection of PCI_MSI_ARCH_FALLBACKS has an unmet dependency because PCI_MSI_ARCH_FALLBACKS is defined in a 'if PCI' clause. As it is only relevant when PCI_MSI is enabled, update the affected architecture Kconfigs to make the selection of PCI_MSI_ARCH_FALLBACKS depend on 'if PCI_MSI'. Fixes: 077ee78e3928 ("PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable") Reported-by: Qian Cai <cai@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Links: https://lore.kernel.org/r/cdfd63305caa57785b0925dd24c0711ea02c8527.camel@redhat.com
2020-09-28	mtd: rawnand: Use the new ECC engine type enumeration	Miquel Raynal	25	-25/+25
	Mechanical switch from the legacy "mode" enumeration to the new "engine type" enumeration in drivers and board files. The device tree parsing is also updated to return the new enumeration from the old strings. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://lore.kernel.org/linux-mtd/20200827085208.16276-11-miquel.raynal@bootlin.com
2020-09-28	mtd: rawnand: Separate the ECC engine type and the ECC byte placement	Miquel Raynal	1	-1/+2
	The use of "syndrome" placement should not be encoded in the ECC engine mode/type. Create a "placement" field in NAND chip and change all occurrences of the NAND_ECC_HW_SYNDROME enumeration to be just NAND_ECC_HW and possibly a placement entry like NAND_ECC_PLACEMENT_INTERLEAVED. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://lore.kernel.org/linux-mtd/20200827085208.16276-10-miquel.raynal@bootlin.com
2020-09-28	arm64: perf: Add support caps under sysfs	Shaokun Zhang	3	-33/+75
	ARMv8.4-PMU introduces the PMMIR_EL1 registers and some new PMU events, like STALL_SLOT etc, are related to it. Let's add a caps directory to /sys/bus/event_source/devices/armv8_pmuv3_0/ and support slots from PMMIR_EL1 registers in this entry. The user programs can get the slots from sysfs directly. /sys/bus/event_source/devices/armv8_pmuv3_0/caps/slots is exposed under sysfs. Both ARMv8.4-PMU and STALL_SLOT event are implemented, it returns the slots from PMMIR_EL1, otherwise it will return 0. Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/1600754025-53535-1-git-send-email-zhangshaokun@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	ARM: dts: bcm2835: Change firmware compatible from simple-bus to simple-mfd	Maxime Ripard	1	-1/+1
	The current binding for the RPi firmware uses the simple-bus compatible as a fallback to benefit from its automatic probing of child nodes. However, simple-bus also comes with some constraints, like having the ranges, our case. Let's switch to simple-mfd that provides the same probing logic without those constraints. Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://lore.kernel.org/r/20200924082642.18144-1-maxime@cerno.tech Signed-off-by: Rob Herring <robh@kernel.org>
2020-09-28	KVM: x86: do not attempt TSC synchronization on guest writes	Paolo Bonzini	1	-20/+10
	KVM special-cases writes to MSR_IA32_TSC so that all CPUs have the same base for the TSC. This logic is complicated, and we do not want it to have any effect once the VM is started. In particular, if any guest started to synchronize its TSCs with writes to MSR_IA32_TSC rather than MSR_IA32_TSC_ADJUST, the additional effect of kvm_write_tsc code would be uncharted territory. Therefore, this patch makes writes to MSR_IA32_TSC behave essentially the same as writes to MSR_IA32_TSC_ADJUST when they come from the guest. A new selftest (which passes both before and after the patch) checks the current semantics of writes to MSR_IA32_TSC and MSR_IA32_TSC_ADJUST originating from both the host and the guest. Upcoming work to remove the special side effects of host-initiated writes to MSR_IA32_TSC and MSR_IA32_TSC_ADJUST will be able to build onto this test, adjusting the host side to use the new APIs and achieve the same effect. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: nSVM: delay MSR permission processing to first nested VM run	Paolo Bonzini	1	-3/+18
	Allow userspace to set up the memory map after KVM_SET_NESTED_STATE; to do so, move the call to nested_svm_vmrun_msrpm inside the KVM_REQ_GET_NESTED_STATE_PAGES handler (which is currently not used by nSVM). This is similar to what VMX does already. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: x86: rename KVM_REQ_GET_VMCS12_PAGES	Paolo Bonzini	3	-8/+8
	We are going to use it for SVM too, so use a more generic name. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: x86: Introduce MSR filtering	Alexander Graf	3	-1/+176
	It's not desireable to have all MSRs always handled by KVM kernel space. Some MSRs would be useful to handle in user space to either emulate behavior (like uCode updates) or differentiate whether they are valid based on the CPU model. To allow user space to specify which MSRs it wants to see handled by KVM, this patch introduces a new ioctl to push filter rules with bitmaps into KVM. Based on these bitmaps, KVM can then decide whether to reject MSR access. With the addition of KVM_CAP_X86_USER_SPACE_MSR it can also deflect the denied MSR events to user space to operate on. If no filter is populated, MSR handling stays identical to before. Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20200925143422.21718-8-graf@amazon.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: x86: VMX: Prevent MSR passthrough when MSR access is denied	Alexander Graf	2	-52/+181
	We will introduce the concept of MSRs that may not be handled in kernel space soon. Some MSRs are directly passed through to the guest, effectively making them handled by KVM from user space's point of view. This patch introduces all logic required to ensure that MSRs that user space wants trapped are not marked as direct access for guests. Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20200925143422.21718-7-graf@amazon.com> [Replace "_idx" with "_slot". - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: x86: SVM: Prevent MSR passthrough when MSR access is denied	Alexander Graf	2	-8/+76
	We will introduce the concept of MSRs that may not be handled in kernel space soon. Some MSRs are directly passed through to the guest, effectively making them handled by KVM from user space's point of view. This patch introduces all logic required to ensure that MSRs that user space wants trapped are not marked as direct access for guests. Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20200925143422.21718-6-graf@amazon.com> [Make terminology a bit more similar to VMX. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: x86: Prepare MSR bitmaps for userspace tracked MSRs	Aaron Lewis	4	-70/+77
	Prepare vmx and svm for a subsequent change that ensures the MSR permission bitmap is set to allow an MSR that userspace is tracking to force a vmx_vmexit in the guest. Signed-off-by: Aaron Lewis <aaronlewis@google.com> Reviewed-by: Oliver Upton <oupton@google.com> [agraf: rebase, adapt SVM scheme to nested changes that came in between] Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20200925143422.21718-5-graf@amazon.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: x86: Add infrastructure for MSR filtering	Alexander Graf	4	-0/+10
	In the following commits we will add pieces of MSR filtering. To ensure that code compiles even with the feature half-merged, let's add a few stubs and struct definitions before the real patches start. Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20200925143422.21718-4-graf@amazon.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: x86: Allow deflecting unknown MSR accesses to user space	Alexander Graf	3	-6/+135
	MSRs are weird. Some of them are normal control registers, such as EFER. Some however are registers that really are model specific, not very interesting to virtualization workloads, and not performance critical. Others again are really just windows into package configuration. Out of these MSRs, only the first category is necessary to implement in kernel space. Rarely accessed MSRs, MSRs that should be fine tunes against certain CPU models and MSRs that contain information on the package level are much better suited for user space to process. However, over time we have accumulated a lot of MSRs that are not the first category, but still handled by in-kernel KVM code. This patch adds a generic interface to handle WRMSR and RDMSR from user space. With this, any future MSR that is part of the latter categories can be handled in user space. Furthermore, it allows us to replace the existing "ignore_msrs" logic with something that applies per-VM rather than on the full system. That way you can run productive VMs in parallel to experimental ones where you don't care about proper MSR handling. Signed-off-by: Alexander Graf <graf@amazon.com> Reviewed-by: Jim Mattson <jmattson@google.com> Message-Id: <20200925143422.21718-3-graf@amazon.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: x86: Return -ENOENT on unimplemented MSRs	Alexander Graf	1	-1/+1
	When we find an MSR that we can not handle, bubble up that error code as MSR error return code. Follow up patches will use that to expose the fact that an MSR is not handled by KVM to user space. Suggested-by: Aaron Lewis <aaronlewis@google.com> Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20200925143422.21718-2-graf@amazon.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: VMX: Rename vmx_uret_msr's "index" to "slot"	Sean Christopherson	2	-5/+5
	Rename "index" to "slot" in struct vmx_uret_msr to align with the terminology used by common x86's kvm_user_return_msrs, and to avoid conflating "MSR's ECX index" with "MSR's index into an array". No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-16-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: VMX: Rename "vmx_msr_index" to "vmx_uret_msrs_list"	Sean Christopherson	1	-8/+8
	Rename "vmx_msr_index" to "vmx_uret_msrs_list" to associate it with the uret MSRs array, and to avoid conflating "MSR's ECX index" with "MSR's index into an array". Similarly, don't use "slot" in the name as that terminology is claimed by the common x86 "user_return_msrs" mechanism. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-15-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: VMX: Rename "vmx_set_guest_msr" to "vmx_set_guest_uret_msr"	Sean Christopherson	1	-3/+4
	Add "uret" to vmx_set_guest_msr() to explicitly associate it with the guest_uret_msrs array, and to differentiate it from vmx_set_msr() as well as VMX's load/store MSRs. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-14-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: VMX: Rename "find_msr_entry" to "vmx_find_uret_msr"	Sean Christopherson	3	-7/+7
	Rename "find_msr_entry" to scope it to VMX and to associate it with guest_uret_msrs. Drop the "entry" so that the function name pairs with the existing __vmx_find_uret_msr(), which intentionally uses a double underscore prefix instead of appending "index" or "slot" as those names are already claimed by other pieces of the user return MSR stack. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-13-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: VMX: Add vmx_setup_uret_msr() to handle lookup and swap	Sean Christopherson	1	-31/+18
	Add vmx_setup_uret_msr() to wrap the lookup and manipulation of the uret MSRs array during setup_msrs(). In addition to consolidating code, this eliminates move_msr_up(), which while being a very literally description of the function, isn't exacly helpful in understanding the net effect of the code. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-12-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: VMX: Move uret MSR lookup into update_transition_efer()	Sean Christopherson	1	-12/+19
	Move checking for the existence of MSR_EFER in the uret MSR array into update_transition_efer() so that the lookup and manipulation of the array in setup_msrs() occur back-to-back. This paves the way toward adding a helper to wrap the lookup and manipulation. To avoid unnecessary overhead, defer the lookup until the uret array would actually be modified in update_transition_efer(). EFER obviously exists on CPUs that support the dedicated VMCS fields for switching EFER, and EFER must exist for the guest and host EFER.NX value to diverge, i.e. there is no danger of attempting to read/write EFER when it doesn't exist. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-11-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: VMX: Check guest support for RDTSCP before processing MSR_TSC_AUX	Sean Christopherson	1	-3/+5
	Check for RDTSCP support prior to checking if MSR_TSC_AUX is in the uret MSRs array so that the array lookup and manipulation are back-to-back. This paves the way toward adding a helper to wrap the lookup and manipulation. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-10-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: VMX: Rename "__find_msr_index" to "__vmx_find_uret_msr"	Sean Christopherson	1	-8/+8
	Rename "__find_msr_index" to scope it to VMX, associate it with guest_uret_msrs, and to avoid conflating "MSR's ECX index" with "MSR's array index". Similarly, don't use "slot" in the name so as to avoid colliding the common x86's half of "user_return_msrs" (the slot in kvm_user_return_msrs is not the same slot in guest_uret_msrs). No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-9-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: VMX: Rename vcpu_vmx's "guest_msrs_ready" to "guest_uret_msrs_loaded"	Sean Christopherson	2	-5/+5
	Add "uret" to "guest_msrs_ready" to explicitly associate it with the "guest_uret_msrs" array, and replace "ready" with "loaded" to more precisely reflect what it tracks, e.g. "ready" could be interpreted as meaning ready for processing (setup_msrs() has run), which is wrong. "loaded" also aligns with the similar "guest_state_loaded" field. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-8-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: VMX: Rename vcpu_vmx's "save_nmsrs" to "nr_active_uret_msrs"	Sean Christopherson	2	-12/+12
	Add "uret" into the name of "save_nmsrs" to explicitly associate it with the guest_uret_msrs array, and replace "save" with "active" (for lack of a better word) to better describe what is being tracked. While "save" is more or less accurate when viewed as a literal description of the field, e.g. it holds the number of MSRs that were saved into the array the last time setup_msrs() was invoked, it can easily be misinterpreted by the reader, e.g. as meaning the number of MSRs that were saved from hardware at some point in the past, or as the number of MSRs that need to be saved at some point in the future, both of which are wrong. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-7-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: VMX: Rename vcpu_vmx's "nmsrs" to "nr_uret_msrs"	Sean Christopherson	2	-4/+4
	Rename vcpu_vmx.nsmrs to vcpu_vmx.nr_uret_msrs to explicitly associate it with the guest_uret_msrs array. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-6-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: VMX: Rename the "shared_msr_entry" struct to "vmx_uret_msr"	Sean Christopherson	3	-35/+35
	Rename struct "shared_msr_entry" to "vmx_uret_msr" to align with x86's rename of "shared_msrs" to "user_return_msrs", and to call out that the struct is specific to VMX, i.e. not part of the generic "shared_msrs" framework. Abbreviate "user_return" as "uret" to keep line lengths marginally sane and code more or less readable. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-5-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: VMX: Rename "vmx_find_msr_index" to "vmx_find_loadstore_msr_slot"	Sean Christopherson	3	-14/+14
	Add "loadstore" to vmx_find_msr_index() to differentiate it from the so called shared MSRs helpers (which will soon be renamed), and replace "index" with "slot" to better convey that the helper returns slot in the array, not the MSR index (the value that gets stuffed into ECX). No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-4-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: VMX: Prepend "MAX_" to MSR array size defines	Sean Christopherson	3	-9/+9
	Add "MAX" to the LOADSTORE and so called SHARED MSR defines to make it more clear that the define controls the array size, as opposed to the actual number of valid entries that are in the array. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-3-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: x86: Rename "shared_msrs" to "user_return_msrs"	Sean Christopherson	3	-56/+60
	Rename the "shared_msrs" mechanism, which is used to defer restoring MSRs that are only consumed when running in userspace, to a more banal but less likely to be confusing "user_return_msrs". The "shared" nomenclature is confusing as it's not obvious who is sharing what, e.g. reasonable interpretations are that the guest value is shared by vCPUs in a VM, or that the MSR value is shared/common to guest and host, both of which are wrong. "shared" is also misleading as the MSR value (in hardware) is not guaranteed to be shared/reused between VMs (if that's indeed the correct interpretation of the name), as the ability to share values between VMs is simply a side effect (albiet a very nice side effect) of deferring restoration of the host value until returning from userspace. "user_return" avoids the above confusion by describing the mechanism itself instead of its effects. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923180409.32255-2-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: x86/mmu: Move individual kvm_mmu initialization into common helper	Sean Christopherson	1	-16/+9
	Move initialization of 'struct kvm_mmu' fields into alloc_mmu_pages() to consolidate code, and rename the helper to __kvm_mmu_create(). No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923163314.8181-1-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: nVMX: Read EXIT_QUAL and INTR_INFO only when needed for nested exit	Sean Christopherson	1	-3/+2
	Read vmcs.EXIT_QUALIFICATION and vmcs.VM_EXIT_INTR_INFO only if the VM-Exit is being reflected to L1 now that they are no longer passed directly to the kvm_nested_vmexit tracepoint. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923201349.16097-8-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: x86: Use common definition for kvm_nested_vmexit tracepoint	Sean Christopherson	3	-43/+3
	Use the newly introduced TRACE_EVENT_KVM_EXIT to define the guts of kvm_nested_vmexit so that it captures and prints the same information as kvm_exit. This has the bonus side effect of fixing the interrupt info and error code printing for the case where they're invalid, e.g. if the exit was a failed VM-Entry. This also sets the stage for retrieving EXIT_QUALIFICATION and VM_EXIT_INTR_INFO in nested_vmx_reflect_vmexit() if and only if the VM-Exit is being routed to L1. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923201349.16097-7-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: x86: Add macro wrapper for defining kvm_exit tracepoint	Sean Christopherson	1	-33/+36
	Macrofy the definition of kvm_exit so that the definition can be reused verbatim by kvm_nested_vmexit. No functional change intended. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923201349.16097-6-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: x86: Add intr/vectoring info and error code to kvm_exit tracepoint	Sean Christopherson	4	-7/+39
	Extend the kvm_exit tracepoint to align it with kvm_nested_vmexit in terms of what information is captured. On SVM, add interrupt info and error code, while on VMX it add IDT vectoring and error code. This sets the stage for macrofying the kvm_exit tracepoint definition so that it can be reused for kvm_nested_vmexit without loss of information. Opportunistically stuff a zero for VM_EXIT_INTR_INFO if the VM-Enter failed, as the field is guaranteed to be invalid. Note, it'd be possible to further filter the interrupt/exception fields based on the VM-Exit reason, but the helper is intended only for tracepoints, i.e. an extra VMREAD or two is a non-issue, the failed VM-Enter case is just low hanging fruit. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923201349.16097-5-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: VMX: Add a helper to test for a valid error code given an intr info	Sean Christopherson	2	-3/+8
	Add a helper, is_exception_with_error_code(), to provide the simple but difficult to read code of checking for a valid exception with an error code given a vmcs.VM_EXIT_INTR_INFO value. The helper will gain another user, vmx_get_exit_info(), in a future patch. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923201349.16097-4-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: x86: Read guest RIP from within the kvm_nested_vmexit tracepoint	Sean Christopherson	3	-5/+5
	Use kvm_rip_read() to read the guest's RIP for the nested VM-Exit tracepoint instead of having the caller pass in an argument. Params that are passed into a tracepoint are evaluated even if the tracepoint is disabled, i.e. passing in RIP for VMX incurs a VMREAD and retpoline to retrieve a value that may never be used, e.g. if the exit is due to a hardware interrupt. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923201349.16097-3-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: x86: Add RIP to the kvm_entry, i.e. VM-Enter, tracepoint	Sean Christopherson	2	-5/+7
	Add RIP to the kvm_entry tracepoint to help debug if the kvm_exit tracepoint is disabled or if VM-Enter fails, in which case the kvm_exit tracepoint won't be hit. Read RIP from within the tracepoint itself to avoid a potential VMREAD and retpoline if the guest's RIP isn't available. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923201349.16097-2-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: nVMX: WARN on attempt to switch the currently loaded VMCS	Sean Christopherson	1	-1/+1
	WARN if KVM attempts to switch to the currently loaded VMCS. Now that nested_vmx_free_vcpu() doesn't blindly call vmx_switch_vmcs(), all paths that lead to vmx_switch_vmcs() are implicitly guarded by guest vs. host mode, e.g. KVM should never emulate VMX instructions when guest mode is active, and nested_vmx_vmexit() should never be called when host mode is active. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923184452.980-8-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28	KVM: nVMX: Drop redundant VMCS switch and free_nested() call	Sean Christopherson	1	-2/+0
	Remove the explicit switch to vmcs01 and the call to free_nested() in nested_vmx_free_vcpu(). free_nested(), which is called unconditionally by vmx_leave_nested(), ensures vmcs01 is loaded prior to freeing vmcs02 and friends. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200923184452.980-7-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>