summaryrefslogtreecommitdiff
path: root/arch/x86/include
diff options
context:
space:
mode:
authorLike Xu <like.xu@linux.intel.com>2019-10-27 13:52:42 +0300
committerPaolo Bonzini <pbonzini@redhat.com>2019-11-15 13:44:09 +0300
commita6da0d77e98e94fa66187a5ce3cf7e11fbf95503 (patch)
tree690f38298684baa30b9b6d6d5d4d8266dedce32d /arch/x86/include
parentc900c156c518302058a48d2efe3ca44e465cad22 (diff)
downloadlinux-a6da0d77e98e94fa66187a5ce3cf7e11fbf95503.tar.xz
KVM: x86/vPMU: Reuse perf_event to avoid unnecessary pmc_reprogram_counter
The perf_event_create_kernel_counter() in the pmc_reprogram_counter() is a heavyweight and high-frequency operation, especially when host disables the watchdog (maximum 21000000 ns) which leads to an unacceptable latency of the guest NMI handler. It limits the use of vPMUs in the guest. When a vPMC is fully enabled, the legacy reprogram_*_counter() would stop and release its existing perf_event (if any) every time EVEN in most cases almost the same requested perf_event will be created and configured again. For each vPMC, if the reuqested config ('u64 eventsel' for gp and 'u8 ctrl' for fixed) is the same as its current config AND a new sample period based on pmc->counter is accepted by host perf interface, the current event could be reused safely as a new created one does. Otherwise, do release the undesirable perf_event and reprogram a new one as usual. It's light-weight to call pmc_pause_counter (disable, read and reset event) and pmc_resume_counter (recalibrate period and re-enable event) as guest expects instead of release-and-create again on any condition. Compared to use the filterable event->attr or hw.config, a new 'u64 current_config' field is added to save the last original programed config for each vPMC. Based on this implementation, the number of calls to pmc_reprogram_counter is reduced by ~82.5% for a gp sampling event and ~99.9% for a fixed event. In the usage of multiplexing perf sampling mode, the average latency of the guest NMI handler is reduced from 104923 ns to 48393 ns (~2.16x speed up). If host disables watchdog, the minimum latecy of guest NMI handler could be speed up at ~3413x (from 20407603 to 5979 ns) and at ~786x in the average. Suggested-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Like Xu <like.xu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Diffstat (limited to 'arch/x86/include')
-rw-r--r--arch/x86/include/asm/kvm_host.h5
1 files changed, 5 insertions, 0 deletions
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6f6b8886a8eb..a87a6c98adee 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -449,6 +449,11 @@ struct kvm_pmc {
u64 eventsel;
struct perf_event *perf_event;
struct kvm_vcpu *vcpu;
+ /*
+ * eventsel value for general purpose counters,
+ * ctrl value for fixed counters.
+ */
+ u64 current_config;
};
struct kvm_pmu {