Age | Commit message (Collapse) | Author | Files | Lines |
|
Simplify the __pkvm_host_donate_hyp() and pkvm_hyp_donate_host() paths
by not using the pkvm_mem_transition machinery. As the last users of
this, also remove all the now unused code.
No functional changes intended.
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250110121936.1559655-4-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Simplify the __pkvm_host_{un}share_hyp() paths by not using the
pkvm_mem_transition machinery. As there are the last users of the
do_share()/do_unshare(), remove all the now-unused code as well.
No functional changes intended.
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250110121936.1559655-3-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Simplify the __pkvm_host_{un}share_ffa() paths by using
{check,set}_page_state_range().
No functional changes intended.
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20250110121936.1559655-2-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
|
|
* kvm-arm64/pkvm-fixed-features-6.14: (24 commits)
: .
: Complete rework of the pKVM handling of features, catching up
: with the rest of the code deals with it these days.
: Patches courtesy of Fuad Tabba. From the cover letter:
:
: "This patch series uses the vm's feature id registers to track the
: supported features, a framework similar to nested virt to set the
: trap values, and removes the need to store cptr_el2 per vcpu in
: favor of setting its value when traps are activated, as VHE mode
: does."
:
: This branch drags the arm64/for-next/cpufeature branch to solve
: ugly conflicts in -next.
: .
KVM: arm64: Fix FEAT_MTE in pKVM
KVM: arm64: Use kvm_vcpu_has_feature() directly for struct kvm
KVM: arm64: Convert the SVE guest vcpu flag to a vm flag
KVM: arm64: Remove PtrAuth guest vcpu flag
KVM: arm64: Fix the value of the CPTR_EL2 RES1 bitmask for nVHE
KVM: arm64: Refactor kvm_reset_cptr_el2()
KVM: arm64: Calculate cptr_el2 traps on activating traps
KVM: arm64: Remove redundant setting of HCR_EL2 trap bit
KVM: arm64: Remove fixed_config.h header
KVM: arm64: Rework specifying restricted features for protected VMs
KVM: arm64: Set protected VM traps based on its view of feature registers
KVM: arm64: Fix RAS trapping in pKVM for protected VMs
KVM: arm64: Initialize feature id registers for protected VMs
KVM: arm64: Use KVM extension checks for allowed protected VM capabilities
KVM: arm64: Remove KVM_ARM_VCPU_POWER_OFF from protected VMs allowed features in pKVM
KVM: arm64: Move checking protected vcpu features to a separate function
KVM: arm64: Group setting traps for protected VMs by control register
KVM: arm64: Consolidate allowed and restricted VM feature checks
arm64/sysreg: Get rid of CPACR_ELx SysregFields
arm64/sysreg: Convert *_EL12 accessors to Mapping
...
Signed-off-by: Marc Zyngier <maz@kernel.org>
# Conflicts:
# arch/arm64/kvm/fpsimd.c
# arch/arm64/kvm/hyp/nvhe/pkvm.c
|
|
* kvm-arm64/pkvm-np-guest:
: .
: pKVM support for non-protected guests using the standard MM
: infrastructure, courtesy of Quentin Perret. From the cover letter:
:
: "This series moves the stage-2 page-table management of non-protected
: guests to EL2 when pKVM is enabled. This is only intended as an
: incremental step towards a 'feature-complete' pKVM, there is however a
: lot more that needs to come on top.
:
: With that series applied, pKVM provides near-parity with standard KVM
: from a functional perspective all while Linux no longer touches the
: stage-2 page-tables itself at EL1. The majority of mm-related KVM
: features work out of the box, including MMU notifiers, dirty logging,
: RO memslots and things of that nature. There are however two gotchas:
:
: - We don't support mapping devices into guests: this requires
: additional hypervisor support for tracking the 'state' of devices,
: which will come in a later series. No device assignment until then.
:
: - Stage-2 mappings are forced to page-granularity even when backed by a
: huge page for the sake of simplicity of this series. I'm only aiming
: at functional parity-ish (from userspace's PoV) for now, support for
: HP can be added on top later as a perf improvement."
: .
KVM: arm64: Plumb the pKVM MMU in KVM
KVM: arm64: Introduce the EL1 pKVM MMU
KVM: arm64: Introduce __pkvm_tlb_flush_vmid()
KVM: arm64: Introduce __pkvm_host_mkyoung_guest()
KVM: arm64: Introduce __pkvm_host_test_clear_young_guest()
KVM: arm64: Introduce __pkvm_host_wrprotect_guest()
KVM: arm64: Introduce __pkvm_host_relax_guest_perms()
KVM: arm64: Introduce __pkvm_host_unshare_guest()
KVM: arm64: Introduce __pkvm_host_share_guest()
KVM: arm64: Introduce __pkvm_vcpu_{load,put}()
KVM: arm64: Add {get,put}_pkvm_hyp_vm() helpers
KVM: arm64: Make kvm_pgtable_stage2_init() a static inline function
KVM: arm64: Pass walk flags to kvm_pgtable_stage2_relax_perms
KVM: arm64: Pass walk flags to kvm_pgtable_stage2_mkyoung
KVM: arm64: Move host page ownership tracking to the hyp vmemmap
KVM: arm64: Make hyp_page::order a u8
KVM: arm64: Move enum pkvm_page_state to memory.h
KVM: arm64: Change the layout of enum pkvm_page_state
Signed-off-by: Marc Zyngier <maz@kernel.org>
# Conflicts:
# arch/arm64/kvm/arm.c
|
|
kvm-arm64/pkvm-fixed-features-6.14
Merge arm64/for-next/cpufeature to solve extensive conflicts
caused by the CPACR_ELx->CPACR_EL1 repainting.
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The Branch Record Buffer Extension (BRBE) adds a number of system
registers and instructions which we don't currently intend to expose to
guests. Our existing logic handles this safely, but this could be
improved with some explicit handling of BRBE.
KVM currently hides BRBE from guests: the cpufeature code's
ftr_id_aa64dfr0[] table doesn't have an entry for the BRBE field, and so
this will be zero in the sanitised value of ID_AA64DFR0 exposed to
guests via read_sanitised_id_aa64dfr0_el1().
KVM currently traps BRBE usage from guests: the default configuration of
the fine-grained trap controls HDFGRTR_EL2.{nBRBDATA,nBRBCTL,nBRBIDR}
and HFGITR_EL2.{nBRBINJ_nBRBIALL} cause these to be trapped to EL2.
Well-behaved guests shouldn't try to use the registers or instructions,
but badly-behaved guests could use these, resulting in unnecessary
warnings from KVM before it injects an UNDEF, e.g.
| kvm [197]: Unsupported guest access at: 401c98
| { Op0( 2), Op1( 1), CRn( 9), CRm( 0), Op2( 0), func_read },
| kvm [197]: Unsupported guest access at: 401d04
| { Op0( 2), Op1( 1), CRn( 9), CRm( 0), Op2( 1), func_read },
| kvm [197]: Unsupported guest access at: 401d70
| { Op0( 2), Op1( 1), CRn( 9), CRm( 2), Op2( 0), func_read },
| kvm [197]: Unsupported guest access at: 401ddc
| { Op0( 2), Op1( 1), CRn( 9), CRm( 1), Op2( 0), func_read },
| kvm [197]: Unsupported guest access at: 401e48
| { Op0( 2), Op1( 1), CRn( 9), CRm( 1), Op2( 1), func_read },
| kvm [197]: Unsupported guest access at: 401eb4
| { Op0( 2), Op1( 1), CRn( 9), CRm( 1), Op2( 2), func_read },
| kvm [197]: Unsupported guest access at: 401f20
| { Op0( 2), Op1( 1), CRn( 9), CRm( 0), Op2( 2), func_read },
| kvm [197]: Unsupported guest access at: 401f8c
| { Op0( 1), Op1( 1), CRn( 7), CRm( 2), Op2( 4), func_write },
| kvm [197]: Unsupported guest access at: 401ff8
| { Op0( 1), Op1( 1), CRn( 7), CRm( 2), Op2( 5), func_write },
As with other features that we know how to handle, these warnings aren't
particularly interesting, and we can simply treat these as UNDEFINED
without any warning. Add the necessary fine-grained undef configuration
to make this happen, as suggested by Marc Zyngier:
https://lore.kernel.org/linux-arm-kernel/86r0czk6wd.wl-maz@kernel.org/
At the same time, update read_sanitised_id_aa64dfr0_el1() to hide BRBE
from guests, as we do for SPE. This will prevent accidentally exposing
BRBE to guests if/when ftr_id_aa64dfr0[] gains a BRBE entry.
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20250109223836.419240-1-robh@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Remove hard-coded strings by using the str_enabled_disabled() helper
function.
Suggested-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20250110225310.369980-2-thorsten.blum@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
ID_AA64ISAR3_EL1 is currently marked as unallocated in KVM but does have a
number of bitfields defined in it. Expose FPRCVT and FAMINMAX, two simple
instruction only extensions to guests.
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250107-arm64-2024-dpisa-v5-4-7578da51fc3d@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Refactor nvhe stack code to use NVHE_STACK_SIZE/SHIFT constants,
instead of directly using PAGE_SIZE/SHIFT. This makes the code a bit
easier to read, without introducing any functional changes.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Link: https://lore.kernel.org/r/20241112003336.1375584-1-kaleshsingh@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The hypervisor VA space size depends on both the ID map's
(IDMAP_VA_BITS) and the kernel stage-1 (VA_BITS). However, the
hypervisor stacktrace decoding is solely relying on VA_BITS. This is
especially an issue when VA_BITS < IDMAP_VA_BITS (i.e. VA_BITS is
39-bit): the hypervisor may have addresses bigger than the stacktrace is
masking.
Align this mask with hyp_va_bits.
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Link: https://lore.kernel.org/r/20250107112821.416591-1-vdonnefort@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Make sure we do not trap access to Allocation Tags.
Fixes: b56680de9c64 ("KVM: arm64: Initialize trap register values in hyp in pKVM")
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20250106112421.65355-1-vladimir.murzin@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
It appears that on Qualcomm's x1e CPU, CNTVOFF_EL2 doesn't really
work, specially with HCR_EL2.E2H=1.
A non-zero offset results in a screaming virtual timer interrupt,
to the tune of a few 100k interrupts per second on a 4 vcpu VM.
This is also evidenced by this CPU's inability to correctly run
any of the timer selftests.
The only case this doesn't break is when this register is set to 0,
which breaks VM migration.
When HCR_EL2.E2H=0, the timer seems to behave normally, and does
not result in an interrupt storm.
As a workaround, use the fact that this CPU implements FEAT_ECV,
and trap all accesses to the virtual timer and counter, keeping
CNTVOFF_EL2 set to zero, and emulate accesses to CVAL/TVAL/CTL
and the counter itself, fixing up the timer to account for the
missing offset.
And if you think this is disgusting, you'd probably be right.
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20241217142321.763801-12-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Inject some sanity in CNTHCTL_EL2, ensuring that we don't handle
more than we advertise to the guest.
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20241217142321.763801-11-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Allow a guest hypervisor to trap accesses to CNT{P,V}CT_EL02 by
propagating these trap bits to the host trap configuration.
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20241217142321.763801-10-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
For completeness, fun, and cerebral meltdown, add the virtualisation
related traps to the counter and timers.
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20241217142321.763801-9-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
We already deal with CNTPCT_EL0 accesses in non-HYP context.
Let's add CNTVCT_EL0 as a good measure.
This is also an opportunity to simplify things and make it
plain that this code is only for non-HYP context handling.
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20241217142321.763801-8-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Similarly to handling the physical timer accesses early when FEAT_ECV
causes a trap, we try to handle the physical counter without returning
to the general sysreg handling.
More surprisingly, we introduce something similar for the virtual
counter. Although this isn't necessary yet, it will prove useful on
systems that have a broken CNTVOFF_EL2 implementation. Yes, they exist.
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20241217142321.763801-7-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Although FEAT_ECV allows us to correctly emulate the timers, it also
reduces performances pretty badly.
Mitigate this by emulating the CTL/CVAL register reads in the
inner run loop, without returning to the general kernel.
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20241217142321.763801-6-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Although FEAT_NV2 makes most things fast, it also makes it impossible
to correctly emulate the timers, as the sysreg accesses are redirected
to memory.
FEAT_ECV addresses this by giving a hypervisor the ability to trap
the EL02 sysregs as well as the virtual timer.
Add the required trap setting to make use of the feature, allowing
us to elide the ugly resync in the middle of the run loop.
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20241217142321.763801-5-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
With FEAT_NV2, the EL0 timer state is entirely stored in memory,
meaning that the hypervisor can only provide a very poor emulation.
The only thing we can really do is to publish the interrupt state
in the guest view of CNT{P,V}_CTL_EL0, and defer everything else
to the next exit.
Only FEAT_ECV will allow us to fix it, at the cost of extra trapping.
Suggested-by: Chase Conklin <chase.conklin@arm.com>
Suggested-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20241217142321.763801-4-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Emulating the timers with FEAT_NV2 is a bit odd, as the timers
can be reconfigured behind our back without the hypervisor even
noticing. In the VHE case, that's an actual regression in the
architecture...
Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20241217142321.763801-3-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Add the required handling for EL2 and EL02 registers, as
well as EL1 registers used in the E2H context. This includes
handling the virtual timer accesses when CNTHCTL_EL2.EL1TVT
or CNTHCTL_EL2.EL1TVCT are set.
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20241217142321.763801-2-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Although we never supported 32bit anywhere in NV, we fail to
advertise so for EL0, probably owing to the relative lack of
hardware supporting both NV2 and 32bit EL0.
Add some sanitising to ID_AA64PFR0_EL1.EL0, and reaffirm that
"in 64bit-only we trust".
Reported-by: Oliver Upton <oliver.upton@linux.dev>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Now that we have introduced kvm_vcpu_has_feature(), use it in the
remaining code that checks for features in struct kvm, instead of
using the __vcpu_has_feature() helper.
No functional change intended.
Suggested-by: Quentin Perret <qperret@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241216105057.579031-18-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The vcpu flag GUEST_HAS_SVE is per-vcpu, but it is based on what
is now a per-vm feature. Make the flag per-vm.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241216105057.579031-17-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The vcpu flag GUEST_HAS_PTRAUTH is always associated with the
vcpu PtrAuth features, which are defined per vm rather than per
vcpu.
Remove the flag, and replace it with checks for the features
instead.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241216105057.579031-16-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Similar to VHE, calculate the value of cptr_el2 from scratch on
activate traps. This removes the need to store cptr_el2 in every
vcpu structure. Moreover, some traps, such as whether the guest
owns the fp registers, need to be set on every vcpu run.
Reported-by: James Clark <james.clark@linaro.org>
Fixes: 5294afdbf45a ("KVM: arm64: Exclude FP ownership from kvm_vcpu_arch")
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241216105057.579031-13-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
In hVHE mode, HCR_E2H should be set for both protected and
non-protected VMs. Since commit b56680de9c64 ("KVM: arm64:
Initialize trap register values in hyp in pKVM"), this has been
fixed, and the setting of the flag here is redundant.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241216105057.579031-12-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The few remaining items needed in fixed_config.h are better
suited for pkvm.h. Move them there and delete it.
No functional change intended.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241216105057.579031-11-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The existing code didn't properly distinguish between signed and
unsigned features, and was difficult to read and to maintain.
Rework it using the same method used in other parts of KVM when
handling vcpu features.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241216105057.579031-10-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Now that the VM's feature id registers are initialized with the
values of the supported features, use those values to determine
which traps to set using kvm_has_feature().
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241216105057.579031-9-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Trap RAS in pKVM if not supported at all for protected VMs. The
RAS version doesn't matter in this case.
Fixes: 2a0c343386ae ("KVM: arm64: Initialize trap registers for protected VMs")
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241216105057.579031-8-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The hypervisor maintains the state of protected VMs. Initialize
the values for feature ID registers for protected VMs, to be used
when setting traps and when advertising features to protected
VMs.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241216105057.579031-7-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Use KVM extension checks as the source for determining which
capabilities are allowed for protected VMs. KVM extension checks
is the natural place for this, since it is also the interface
exposed to users.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241216105057.579031-6-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
features in pKVM
The hypervisor is responsible for the power state of protected
VMs in pKVM. Therefore, remove KVM_ARM_VCPU_POWER_OFF from the
list of allowed features for protected VMs.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241216105057.579031-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
At the moment, checks for supported vcpu features for protected
VMs are build-time bugs. In the following patch, they will become
runtime checks based on the vcpu's features registers. Therefore,
consolidate them into one function that would return an error if
it encounters an unsupported feature.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241216105057.579031-4-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Group setting protected VM traps by control register rather than
feature id register, since some trap values (e.g., PAuth), depend
on more than one feature id register.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241216105057.579031-3-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The definitions for features allowed and allowed with
restrictions for protected guests, which are based on feature
registers, were defined and checked for separately, even though
they are handled in the same way. This could result in missing
checks for certain features, e.g., pointer authentication,
causing traps for allowed features.
Consolidate the definitions into one. Use that new definition to
construct the guest view of the feature registers for
consistency.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241216105057.579031-2-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Introduce the KVM_PGT_CALL() helper macro to allow switching from the
traditional pgtable code to the pKVM version easily in mmu.c. The cost
of this 'indirection' is expected to be very minimal due to
is_protected_kvm_enabled() being backed by a static key.
With this, everything is in place to allow the delegation of
non-protected guest stage-2 page-tables to pKVM, so let's stop using the
host's kvm_s2_mmu from EL2 and enjoy the ride.
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20241218194059.3670226-19-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Introduce a set of helper functions allowing to manipulate the pKVM
guest stage-2 page-tables from EL1 using pKVM's HVC interface.
Each helper has an exact one-to-one correspondance with the traditional
kvm_pgtable_stage2_*() functions from pgtable.c, with a strictly
matching prototype. This will ease plumbing later on in mmu.c.
These callbacks track the gfn->pfn mappings in a simple rb_tree indexed
by IPA in lieu of a page-table. This rb-tree is kept in sync with pKVM's
state and is protected by the mmu_lock like a traditional stage-2
page-table.
Signed-off-by: Quentin Perret <qperret@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Link: https://lore.kernel.org/r/20241218194059.3670226-18-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Introduce a new hypercall to flush the TLBs of non-protected guests. The
host kernel will be responsible for issuing this hypercall after changing
stage-2 permissions using the __pkvm_host_relax_guest_perms() or
__pkvm_host_wrprotect_guest() paths. This is left under the host's
responsibility for performance reasons.
Note however that the TLB maintenance for all *unmap* operations still
remains entirely under the hypervisor's responsibility for security
reasons -- an unmapped page may be donated to another entity, so a stale
TLB entry could be used to leak private data.
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20241218194059.3670226-17-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Plumb the kvm_pgtable_stage2_mkyoung() callback into pKVM for
non-protected guests. It will be called later from the fault handling
path.
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20241218194059.3670226-16-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Plumb the kvm_stage2_test_clear_young() callback into pKVM for
non-protected guest. It will be later be called from MMU notifiers.
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20241218194059.3670226-15-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Introduce a new hypercall to remove the write permission from a
non-protected guest stage-2 mapping. This will be used for e.g. enabling
dirty logging.
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20241218194059.3670226-14-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Introduce a new hypercall allowing the host to relax the stage-2
permissions of mappings in a non-protected guest page-table. It will be
used later once we start allowing RO memslots and dirty logging.
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20241218194059.3670226-13-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
In preparation for letting the host unmap pages from non-protected
guests, introduce a new hypercall implementing the host-unshare-guest
transition.
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20241218194059.3670226-12-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
In preparation for handling guest stage-2 mappings at EL2, introduce a
new pKVM hypercall allowing to share pages with non-protected guests.
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20241218194059.3670226-11-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Rather than look-up the hyp vCPU on every run hypercall at EL2,
introduce a per-CPU 'loaded_hyp_vcpu' tracking variable which is updated
by a pair of load/put hypercalls called directly from
kvm_arch_vcpu_{load,put}() when pKVM is enabled.
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Link: https://lore.kernel.org/r/20241218194059.3670226-10-qperret@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|