path: root/arch/powerpc
2022-01-19  Merge branch 'kvm-pi-raw-spinlock' into HEAD  (Paolo Bonzini; 3 files, -11/+37)
    Bring in fix for VT-d posted interrupts before further changing the code in 5.17.

    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-28  Merge tag 'powerpc-5.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux  (Linus Torvalds; 1 file, -1/+1)
    Pull powerpc fix from Michael Ellerman:
    "Fix DEBUG_WX never reporting any WX mappings, due to use of an incorrect config symbol since we converted to using generic ptdump"

    * tag 'powerpc-5.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
      powerpc/ptdump: Fix DEBUG_WX since generic ptdump conversion
2021-12-21  powerpc/ptdump: Fix DEBUG_WX since generic ptdump conversion  (Michael Ellerman; 1 file, -1/+1)
    In note_prot_wx() we bail out without reporting anything if CONFIG_PPC_DEBUG_WX is disabled. But CONFIG_PPC_DEBUG_WX was removed in the conversion to generic ptdump; we now need to use CONFIG_DEBUG_WX instead.

    Fixes: e084728393a5 ("powerpc/ptdump: Convert powerpc to GENERIC_PTDUMP")
    Cc: stable@vger.kernel.org # v5.15+
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Link: https://lore.kernel.org/r/20211203124112.2912562-1-mpe@ellerman.id.au
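    A minimal sketch of the shape of the fix, reconstructed from the description above (the exact upstream diff may differ slightly):

        static void note_prot_wx(struct pg_state *st, unsigned long addr)
        {
                /* CONFIG_PPC_DEBUG_WX no longer exists, so testing it
                 * always bailed out; test the generic symbol instead */
                if (!IS_ENABLED(CONFIG_DEBUG_WX) || !st->check_wx)
                        return;
                /* ... report the WX mapping ... */
        }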
2021-12-19  Merge tag 'powerpc-5.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux  (Linus Torvalds; 2 files, -10/+36)
    Pull powerpc fixes from Michael Ellerman:
    "Fix a recently introduced oops at boot on 85xx in some configurations.
    Fix crashes when loading some livepatch modules with STRICT_MODULE_RWX.
    Thanks to Joe Lawrence, Russell Currey, and Xiaoming Ni"

    * tag 'powerpc-5.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
      powerpc/module_64: Fix livepatching for RO modules
      powerpc/85xx: Fix oops when CONFIG_FSL_PMC=n
2021-12-19  Merge branch 'topic/ppc-kvm' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux into HEAD  (Paolo Bonzini; 29 files, -730/+1558)
    Fix conflicts between memslot overhaul and commit 511d25d6b789f ("KVM: PPC: Book3S: Suppress warnings when allocating too big memory slots") from the powerpc tree.
2021-12-14  powerpc/module_64: Fix livepatching for RO modules  (Russell Currey; 1 file, -8/+34)
    Livepatching a loaded module involves applying relocations through apply_relocate_add(), which attempts to write to read-only memory when CONFIG_STRICT_MODULE_RWX=y. Work around this by performing these writes through the text poke area, using patch_instruction().

    R_PPC_REL24 is the only relocation type generated by the kpatch-build userspace tool or klp-convert kernel tree that I observed applying a relocation to a post-init module. A more comprehensive solution is planned, but using patch_instruction() for R_PPC_REL24 should serve as a sufficient fix in the meantime.

    This does have a performance impact: I observed ~15% overhead in module_load() on POWER8 bare metal with checksum verification off.

    Fixes: c35717c71e98 ("powerpc: Set ARCH_HAS_STRICT_MODULE_RWX")
    Cc: stable@vger.kernel.org # v5.14+
    Reported-by: Joe Lawrence <joe.lawrence@redhat.com>
    Signed-off-by: Russell Currey <ruscur@russell.cc>
    Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
    [mpe: Check return codes from patch_instruction()]
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211214121248.777249-1-mpe@ellerman.id.au
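    An illustrative sketch of the technique, not the upstream hunk (the real change lives inside apply_relocate_add() and handles more detail):

        case R_PPC_REL24:
                /* value holds the fully relocated branch instruction.
                 * Writing it directly faults once module text is RO, so
                 * route the store through the text poke area instead. */
                if (patch_instruction((u32 *)location, ppc_inst(value)))
                        return -EFAULT;
                break;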
2021-12-14  KVM: PPC: Book3S HV P9: Use kvm_arch_vcpu_get_wait() to get rcuwait object  (Sean Christopherson; 1 file, -2/+3)
    Use kvm_arch_vcpu_get_wait() to get a vCPU's rcuwait object instead of using vcpu->wait directly in kvmhv_run_single_vcpu(). Functionally, this is a nop as vcpu->arch.waitp is guaranteed to point at vcpu->wait. But that is not obvious at first glance, and a future change coming in via the KVM tree, commit 510958e99721 ("KVM: Force PPC to define its own rcuwait object"), will hide vcpu->wait from architectures that define __KVM_HAVE_ARCH_WQP to prevent generic KVM from attempting to wake a vCPU with the wrong rcuwait object.

    Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211213174556.3871157-1-seanjc@google.com
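    For context, the accessor in include/linux/kvm_host.h has roughly this shape (paraphrased, not copied verbatim from the tree):

        static inline struct rcuwait *kvm_arch_vcpu_get_wait(struct kvm_vcpu *vcpu)
        {
        #ifdef __KVM_HAVE_ARCH_WQP
                return vcpu->arch.waitp;   /* PPC supplies its own rcuwait */
        #else
                return &vcpu->wait;        /* the common case */
        #endif
        }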
2021-12-09  KVM: powerpc: Use Makefile.kvm for common files  (David Woodhouse; 1 file, -6/+2)
    It's all fairly baroque, but in the end I don't think there's any reason for $(KVM)/irqchip.o to have been handled differently: the common objects all end up in $(kvm-y) anyway, regardless of whether they get there via $(common-objs-y) or the CPU-specific object lists.

    Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
    Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
    Message-Id: <20211121125451.9489-7-dwmw2@infradead.org>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-08  Merge branch 'kvm-on-hv-msrbm-fix' into HEAD  (Paolo Bonzini; 3 files, -4/+8)
    Merge bugfix for enlightened MSR Bitmap, before adding support to KVM for exposing the feature to nested guests.
2021-12-08  KVM: Rename kvm_vcpu_block() => kvm_vcpu_halt()  (Sean Christopherson; 4 files, -4/+4)
    Rename kvm_vcpu_block() to kvm_vcpu_halt() in preparation for splitting the actual "block" sequences into a separate helper (to be named kvm_vcpu_block()). x86 will use the standalone block-only path to handle non-halt cases where the vCPU is not runnable.

    Rename block_ns to halt_ns to match the new function name.

    No functional change intended.

    Reviewed-by: David Matlack <dmatlack@google.com>
    Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20211009021236.4122790-14-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-08  KVM: Drop obsolete kvm_arch_vcpu_block_finish()  (Sean Christopherson; 1 file, -1/+0)
    Drop kvm_arch_vcpu_block_finish() now that all arch implementations are nops.

    No functional change intended.

    Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
    Reviewed-by: David Matlack <dmatlack@google.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20211009021236.4122790-10-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-08  KVM: Force PPC to define its own rcuwait object  (Sean Christopherson; 2 files, -1/+3)
    Do not define/reference kvm_vcpu.wait if __KVM_HAVE_ARCH_WQP is true, and instead force the architecture (PPC) to define its own rcuwait object. Allowing common KVM to directly access vcpu->wait without a guard makes it all too easy to introduce potential bugs, e.g. kvm_vcpu_block(), kvm_vcpu_on_spin(), and async_pf_execute() all operate on vcpu->wait, not the result of kvm_arch_vcpu_get_wait(), and so may do the wrong thing for PPC.

    Due to PPC's shenanigans with respect to callbacks and waits (it switches to the virtual core's wait object at KVM_RUN!?!?), it's not clear whether or not this fixes any bugs.

    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20211009021236.4122790-5-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-08  KVM: Keep memslots in tree-based structures instead of array-based ones  (Maciej S. Szmigiero; 4 files, -12/+13)
    The current memslot code uses a (reverse gfn-ordered) memslot array for keeping track of them. Because the memslot array that is currently in use cannot be modified, every memslot management operation (create, delete, move, change flags) has to make a copy of the whole array so it has a scratch copy to work on.

    Strictly speaking, however, it is only necessary to make a copy of the memslot that is being modified; copying all the memslots currently present is just a limitation of the array-based memslot implementation. Two memslot sets, however, are still needed so the VM continues to run on the currently active set while the requested operation is being performed on the second, currently inactive one.

    In order to have two memslot sets, but only one copy of the actual memslots, it is necessary to split out the memslot data from the memslot sets. The memslots themselves should also be kept independent of each other so they can be individually added or deleted.

    These two memslot sets should normally point to the same set of memslots. They can, however, be desynchronized when performing a memslot management operation by replacing the memslot to be modified by its copy. After the operation is complete, both memslot sets once again point to the same, common set of memslot data.

    This commit implements the aforementioned idea. For tracking of gfns an ordinary rbtree is used, since memslots cannot overlap in the guest address space and so this data structure is sufficient for ensuring that lookups are done quickly.

    The "last used slot" mini-caches (both the per-slot-set one and the per-vCPU one), which keep track of the last found-by-gfn memslot, are still present in the new code.

    Co-developed-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
    Message-Id: <17c0cf3663b760a0d3753d4ac08c0753e941b811.1638817641.git.maciej.szmigiero@oracle.com>
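    A sketch of the gfn lookup this enables; the field and helper names here are illustrative rather than the exact upstream ones:

        static struct kvm_memory_slot *find_slot_by_gfn(struct rb_root *root,
                                                        gfn_t gfn)
        {
                struct rb_node *node = root->rb_node;

                while (node) {
                        struct kvm_memory_slot *slot =
                                container_of(node, struct kvm_memory_slot,
                                             gfn_node);

                        if (gfn < slot->base_gfn)
                                node = node->rb_left;
                        else if (gfn >= slot->base_gfn + slot->npages)
                                node = node->rb_right;
                        else
                                return slot; /* slots never overlap by gfn */
                }
                return NULL;
        }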
2021-12-08  KVM: Use interval tree to do fast hva lookup in memslots  (Maciej S. Szmigiero; 1 file, -0/+1)
    The current memslots implementation only allows quick binary search by gfn; quick lookup by hva is not possible, as the implementation has to do a linear scan of the whole memslots array even though the operation being performed might apply to just a single memslot. This significantly hurts performance of per-hva operations with higher memslot counts.

    Since hva ranges can overlap between memslots, an interval tree is needed for tracking them.

    [sean: handle interval tree updates in kvm_replace_memslot()]
    Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
    Message-Id: <d66b9974becaa9839be9c4e1a5de97b177b4ac20.1638817640.git.maciej.szmigiero@oracle.com>
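    A sketch of why an interval tree fits here, using the generic helpers from <linux/interval_tree.h> (the hva_tree/hva_node names are assumptions for illustration):

        struct interval_tree_node *node;

        /* hva ranges may overlap, so walk every slot intersecting
         * [start, last] instead of scanning the whole array */
        for (node = interval_tree_iter_first(&slots->hva_tree, start, last);
             node;
             node = interval_tree_iter_next(node, start, last)) {
                struct kvm_memory_slot *slot =
                        container_of(node, struct kvm_memory_slot, hva_node);
                /* ... operate on just this overlapping slot ... */
        }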
2021-12-08  KVM: Stop passing kvm_userspace_memory_region to arch memslot hooks  (Sean Christopherson; 1 file, -2/+0)
    Drop the @mem param from kvm_arch_{prepare,commit}_memory_region() now that its use has been removed in all architectures.

    No functional change intended.

    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
    Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
    Message-Id: <aa5ed3e62c27e881d0d8bc0acbc1572bc336dc19.1638817640.git.maciej.szmigiero@oracle.com>
2021-12-08  KVM: PPC: Avoid referencing userspace memory region in memslot updates  (Sean Christopherson; 6 files, -23/+7)
    For PPC HV, get the number of pages directly from the new memslot instead of computing the same from the userspace memory region, and explicitly check for !DELETE instead of inferring the same when toggling mmio_update. The motivation for these changes is to avoid referencing the @mem param so that it can be dropped in a future commit.

    No functional change intended.

    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
    Message-Id: <1e97fb5198be25f98ef82e63a8d770c682264cc9.1638817639.git.maciej.szmigiero@oracle.com>
2021-12-08  KVM: Let/force architectures to deal with arch specific memslot data  (Sean Christopherson; 6 files, -25/+33)
    Pass the "old" slot to kvm_arch_prepare_memory_region() and force arch code to handle propagating arch specific data from "new" to "old" when necessary. This is a baby step towards dynamically allocating "new" from the get go, and is a (very) minor performance boost on x86 due to not unnecessarily copying arch data.

    For PPC HV, copy the rmap in the !CREATE and !DELETE paths, i.e. for MOVE and FLAGS_ONLY. This is functionally a nop as the previous behavior would overwrite the pointer for CREATE, and eventually discard/ignore it for DELETE.

    For x86, copy the arch data only for FLAGS_ONLY changes. Unlike PPC HV, x86 needs to reallocate arch data in the MOVE case as the size of x86's allocations depends on the alignment of the memslot's gfn.

    Opportunistically tweak kvm_arch_prepare_memory_region()'s param order to match the "commit" prototype.

    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
    [mss: add missing RISCV kvm_arch_prepare_memory_region() change]
    Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
    Message-Id: <67dea5f11bbcfd71e3da5986f11e87f5dd4013f9.1638817639.git.maciej.szmigiero@oracle.com>
2021-12-08  KVM: Use 'unsigned long' as kvm_for_each_vcpu()'s index  (Marc Zyngier; 10 files, -24/+27)
    Everywhere we use kvm_for_each_vcpu(), we use an int as the vcpu index. Unfortunately, we're about to rework the iterator, which requires this to be upgraded to an unsigned long. Let's bite the bullet and repaint all of it in one go.

    Signed-off-by: Marc Zyngier <maz@kernel.org>
    Message-Id: <20211116160403.4074052-7-maz@kernel.org>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
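    The shape of the repaint at every call site (a representative example, not a specific upstream hunk):

        unsigned long i;        /* was: int i; */
        struct kvm_vcpu *vcpu;

        kvm_for_each_vcpu(i, vcpu, kvm) {
                /* ... per-vCPU work, unchanged ... */
        }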
2021-12-08  KVM: Move wiping of the kvm->vcpus array to common code  (Marc Zyngier; 1 file, -9/+1)
    All architectures have similar loops iterating over the vcpus, freeing one vcpu at a time, and eventually wiping the reference off the vcpus array. They are also inconsistent about taking the kvm->lock mutex when wiping the references from the array.

    Make this code common, which will simplify further changes. The locking is dropped altogether, as this should only be called when there are no further references on the kvm structure.

    Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
    Signed-off-by: Marc Zyngier <maz@kernel.org>
    Message-Id: <20211116160403.4074052-2-maz@kernel.org>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
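    A sketch of the common helper the description implies (paraphrased, not the verbatim patch):

        void kvm_destroy_vcpus(struct kvm *kvm)
        {
                unsigned long i;
                struct kvm_vcpu *vcpu;

                kvm_for_each_vcpu(i, vcpu, kvm) {
                        kvm_vcpu_destroy(vcpu);
                        kvm->vcpus[i] = NULL;   /* wipe the reference */
                }
                atomic_set(&kvm->online_vcpus, 0);
                /* no kvm->lock here: by now nothing else may hold a
                 * reference to the kvm structure */
        }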
2021-12-02  KVM: PPC: Book3S: Suppress failed alloc warning in H_COPY_TOFROM_GUEST  (Alexey Kardashevskiy; 1 file, -1/+1)
    H_COPY_TOFROM_GUEST is an hcall for an upper level VM to access its nested VMs' memory. Userspace can trigger WARN_ON_ONCE(!(gfp & __GFP_NOWARN)) in __alloc_pages() by constructing a tiny VM which only does H_COPY_TOFROM_GUEST with a too big GPR9 (number of bytes to copy).

    This silences the warning by adding __GFP_NOWARN.

    Spotted by syzkaller.

    Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
    Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210901084550.1658699-1-aik@ozlabs.ru
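    The one-line shape of such a fix (a sketch; variable names are illustrative):

        /* n is guest-controlled, so a failed allocation is expected
         * behaviour here, not a condition worth a kernel splat */
        buf = kzalloc(n, GFP_KERNEL | __GFP_NOWARN);
        if (!buf)
                return -ENOMEM;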
2021-12-02  KVM: PPC: Book3S: Suppress warnings when allocating too big memory slots  (Alexey Kardashevskiy; 1 file, -2/+6)
    Userspace can trigger "vmalloc size %lu allocation failure: exceeds total pages" via the KVM_SET_USER_MEMORY_REGION ioctl.

    This silences the warning by checking the limit before calling vzalloc() and returning -ENOMEM on failure. It does not call the underlying vmalloc helpers, as __vmalloc_node() is only exported when CONFIG_TEST_VMALLOC_MODULE and __vmalloc_node_range() is not exported at all.

    Spotted by syzkaller.

    Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
    [mpe: Use 'size' for the variable rather than 'cb']
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210901084512.1658628-1-aik@ozlabs.ru
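    A sketch of the bound check described above (the exact expression in the patch may differ):

        /* refuse obviously impossible sizes up front, so vzalloc()
         * never gets the chance to warn */
        if ((size >> PAGE_SHIFT) > totalram_pages())
                return -ENOMEM;

        info = vzalloc(size);
        if (!info)
                return -ENOMEM;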
2021-12-02  KVM: PPC: Book3S HV P9: Remove unused ri_set local variable  (Nicholas Piggin; 1 file, -10/+3)
    ri_set is set and never used.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211201052112.2137167-1-npiggin@gmail.com
2021-11-29  powerpc/85xx: Fix oops when CONFIG_FSL_PMC=n  (Xiaoming Ni; 1 file, -2/+2)
    When CONFIG_FSL_PMC is set to n, no value is assigned to cpu_up_prepare in the mpc85xx_pm_ops structure. As a result, an oops is triggered in smp_85xx_start_cpu():

        smp: Bringing up secondary CPUs ...
        kernel tried to execute user page (0) - exploit attempt? (uid: 0)
        BUG: Unable to handle kernel instruction fetch (NULL pointer?)
        Faulting instruction address: 0x00000000
        Oops: Kernel access of bad area, sig: 11 [#1]
        ...
        NIP [00000000] 0x0
        LR [c0021d2c] smp_85xx_kick_cpu+0xe8/0x568
        Call Trace:
        [c1051da8] [c0021cb8] smp_85xx_kick_cpu+0x74/0x568 (unreliable)
        [c1051de8] [c0011460] __cpu_up+0xc0/0x228
        [c1051e18] [c0031bbc] bringup_cpu+0x30/0x224
        [c1051e48] [c0031f3c] cpu_up.constprop.0+0x180/0x33c
        [c1051e88] [c00322e8] bringup_nonboot_cpus+0x88/0xc8
        [c1051eb8] [c07e67bc] smp_init+0x30/0x78
        [c1051ed8] [c07d9e28] kernel_init_freeable+0x118/0x2a8
        [c1051f18] [c00032d8] kernel_init+0x14/0x124
        [c1051f38] [c0010278] ret_from_kernel_thread+0x14/0x1c

    Fixes: c45361abb918 ("powerpc/85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n")
    Reported-by: Martin Kennedy <hurricos@gmail.com>
    Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
    Tested-by: Martin Kennedy <hurricos@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211126041153.16926-1-nixiaoming@huawei.com
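    One defensive shape for this class of bug, as an illustrative sketch only (see the Link above for the actual two-line fix, which may take a different approach):

        /* cpu_up_prepare is left NULL when CONFIG_FSL_PMC=n, so never
         * jump through the hook unconditionally */
        if (qoriq_pm_ops && qoriq_pm_ops->cpu_up_prepare)
                qoriq_pm_ops->cpu_up_prepare(cpu);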
2021-11-27  Merge tag 'powerpc-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux  (Linus Torvalds; 2 files, -4/+7)
    Pull powerpc fixes from Michael Ellerman:
    "Fix KVM using a Power9 instruction on earlier CPUs, which could lead to the host SLB being incorrectly invalidated and a subsequent host crash.
    Fix kernel hardlockup on vmap stack overflow on 32-bit.
    Thanks to Christophe Leroy, Nicholas Piggin, and Fabiano Rosas"

    * tag 'powerpc-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
      powerpc/32: Fix hardlockup on vmap stack overflow
      KVM: PPC: Book3S HV: Prevent POWER7/8 TLB flush flushing SLB
2021-11-25  Merge tag 'asm-generic-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic  (Linus Torvalds; 1 file, -0/+1)
    Pull asm-generic syscall table update from Arnd Bergmann:
    "André Almeida sends an update for the newly added futex_waitv syscall that was initially only added to a few architectures. Some additional ones have since made it through architecture maintainer trees; this finishes the remaining ones"

    * tag 'asm-generic-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
      futex: Wireup futex_waitv syscall
2021-11-25  futex: Wireup futex_waitv syscall  (André Almeida; 1 file, -0/+1)
    Wire up the futex_waitv syscall for all remaining archs.

    Signed-off-by: André Almeida <andrealmeid@collabora.com>
    Acked-by: Max Filippov <jcmvbkbc@gmail.com>
    Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
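    On powerpc this amounts to one new row in arch/powerpc/kernel/syscalls/syscall.tbl, roughly (449 is futex_waitv's syscall number across architectures):

        449    common    futex_waitv    sys_futex_waitv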
2021-11-24  KVM: PPC: Book3S HV P9: Remove subcore HMI handling  (Nicholas Piggin; 5 files, -9/+67)
    On POWER9 and newer, rather than the complex HMI synchronisation and subcore state, have each thread un-apply the guest TB offset before calling into the early HMI handler.

    This allows the subcore state to be avoided, including subcore enter/exit guest, which includes an expensive divide that shows up slightly in profiles.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-54-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Stop using vc->dpdes  (Nicholas Piggin; 3 files, -12/+22)
    The P9 path uses vc->dpdes only for msgsndp / SMT emulation. This adds an ordering requirement between vcpu->doorbell_request and vc->dpdes for no real benefit. Use vcpu->doorbell_request directly.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-53-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Tidy kvmppc_create_dtl_entry  (Nicholas Piggin; 1 file, -25/+35)
    This goes further towards removing vcores from the P9 path. Also avoid the memset in favour of explicitly initialising all fields.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-52-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Remove most of the vcore logic  (Nicholas Piggin; 1 file, -62/+85)
    The P9 path always uses one vcpu per vcore, so none of the vcore, locks, stolen time, blocking logic, shared waitq, etc., is required. Remove most of it.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-51-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Avoid cpu_in_guest atomics on entry and exit  (Nicholas Piggin; 3 files, -19/+22)
    cpu_in_guest is set to determine if a CPU needs to be IPI'ed to exit the guest and notice the need_tlb_flush bit. This can be implemented as a global per-CPU pointer to the currently running guest instead of per-guest cpumasks, saving 2 atomics per entry/exit.

    P7/8 doesn't require cpu_in_guest, nor does a nested HV (only the L0 does), so move it to the P9 HV path.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-50-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Add unlikely annotation for !mmu_ready  (Nicholas Piggin; 1 file, -1/+1)
    The mmu will almost always be ready.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-49-npiggin@gmail.com
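    The shape of such an annotation (a sketch; the surrounding code is paraphrased):

        /* tell the compiler the MMU setup path is cold */
        if (unlikely(!kvm->arch.mmu_ready)) {
                r = kvmhv_setup_mmu(vcpu);
                if (r)
                        return r;
        }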
2021-11-24  KVM: PPC: Book3S HV P9: Avoid changing MSR[RI] in entry and exit  (Nicholas Piggin; 1 file, -27/+23)
    kvm_hstate.in_guest provides the equivalent of MSR[RI]=0 protection, and it covers the existing MSR[RI]=0 section in late entry and early exit, so clearing and setting MSR[RI] in those cases does not actually do anything useful.

    Remove the RI manipulation and replace it with comments. Make the in_guest memory accesses a bit closer to a proper critical section pattern. This speeds up guest entry/exit performance.

    This also removes the MSR[RI] warnings which aren't very interesting and would cause crashes if they hit due to causing an interrupt in non-recoverable code.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-48-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Optimise hash guest SLB saving  (Nicholas Piggin; 1 file, -4/+18)
    slbmfee/slbmfev instructions are very expensive, more so than a regular mfspr instruction, so minimising them significantly improves hash guest exit performance. The slbmfev is only required if slbmfee found a valid SLB entry.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-47-npiggin@gmail.com
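    A sketch of the resulting save loop; the helper names are illustrative (SLB_ESID_V is the real valid bit):

        for (i = 0; i < vcpu->arch.slb_nr; i++) {
                u64 slbee, slbev;

                slbee = read_slbmfee(i);            /* always needed */
                if (slbee & SLB_ESID_V) {
                        /* only pay for the second expensive read
                         * when the entry is actually valid */
                        slbev = read_slbmfev(i);
                        save_slb_entry(vcpu, i, slbee, slbev);
                }
        }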
2021-11-24  KVM: PPC: Book3S HV P9: Improve mfmsr performance on entry  (Nicholas Piggin; 3 files, -39/+47)
    Rearrange the MSR saving on entry so it does not follow the mtmsrd to disable interrupts, avoiding a possible RAW scoreboard stall.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-46-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV Nested: Avoid extra mftb() in nested entry  (Nicholas Piggin; 3 files, -5/+13)
    mftb() is expensive and one can be avoided on nested guest dispatch.

    If the time checking code distinguishes between the L0 timer and the nested HV timer, then both can be tested in the same place with the same mftb() value. This also nicely illustrates the relationship between the L0 and nested HV timers.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-45-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Avoid tlbsync sequence on radix guest exit  (Nicholas Piggin; 3 files, -37/+65)
    Use the existing TLB flushing logic to IPI the previous CPU and run the necessary barriers before running a guest vCPU on a new physical CPU, to do the necessary radix GTSE barriers for handling the case of an interrupted guest tlbie sequence.

    This requires the vCPU TLB flush sequence that is currently just done on one thread to be expanded to ensure the other threads execute a ptesync, because causing them to exit the guest will no longer cause a ptesync by itself.

    This results in more IPIs than the TLB flush logic requires, but it's a significant win for common case scheduling when the vCPU remains on the same physical CPU.

    This saves about 520 cycles (nearly 10%) on a guest entry+exit micro benchmark on a POWER9.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-44-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV: Split P8 from P9 path guest vCPU TLB flushing  (Nicholas Piggin; 3 files, -48/+70)
    This creates separate functions for old and new paths for vCPU TLB flushing, which will reduce complexity of the next change.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-43-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Don't restore PSSCR if not needed  (Nicholas Piggin; 2 files, -9/+24)
    This also moves the PSSCR update in nested entry to avoid a SPR scoreboard stall.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-42-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Test dawr_enabled() before saving host DAWR SPRs  (Nicholas Piggin; 1 file, -14/+20)
    Some of the DAWR SPR access is already predicated on dawr_enabled(); apply this to the remainder of the accesses.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-41-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Comment and fix MMU context switching code  (Nicholas Piggin; 3 files, -13/+42)
    Tighten up partition switching code synchronisation and comments.

    In particular, hwsync ; isync is required after the last access that is performed in the context of a partition, before the partition is switched away from.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-40-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Use Linux SPR save/restore to manage some host SPRs  (Nicholas Piggin; 5 files, -51/+73)
    Linux implements SPR save/restore, including storage space for the registers, in the task struct for process context switching. Make use of this similarly to the way we make use of the context-switching fp/vec save/restore.

    This improves code reuse, allows some stack space to be saved, and helps with avoiding VRSAVE updates if they are not required.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-39-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Demand fault TM facility registers  (Nicholas Piggin; 3 files, -10/+34)
    Use HFSCR facility disabling to implement demand faulting for TM, with a hysteresis counter similar to the load_fp etc counters in context switching that implement the equivalent demand faulting for userspace facilities.

    This speeds up guest entry/exit by avoiding the register save/restore when a guest is not frequently using them. When a guest does use them often, there will be some additional demand fault overhead, but these are not commonly used facilities.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-38-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Demand fault EBB facility registers  (Nicholas Piggin; 3 files, -8/+37)
    Use HFSCR facility disabling to implement demand faulting for EBB, with a hysteresis counter similar to the load_fp etc counters in context switching that implement the equivalent demand faulting for userspace facilities.

    This speeds up guest entry/exit by avoiding the register save/restore when a guest is not frequently using them. When a guest does use them often, there will be some additional demand fault overhead, but these are not commonly used facilities.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-37-npiggin@gmail.com
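    A conceptual sketch of the demand-faulting pattern for one facility (HFSCR_EBB is the real HFSCR bit; the load_ebb counter is an assumption modelled on load_fp):

        if (!(vcpu->arch.hfscr & HFSCR_EBB)) {
                /* Facility was left disabled; the guest touched EBB and
                 * took a facility-unavailable interrupt into us. Enable
                 * it for the guest and start the hysteresis window. */
                vcpu->arch.hfscr |= HFSCR_EBB;
                vcpu->arch.load_ebb = 8;
        } else if (vcpu->arch.load_ebb) {
                /* still in the window: keep saving/restoring EBB SPRs */
                vcpu->arch.load_ebb--;
        }
        /* the EBB SPRs are only swapped on entry/exit while HFSCR_EBB
         * is enabled for the guest */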
2021-11-24  KVM: PPC: Book3S HV P9: More SPR speed improvements  (Nicholas Piggin; 1 file, -30/+43)
    This avoids more scoreboard stalls and reduces mtSPRs.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-36-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Restrict DSISR canary workaround to processors that require it  (Nicholas Piggin; 2 files, -3/+6)
    Use CPU_FTR_P9_RADIX_PREFETCH_BUG to apply the workaround, to test for DD2.1 and below processors. This saves a mtSPR in guest entry.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-35-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Switch PMU to guest as late as possible  (Nicholas Piggin; 2 files, -8/+4)
    This moves the PMU switch to guest as late as possible in entry, and the switch back to host as early as possible at exit. This helps the host get the most perf coverage of the KVM entry/exit code as possible.

    This is slightly suboptimal from an SPR scheduling point of view when the PMU is enabled, but when perf is disabled there is no real difference.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-34-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Implement TM fastpath for guest entry/exit  (Nicholas Piggin; 1 file, -4/+23)
    If TM is not active, only TM register state needs to be saved and restored, avoiding several mfmsr/mtmsrd instructions and improving performance.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-33-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Move remaining SPR and MSR access into low level entry  (Nicholas Piggin; 2 files, -66/+109)
    Move register saving and loading from kvmhv_p9_guest_entry() into the HV and nested entry handlers.

    Accesses are scheduled to reduce mtSPR / mfSPR interleaving, which reduces SPR scoreboard stalls.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-32-npiggin@gmail.com
2021-11-24  KVM: PPC: Book3S HV P9: Move nested guest entry into its own function  (Nicholas Piggin; 1 file, -58/+67)
    Move the part of the guest entry which is specific to nested HV into its own function. This is just refactoring.

    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20211123095231.1036501-31-npiggin@gmail.com