summaryrefslogtreecommitdiff
path: root/arch/s390
AgeCommit message (Collapse)AuthorFilesLines
2022-09-21KVM: s390: pci: register pci hooks without interpretationMatthew Rosato2-5/+13
The kvm registration hooks must be registered even if the facilities necessary for zPCI interpretation are unavailable, as vfio-pci-zdev will expect to use the hooks regardless. This fixes an issue where vfio-pci-zdev will fail its open function because of a missing kvm_register when running on hardware that does not support zPCI interpretation. Fixes: ca922fecda6c ("KVM: s390: pci: Hook to access KVM lowlevel from VFIO") Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Link: https://lore.kernel.org/r/20220920193025.135655-1-mjrosato@linux.ibm.com Message-Id: <20220920193025.135655-1-mjrosato@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-09-21KVM: s390: pci: fix GAIT physical vs virtual pointers usageMatthew Rosato2-2/+2
The GAIT and all of its entries must be represented by physical addresses as this structure is shared with underlying firmware. We can keep a virtual address of the GAIT origin in order to handle processing in the kernel, but when traversing the entries we must again convert the physical AISB stored in that GAIT entry into a virtual address in order to process it. Note: this currently doesn't fix a real bug, since virtual addresses are indentical to physical ones. Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Acked-by: Nico Boehr <nrb@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20220907155952.87356-1-mjrosato@linux.ibm.com Message-Id: <20220907155952.87356-1-mjrosato@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-09-21KVM: s390: Pass initialized arg even if unusedJanis Schoetterl-Glausch1-3/+13
This silences smatch warnings reported by kbuild bot: arch/s390/kvm/gaccess.c:859 guest_range_to_gpas() error: uninitialized symbol 'prot'. arch/s390/kvm/gaccess.c:1064 access_guest_with_key() error: uninitialized symbol 'prot'. This is because it cannot tell that the value is not used in this case. The trans_exc* only examine prot if code is PGM_PROTECTION. Pass a dummy value for other codes. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20220825192540.1560559-1-scgl@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-09-21KVM: s390: pci: fix plain integer as NULL pointer warningsMatthew Rosato2-5/+5
Fix some sparse warnings that a plain integer 0 is being used instead of NULL. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220915175514.167899-1-mjrosato@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-09-16s390/pai: Add support for PAI Extension 1 NNPA countersThomas Richter5-4/+682
PMU device driver perf_paiext supports Processor Activity Instrumentation Extension (PAIE1), available with IBM z16: - maps a 512 byte block to lowcore address 0x1508 called PAIE1 control block. - maps a 1024 byte block at PAIE1 control block entry with index 2. - uses control register bit 14 to enable PAIE1 control block lookup. - turn PAIE1 nnpa counting on and off by setting bit 63 in PAIE1 control block entry with index 2. - creates a sample with raw data on each context switch out when at context switch some mapped counters have a value of nonzero. This device driver only supports CPU wide context, no task context is allowed. Support for counting: - one or more counters can be specified using perf stat -e pai_ext/xxx/ where xxx stands for the counter event name. Multiple invocation of this command is possible. The counter names are listed in /sys/devices/pai_ext/events directory. - one special counters can be specified using perf stat -e pai_ext/NNPA_ALL/ which returns the sum of all incremented nnpa counters. - multiple counting events can run in parallel. Support for Sampling: - one event pai_ext/NNPA_ALL/ is reserved for sampling. The event collects data at context switch out and saves them in the ring buffer. - no multiple invocations are possible. The PAIE1 nnpa counter events are system wide. No task context is supported. Therefore some restrictions documented in function paiext_busy() apply. Extend qpaci assembly instruction to query supported memory mapped nnpa counters. It returns the number of counters (no holes allowed in that range). PAIE1 nnpa counter events can not be created when a CPU hot plug add is processed. This means a CPU hot plug add does not get the necessary PAIE1 event to record PAIE1 nnpa counter increments on the newly added CPU. CPU hot plug remove removes the event and terminates the counting of PAIE1 counters immediately. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-16s390/mm: fix no previous prototype warnings in maccess.cAlexander Gordeev1-0/+1
Fix -Wmissing-prototypes warnings caused by missing maccess.h include. Reported-by: kernel test robot <lkp@intel.com> Fixes: 2f0e8aae26a2 ("s390/mm: rework memcpy_real() to avoid DAT-off mode") Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14s390/mm: uninline copy_oldmem_kernel() functionAlexander Gordeev5-15/+19
Uninline copy_oldmem_kernel() function and make it consistent with a very similar memcpy_real() implementation, by moving to code to crash_dump.c, where it actually belongs. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14s390/mm,ptdump: add real memory copy page markersAlexander Gordeev1-0/+7
Add "Real Memory Copy Area Start" and "Real Memory Copy Area End" markers that fence the page used for real memory copying. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14s390/mm: rework memcpy_real() to avoid DAT-off modeAlexander Gordeev8-89/+70
Function memcpy_real() is an univeral data mover that does not require DAT mode to be able reading from a physical address. Its advantage is an ability to read from any address, even those for which no kernel virtual mapping exists. Although memcpy_real() is interrupt-safe, there are no handlers that make use of this function. The compiler instrumentation have to be disabled and separate no-DAT stack used to allow execution of the function once DAT mode is disabled. Rework memcpy_real() to overcome these shortcomings. As result, data copying (which is primarily reading out a crashed system memory by a user process) is executed on a regular stack with enabled interrupts. Also, use of memcpy_real_buf swap buffer becomes unnecessary and the swapping is eliminated. The above is achieved by using a fixed virtual address range that spans a single page and remaps that page repeatedly when memcpy_real() is called for a particular physical address. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14s390/dump: save IPL CPU registers once DAT is availableAlexander Gordeev3-35/+34
Function smp_save_dump_cpus() collects CPU state of a crashed system for secondary CPUs and for the IPL CPU very differently. The Signal Processor stop-and-store-status orders are used for the former while Hardware System Area requests and memcpy_real() routine are called for the latter. In addition a system reset is triggered, which pins smp_save_dump_cpus() function call before CPU and device initialization. Move the collection of IPL CPU state to a later stage when DAT becomes available. That is needed to allow a follow-up rework of memcpy_real() routine. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14s390/pci: convert high_memory to physical addressNiklas Schnelle1-1/+1
We use high_memory as a measure for amount of memory available in determining the required minimum size of our IOVA space with the assumption that one rarely maps more than the available memory for DMA. In special cases like mapping significant amounts of memory more than once this can still be tuned with the s390_iommu_apterture kernel parameter. In this use case high_memory is treated as a physical address. As high_memory is a virtual address however this means we need to convert it using virt_to_phys() before use Note that at the moment physical and virtual addresses are identical so this mismatch does not currently cause trouble. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14s390/smp,ptdump: add absolute lowcore markersAlexander Gordeev1-0/+7
Add "Lowcore Area Start" and "Lowcore Area End" markers that fence pages where absolute lowcore resides. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14s390/smp: rework absolute lowcore accessAlexander Gordeev14-84/+315
Temporary unsetting of the prefix page in memcpy_absolute() routine poses a risk of executing code path with unexpectedly disabled prefix page. This rework avoids the prefix page uninstalling and disabling of normal and machine check interrupts when accessing the absolute zero memory. Although memcpy_absolute() routine can access the whole memory, it is only used to update the absolute zero lowcore. This rework therefore introduces a new mechanism for the absolute zero lowcore access and scraps memcpy_absolute() routine for good. Instead, an area is reserved in the virtual memory that is used for the absolute lowcore access only. That area holds an array of 8KB virtual mappings - one per CPU. Whenever a CPU is brought online, the corresponding item is mapped to the real address of the previously installed prefix page. The absolute zero lowcore access works like this: a CPU calls the new primitive get_abs_lowcore() to obtain its 8KB mapping as a pointer to the struct lowcore. Virtual address references to that pointer get translated to the real addresses of the prefix page, which in turn gets swapped with the absolute zero memory addresses due to prefixing. Once the pointer is not needed it must be released with put_abs_lowcore() primitive: struct lowcore *abs_lc; unsigned long flags; abs_lc = get_abs_lowcore(&flags); abs_lc->... = ...; put_abs_lowcore(abs_lc, flags); To ensure the described mechanism works large segment- and region- table entries must be avoided for the 8KB mappings. Failure to do so results in usage of Region-Frame Absolute Address (RFAA) or Segment-Frame Absolute Address (SFAA) large page fields. In that case absolute addresses would be used to address the prefix page instead of the real ones and the prefixing would get bypassed. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14s390/smp: call smp_reinit_ipl_cpu() before scheduler is availableAlexander Gordeev3-2/+3
Currently smp_reinit_ipl_cpu() is a pre-SMP early initcall. That ensures no CPU is running in parallel, but still not enough to assume the code is exclusive, since the scheduling is already available. Move the function call to arch_call_rest_init() callback to ensure no thread could be preempted and allow lockless allocation of the kernel page tables. That is needed to allow a follow-up rework of the absolute lowcore access mechanism. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14Merge branch 'fixes' into featuresVasily Gorbik8-54/+68
* fixes: s390/smp: enforce lowcore protection on CPU restart s390/boot: fix absolute zero lowcore corruption on boot s390/hugetlb: fix prepare_hugepage_range() check for 2 GB hugepages s390: update defconfigs s390: fix nospec table alignments s390/mm: remove useless hugepage address alignment Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-12kernel: exit: cleanup release_thread()Kefeng Wang1-3/+0
Only x86 has own release_thread(), introduce a new weak release_thread() function to clean empty definitions in other ARCHs. Link: https://lkml.kernel.org/r/20220819014406.32266-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Brian Cain <bcain@quicinc.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by: Stafford Horne <shorne@gmail.com> [openrisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Huacai Chen <chenhuacai@kernel.org> [LoongArch] Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Chris Zankel <chris@zankel.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Guo Ren <guoren@kernel.org> [csky] Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jonas Bonn <jonas@southpole.se> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Xuerui Wang <kernel@xen0n.name> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-12s390/hugetlb: switch to generic version of follow_huge_pud()Gerald Schaefer1-10/+0
When pud-sized hugepages were introduced for s390, the generic version of follow_huge_pud() was using pte_page() instead of pud_page(). This would be wrong for s390, see also commit 97534127012f ("mm/hugetlb: use pmd_page() in follow_huge_pmd()"). Therefore, and probably because not all archs were supporting pud_page() at that time, a private version of follow_huge_pud() was added for s390, correctly using pud_page(). Since commit 3a194f3f8ad01 ("mm/hugetlb: make pud_huge() and follow_huge_pud() aware of non-present pud entry"), the generic version of follow_huge_pud() is now also using pud_page(), and in general behaves similar to follow_huge_pmd(). Therefore we can now switch to the generic version and get rid of the s390-specific follow_huge_pud(). Link: https://lkml.kernel.org/r/20220818135717.609eef8a@thinkpad Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Haiyue Wang <haiyue.wang@intel.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: David Hildenbrand <david@redhat.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-10Merge tag 's390-6.0-4' of ↵Linus Torvalds2-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Fix absolute zero lowcore corruption on kdump when CPU0 is offline - Fix lowcore protection setup for offline CPU restart * tag 's390-6.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/smp: enforce lowcore protection on CPU restart s390/boot: fix absolute zero lowcore corruption on boot
2022-09-09Merge tag 'asm-generic-fixes-6.0-rc4' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull SOFTIRQ_ON_OWN_STACK rework from Arnd Bergmann: "Just one fixup patch, reworking the softirq_on_own_stack logic for preempt-rt kernels as discussed in https://lore.kernel.org/all/CAHk-=wgZSD3W2y6yczad2Am=EfHYyiPzTn3CfXxrriJf9i5W5w@mail.gmail.com/" * tag 'asm-generic-fixes-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig.
2022-09-09termios: kill uapi termios.h that are identical to generic oneAl Viro1-50/+0
mandatory-y will have the generic picked for architectures that don't have uapi/asm/termios.h of their own. ia64, parisc and s390 ones are identical to generic, so... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/YxGVXpS2dWoTwoa0@ZenIV Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-09termios: get rid of non-UAPI asm/termios.hAl Viro1-12/+0
All non-UAPI asm/termios.h consist of include of UAPI counterpart and, possibly, include of linux/uaccess.h The latter can't be simply removed, even though nothing in linux/termios.h doesn't depend upon it anymore - there are several places that rely upon that indirect chain of includes to pull linux/uaccess.h. So the include needs to be lifted out of there - we lift into tty_driver.h, serdev.h and places that pull asm/termios.h, but none of * linux/uaccess.h (obvious) * net/sock.h (pulls uaccess.h) * linux/{tty,tty_driver,serdev}.h (tty.h pulls tty_driver.h) That leaves us just with the include of UAPI asm/termios.h, which is what <asm/termios.h> will resolve to if we simply remove non-UAPI header. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/YxDnKvYCHn/ogBUv@ZenIV Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-09termios: start unifying non-UAPI parts of asm/termios.hAl Viro1-14/+0
* new header (linut/termios_internal.h), pulled by the users of those suckers * defaults for INIT_C_CC and externs for conversion helpers moved over there * remove termios-base.h (empty now) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/YxDmptU7dNGZ+/Hn@ZenIV Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-07s390/ptdump: add missing amode31 markersHeiko Carstens1-0/+6
Add amode31 markers which makes a ro mapping in the middle of nowhere in the kernel_page_tables output less magic. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-07s390/mm: split lowcore pages with set_memory_4k()Heiko Carstens1-2/+5
Use set_memory_4k() to split lowcore pages within the kernel mapping instead of using the quite subtle !addr check within modify_pmd_table() and modify_pud_table() to prevent large pages for address zero. With this lowcore might be mapped with 1MB / 2GB frames and only later will be split. This way this mapping is handled like every other. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-07s390/smp: enforce lowcore protection on CPU restartAlexander Gordeev1-1/+1
As result of commit 915fea04f932 ("s390/smp: enable DAT before CPU restart callback is called") the low-address protection bit gets mistakenly unset in control register 0 save area of the absolute zero memory. That area is used when manual PSW restart happened to hit an offline CPU. In this case the low-address protection for that CPU will be dropped. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Fixes: 915fea04f932 ("s390/smp: enable DAT before CPU restart callback is called") Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-07s390/boot: fix absolute zero lowcore corruption on bootAlexander Gordeev2-1/+2
Crash dump always starts on CPU0. In case CPU0 is offline the prefix page is not installed and the absolute zero lowcore is used. However, struct lowcore::mcesad is never assigned and stays zero. That leads to __machine_kdump() -> save_vx_regs() call silently stores vector registers to the absolute lowcore at 0x11b0 offset. Fixes: a62bc0739253 ("s390/kdump: add support for vector extension") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-05asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig.Sebastian Andrzej Siewior1-1/+1
Remove the CONFIG_PREEMPT_RT symbol from the ifdef around do_softirq_own_stack() and move it to Kconfig instead. Enable softirq stacks based on SOFTIRQ_ON_OWN_STACK which depends on HAVE_SOFTIRQ_ON_OWN_STACK and its default value is set to !PREEMPT_RT. This ensures that softirq stacks are not used on PREEMPT_RT and avoids a 'select' statement on an option which has a 'depends' statement. Link: https://lore.kernel.org/YvN5E%2FPrHfUhggr7@linutronix.de Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-09-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds4-16/+26
Pull kvm fixes from Paolo Bonzini: "s390: - PCI interpretation compile fixes RISC-V: - fix unused variable warnings in vcpu_timer.c - move extern sbi_ext declarations to a header x86: - check validity of argument to KVM_SET_MP_STATE - use guest's global_ctrl to completely disable guest PEBS - fix a memory leak on memory allocation failure - mask off unsupported and unknown bits of IA32_ARCH_CAPABILITIES - fix build failure with Clang integrated assembler - fix MSR interception - always flush TLBs when enabling dirty logging" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: check validity of argument to KVM_SET_MP_STATE perf/x86/core: Completely disable guest PEBS via guest's global_ctrl KVM: x86: fix memoryleak in kvm_arch_vcpu_create() KVM: x86: Mask off unsupported and unknown bits of IA32_ARCH_CAPABILITIES KVM: s390: pci: Hook to access KVM lowlevel from VFIO riscv: kvm: move extern sbi_ext declarations to a header riscv: kvm: vcpu_timer: fix unused variable warnings KVM: selftests: Fix ambiguous mov in KVM_ASM_SAFE() KVM: selftests: Fix KVM_EXCEPTION_MAGIC build with Clang KVM: VMX: Heed the 'msr' argument in msr_write_intercepted() kvm: x86: mmu: Always flush TLBs when enabling dirty logging kvm: x86: mmu: Drop the need_remote_flush() function
2022-08-30s390/mm: remove unused access parameter from do_fault_error()Heiko Carstens1-8/+7
Remove unused access parameter from do_fault_error() which also makes the code a bit more readable since quite some callers can be simplified. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-30s390/delay: sync comment within __delay() with realityHeiko Carstens1-7/+4
The comment within __delay() is outdated and does not reflect anymore what the function is doing. Therefore replace the comment. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-30s390: move from strlcpy with unused retval to strscpyWolfram Sang2-2/+2
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Benjamin Block <bblock@linux.ibm.com> Acked-by: Alexandra Winter <wintera@linux.ibm.com> Link: https://lore.kernel.org/r/20220818205948.6360-1-wsa+renesas@sang-engineering.com Link: https://lore.kernel.org/r/20220818210102.7301-1-wsa+renesas@sang-engineering.com [gor@linux.ibm.com: squashed two changes linked above together] Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-30s390/hugetlb: fix prepare_hugepage_range() check for 2 GB hugepagesGerald Schaefer1-2/+4
The alignment check in prepare_hugepage_range() is wrong for 2 GB hugepages, it only checks for 1 MB hugepage alignment. This can result in kernel crash in __unmap_hugepage_range() at the BUG_ON(start & ~huge_page_mask(h)) alignment check, for mappings created with MAP_FIXED at unaligned address. Fix this by correctly handling multiple hugepage sizes, similar to the generic version of prepare_hugepage_range(). Fixes: d08de8e2d867 ("s390/mm: add support for 2GB hugepages") Cc: <stable@vger.kernel.org> # 4.8+ Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-30s390: update defconfigsHeiko Carstens3-48/+60
Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-30s390: fix nospec table alignmentsJosh Poimboeuf1-0/+1
Add proper alignment for .nospec_call_table and .nospec_return_table in vmlinux. [hca@linux.ibm.com]: The problem with the missing alignment of the nospec tables exist since a long time, however only since commit e6ed91fd0768 ("s390/alternatives: remove padding generation code") and with CONFIG_RELOCATABLE=n the kernel may also crash at boot time. The above named commit reduced the size of struct alt_instr by one byte, so its new size is 11 bytes. Therefore depending on the number of cpu alternatives the size of the __alt_instructions array maybe odd, which again also causes that the addresses of the nospec tables will be odd. If the address of __nospec_call_start is odd and the kernel is compiled With CONFIG_RELOCATABLE=n the compiler may generate code that loads the address of __nospec_call_start with a 'larl' instruction. This will generate incorrect code since the 'larl' instruction only works with even addresses. In result the members of the nospec tables will be accessed with an off-by-one offset, which subsequently may lead to addressing exceptions within __nospec_revert(). Fixes: f19fbd5ed642 ("s390: introduce execute-trampolines for branches") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Link: https://lore.kernel.org/r/8719bf1ce4a72ebdeb575200290094e9ce047bcc.1661557333.git.jpoimboe@kernel.org Cc: <stable@vger.kernel.org> # 4.16 Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-30s390/mm: remove useless hugepage address alignmentGerald Schaefer1-2/+0
The failing address alignment to HPAGE_MASK in do_exception(), for hugetlb faults, was useless from the beginning. With 2 GB hugepage support it became wrong, but w/o further negative impact. Now it could have negative performance impact because it breaks the cacheline optimization for process_huge_page(). Therefore, remove it. Note that we still have failing address alignment by HW to PAGE_SIZE, for all page faults, not just hugetlb faults. So this patch will not fix UFFD_FEATURE_EXACT_ADDRESS for userfaultfd handling. It will just move the failing address for hugetlb faults a bit closer to the real address, at 4K page granularity, similar to normal page faults. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-29KVM: s390: pci: Hook to access KVM lowlevel from VFIOPierre Morel4-16/+26
We have a cross dependency between KVM and VFIO when using s390 vfio_pci_zdev extensions for PCI passthrough To be able to keep both subsystem modular we add a registering hook inside the S390 core code. This fixes a build problem when VFIO is built-in and KVM is built as a module. Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Fixes: 09340b2fca007 ("KVM: s390: pci: add routines to start/stop interpretive execution") Cc: <stable@vger.kernel.org> Acked-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Link: https://lore.kernel.org/r/20220819122945.9309-1-pmorel@linux.ibm.com Message-Id: <20220819122945.9309-1-pmorel@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-08-28Merge tag 's390-6.0-2' of ↵Linus Torvalds2-7/+19
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Fix double free of guarded storage and runtime instrumentation control blocks on fork() failure - Fix triggering write fault when VMA does not allow VM_WRITE * tag 's390-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/mm: do not trigger write fault when vma does not allow VM_WRITE s390: fix double free of GS and RI CBs on fork() failure
2022-08-27provide arch_test_bit_acquire for architectures that define test_bitMikulas Patocka1-8/+2
Some architectures define their own arch_test_bit and they also need arch_test_bit_acquire, otherwise they won't compile. We also clean up the code by using the generic test_bit if that is equivalent to the arch-specific version. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Fixes: 8238b4579866 ("wait_on_bit: add an acquire memory barrier") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-26crypto: Kconfig - simplify cipher entriesRobert Elliott1-9/+19
Shorten menu titles and make them consistent: - acronym - name - architecture features in parenthesis - no suffixes like "<something> algorithm", "support", or "hardware acceleration", or "optimized" Simplify help text descriptions, update references, and ensure that https references are still valid. Signed-off-by: Robert Elliott <elliott@hpe.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-08-26crypto: Kconfig - simplify hash entriesRobert Elliott1-18/+24
Shorten menu titles and make them consistent: - acronym - name - architecture features in parenthesis - no suffixes like "<something> algorithm", "support", or "hardware acceleration", or "optimized" Simplify help text descriptions, update references, and ensure that https references are still valid. Signed-off-by: Robert Elliott <elliott@hpe.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-08-26crypto: Kconfig - simplify CRC entriesRobert Elliott1-5/+4
Shorten menu titles and make them consistent: - acronym - name - architecture features in parenthesis - no suffixes like "<something> algorithm", "support", or "hardware acceleration", or "optimized" Simplify help text descriptions, update references, and ensure that https references are still valid. Signed-off-by: Robert Elliott <elliott@hpe.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-08-26crypto: Kconfig - move s390 entries to a submenuRobert Elliott1-0/+120
Move CPU-specific crypto/Kconfig entries to arch/xxx/crypto/Kconfig and create a submenu for them under the Crypto API menu. Suggested-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Robert Elliott <elliott@hpe.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-08-25s390/mm: do not trigger write fault when vma does not allow VM_WRITEGerald Schaefer1-1/+3
For non-protection pXd_none() page faults in do_dat_exception(), we call do_exception() with access == (VM_READ | VM_WRITE | VM_EXEC). In do_exception(), vma->vm_flags is checked against that before calling handle_mm_fault(). Since commit 92f842eac7ee3 ("[S390] store indication fault optimization"), we call handle_mm_fault() with FAULT_FLAG_WRITE, when recognizing that it was a write access. However, the vma flags check is still only checking against (VM_READ | VM_WRITE | VM_EXEC), and therefore also calling handle_mm_fault() with FAULT_FLAG_WRITE in cases where the vma does not allow VM_WRITE. Fix this by changing access check in do_exception() to VM_WRITE only, when recognizing write access. Link: https://lkml.kernel.org/r/20220811103435.188481-3-david@redhat.com Fixes: 92f842eac7ee3 ("[S390] store indication fault optimization") Cc: <stable@vger.kernel.org> Reported-by: David Hildenbrand <david@redhat.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-25s390: fix double free of GS and RI CBs on fork() failureBrian Foster1-6/+16
The pointers for guarded storage and runtime instrumentation control blocks are stored in the thread_struct of the associated task. These pointers are initially copied on fork() via arch_dup_task_struct() and then cleared via copy_thread() before fork() returns. If fork() happens to fail after the initial task dup and before copy_thread(), the newly allocated task and associated thread_struct memory are freed via free_task() -> arch_release_task_struct(). This results in a double free of the guarded storage and runtime info structs because the fields in the failed task still refer to memory associated with the source task. This problem can manifest as a BUG_ON() in set_freepointer() (with CONFIG_SLAB_FREELIST_HARDENED enabled) or KASAN splat (if enabled) when running trinity syscall fuzz tests on s390x. To avoid this problem, clear the associated pointer fields in arch_dup_task_struct() immediately after the new task is copied. Note that the RI flag is still cleared in copy_thread() because it resides in thread stack memory and that is where stack info is copied. Signed-off-by: Brian Foster <bfoster@redhat.com> Fixes: 8d9047f8b967c ("s390/runtime instrumentation: simplify task exit handling") Fixes: 7b83c6297d2fc ("s390/guarded storage: simplify task exit handling") Cc: <stable@vger.kernel.org> # 4.15 Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20220816155407.537372-1-bfoster@redhat.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-15s390/hypfs: avoid error message under KVMJuergen Gross2-2/+2
When booting under KVM the following error messages are issued: hypfs.7f5705: The hardware system does not support hypfs hypfs.7a79f0: Initialization of hypfs failed with rc=-61 Demote the severity of first message from "error" to "info" and issue the second message only in other error cases. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20220620094534.18967-1-jgross@suse.com [arch/s390/hypfs/hypfs_diag.c changed description] Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-08-08Merge tag 'bitmap-6.0-rc1' of https://github.com/norov/linuxLinus Torvalds1-30/+31
Pull bitmap updates from Yury Norov: - fix the duplicated comments on bitmap_to_arr64() (Qu Wenruo) - optimize out non-atomic bitops on compile-time constants (Alexander Lobakin) - cleanup bitmap-related headers (Yury Norov) - x86/olpc: fix 'logical not is only applied to the left hand side' (Alexander Lobakin) - lib/nodemask: inline wrappers around bitmap (Yury Norov) * tag 'bitmap-6.0-rc1' of https://github.com/norov/linux: (26 commits) lib/nodemask: inline next_node_in() and node_random() powerpc: drop dependency on <asm/machdep.h> in archrandom.h x86/olpc: fix 'logical not is only applied to the left hand side' lib/cpumask: move some one-line wrappers to header file headers/deps: mm: align MANITAINERS and Docs with new gfp.h structure headers/deps: mm: Split <linux/gfp_types.h> out of <linux/gfp.h> headers/deps: mm: Optimize <linux/gfp.h> header dependencies lib/cpumask: move trivial wrappers around find_bit to the header lib/cpumask: change return types to unsigned where appropriate cpumask: change return types to bool where appropriate lib/bitmap: change type of bitmap_weight to unsigned long lib/bitmap: change return types to bool where appropriate arm: align find_bit declarations with generic kernel iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE) lib/test_bitmap: test the tail after bitmap_to_arr64() lib/bitmap: fix off-by-one in bitmap_to_arr64() lib: test_bitmap: add compile-time optimization/evaluations assertions bitmap: don't assume compiler evaluates small mem*() builtins calls net/ice: fix initializing the bitmap in the switch code bitops: let optimize out non-atomic bitops on compile-time constants ...
2022-08-07Merge tag 's390-5.20-1' of ↵Linus Torvalds27-188/+138
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Alexander Gordeev: - Rework copy_oldmem_page() callback to take an iov_iter. This includes a few prerequisite updates and fixes to the oldmem reading code. - Rework cpufeature implementation to allow for various CPU feature indications, which is not only limited to hardware capabilities, but also allows CPU facilities. - Use the cpufeature rework to autoload Ultravisor module when CPU facility 158 is available. - Add ELF note type for encrypted CPU state of a protected virtual CPU. The zgetdump tool from s390-tools package will decrypt the CPU state using a Customer Communication Key and overwrite respective notes to make the data accessible for crash and other debugging tools. - Use vzalloc() instead of vmalloc() + memset() in ChaCha20 crypto test. - Fix incorrect recovery of kretprobe modified return address in stacktrace. - Switch the NMI handler to use generic irqentry_nmi_enter() and irqentry_nmi_exit() helper functions. - Rework the cryptographic Adjunct Processors (AP) pass-through design to support dynamic changes to the AP matrix of a running guest as well as to implement more of the AP architecture. - Minor boot code cleanups. - Grammar and typo fixes to hmcdrv and tape drivers. * tag 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (46 commits) Revert "s390/smp: enforce lowcore protection on CPU restart" Revert "s390/smp: rework absolute lowcore access" Revert "s390/smp,ptdump: add absolute lowcore markers" s390/unwind: fix fgraph return address recovery s390/nmi: use irqentry_nmi_enter()/irqentry_nmi_exit() s390: add ELF note type for encrypted CPU state of a PV VCPU s390/smp,ptdump: add absolute lowcore markers s390/smp: rework absolute lowcore access s390/setup: rearrange absolute lowcore initialization s390/boot: cleanup adjust_to_uv_max() function s390/smp: enforce lowcore protection on CPU restart s390/tape: fix comment typo s390/hmcdrv: fix Kconfig "its" grammar s390/docs: fix warnings for vfio_ap driver doc s390/docs: fix warnings for vfio_ap driver lock usage doc s390/crash: support multi-segment iterators s390/crash: use static swap buffer for copy_to_user_real() s390/crash: move copy_to_user_real() to crash_dump.c s390/zcore: fix race when reading from hardware system area s390/crash: fix incorrect number of bytes to copy to user space ...
2022-08-06Merge tag 'vfio-v6.0-rc1' of https://github.com/awilliam/linux-vfioLinus Torvalds1-3/+3
Pull VFIO updates from Alex Williamson: - Cleanup use of extern in function prototypes (Alex Williamson) - Simplify bus_type usage and convert to device IOMMU interfaces (Robin Murphy) - Check missed return value and fix comment typos (Bo Liu) - Split migration ops from device ops and fix races in mlx5 migration support (Yishai Hadas) - Fix missed return value check in noiommu support (Liam Ni) - Hardening to clear buffer pointer to avoid use-after-free (Schspa Shi) - Remove requirement that only the same mm can unmap a previously mapped range (Li Zhe) - Adjust semaphore release vs device open counter (Yi Liu) - Remove unused arg from SPAPR support code (Deming Wang) - Rework vfio-ccw driver to better fit new mdev framework (Eric Farman, Michael Kawano) - Replace DMA unmap notifier with callbacks (Jason Gunthorpe) - Clarify SPAPR support comment relative to iommu_ops (Alexey Kardashevskiy) - Revise page pinning API towards compatibility with future iommufd support (Nicolin Chen) - Resolve issues in vfio-ccw, including use of DMA unmap callback (Eric Farman) * tag 'vfio-v6.0-rc1' of https://github.com/awilliam/linux-vfio: (40 commits) vfio/pci: fix the wrong word vfio/ccw: Check return code from subchannel quiesce vfio/ccw: Remove FSM Close from remove handlers vfio/ccw: Add length to DMA_UNMAP checks vfio: Replace phys_pfn with pages for vfio_pin_pages() vfio/ccw: Add kmap_local_page() for memcpy vfio: Rename user_iova of vfio_dma_rw() vfio/ccw: Change pa_pfn list to pa_iova list vfio/ap: Change saved_pfn to saved_iova vfio: Pass in starting IOVA to vfio_pin/unpin_pages API vfio/ccw: Only pass in contiguous pages vfio/ap: Pass in physical address of ind to ap_aqic() drm/i915/gvt: Replace roundup with DIV_ROUND_UP vfio: Make vfio_unpin_pages() return void vfio/spapr_tce: Fix the comment vfio: Replace the iommu notifier with a device list vfio: Replace the DMA unmapping notifier with a callback vfio/ccw: Move FSM open/close to MDEV open/close vfio/ccw: Refactor vfio_ccw_mdev_reset vfio/ccw: Create a CLOSE FSM event ...
2022-08-06Revert "s390/smp: enforce lowcore protection on CPU restart"Alexander Gordeev1-1/+1
This reverts commit 6f5c672d17f583b081e283927f5040f726c54598. This breaks normal crash dump when CPU0 is offline. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-08-06Revert "s390/smp: rework absolute lowcore access"Alexander Gordeev14-294/+83
This reverts commit 7d06fed77b7d8fc9f6cc41b4e3f2823d32532ad8. This introduced vmem_mutex locking from vmem_map_4k_page() function called from smp_reinit_ipl_cpu() with interrupts disabled. While it is a pre-SMP early initcall no other CPUs running in parallel nor other code taking vmem_mutex on this boot stage - it still needs to be fixed. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>