summaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)AuthorFilesLines
2024-01-05kexec: fix KEXEC_FILE dependenciesArnd Bergmann1-2/+2
[ Upstream commit c1ad12ee0efc07244be37f69311e6f7c4ac98e62 ] The cleanup for the CONFIG_KEXEC Kconfig logic accidentally changed the 'depends on CRYPTO=y' dependency to a plain 'depends on CRYPTO', which causes a link failure when all the crypto support is in a loadable module and kexec_file support is built-in: x86_64-linux-ld: vmlinux.o: in function `__x64_sys_kexec_file_load': (.text+0x32e30a): undefined reference to `crypto_alloc_shash' x86_64-linux-ld: (.text+0x32e58e): undefined reference to `crypto_shash_update' x86_64-linux-ld: (.text+0x32e6ee): undefined reference to `crypto_shash_final' Both s390 and x86 have this problem, while ppc64 and riscv have the correct dependency already. On riscv, the dependency is only used for the purgatory, not for the kexec_file code itself, which may be a bit surprising as it means that with CONFIG_CRYPTO=m, it is possible to enable KEXEC_FILE but then the purgatory code is silently left out. Move this into the common Kconfig.kexec file in a way that is correct everywhere, using the dependency on CRYPTO_SHA256=y only when the purgatory code is available. This requires reversing the dependency between ARCH_SUPPORTS_KEXEC_PURGATORY and KEXEC_FILE, but the effect remains the same, other than making riscv behave like the other ones. On s390, there is an additional dependency on CRYPTO_SHA256_S390, which should technically not be required but gives better performance. Remove this dependency here, noting that it was not present in the initial Kconfig code but was brought in without an explanation in commit 71406883fd357 ("s390/kexec_file: Add kexec_file_load system call"). [arnd@arndb.de: fix riscv build] Link: https://lkml.kernel.org/r/67ddd260-d424-4229-a815-e3fcfb864a77@app.fastmail.com Link: https://lkml.kernel.org/r/20231023110308.1202042-1-arnd@kernel.org Fixes: 6af5138083005 ("x86/kexec: refactor for kernel/Kconfig.kexec") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Eric DeVolder <eric_devolder@yahoo.com> Tested-by: Eric DeVolder <eric_devolder@yahoo.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Conor Dooley <conor@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-20cred: get rid of CONFIG_DEBUG_CREDENTIALSJens Axboe1-1/+0
commit ae1914174a63a558113e80d24ccac2773f9f7b2b upstream. This code is rarely (never?) enabled by distros, and it hasn't caught anything in decades. Let's kill off this legacy debug code. Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-13powerpc/ftrace: Fix stack teardown in ftrace_no_traceNaveen N Rao1-2/+2
commit 4b3338aaa74d7d4ec5b6734dc298f0db94ec83d2 upstream. Commit 41a506ef71eb ("powerpc/ftrace: Create a dummy stackframe to fix stack unwind") added use of a new stack frame on ftrace entry to fix stack unwind. However, the commit missed updating the offset used while tearing down the ftrace stack when ftrace is disabled. Fix the same. In addition, the commit missed saving the correct stack pointer in pt_regs. Update the same. Fixes: 41a506ef71eb ("powerpc/ftrace: Create a dummy stackframe to fix stack unwind") Cc: stable@vger.kernel.org # v6.5+ Signed-off-by: Naveen N Rao <naveen@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231130065947.2188860-1-naveen@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-08powerpc/pseries/iommu: enable_ddw incorrectly returns direct mapping for ↵Gaurav Batra1-4/+4
SR-IOV device [ Upstream commit 3bf983e4e93ce8e6d69e9d63f52a66ec0856672e ] When a device is initialized, the driver invokes dma_supported() twice - first for streaming mappings followed by coherent mappings. For an SR-IOV device, default window is deleted and DDW created. With vPMEM enabled, TCE mappings are dynamically created for both vPMEM and SR-IOV device. There are no direct mappings. First time when dma_supported() is called with 64 bit mask, DDW is created and marked as dynamic window. The second time dma_supported() is called, enable_ddw() finds existing window for the device and incorrectly returns it as "direct mapping". This only happens when size of DDW is big enough to map max LPAR memory. This results in streaming TCEs to not get dynamically mapped, since code incorrently assumes these are already pre-mapped. The adapter initially comes up but goes down due to EEH. Fixes: 381ceda88c4c ("powerpc/pseries/iommu: Make use of DDW for indirect mapping") Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Gaurav Batra <gbatra@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231003030802.47914-1-gbatra@linux.vnet.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-08powerpc: Don't clobber f0/vs0 during fp|altivec register saveTimothy Pearson2-0/+15
commit 5e1d824f9a283cbf90f25241b66d1f69adb3835b upstream. During floating point and vector save to thread data f0/vs0 are clobbered by the FPSCR/VSCR store routine. This has been obvserved to lead to userspace register corruption and application data corruption with io-uring. Fix it by restoring f0/vs0 after FPSCR/VSCR store has completed for all the FP, altivec, VMX register save paths. Tested under QEMU in kvm mode, running on a Talos II workstation with dual POWER9 DD2.2 CPUs. Additional detail (mpe): Typically save_fpu() is called from __giveup_fpu() which saves the FP regs and also *turns off FP* in the tasks MSR, meaning the kernel will reload the FP regs from the thread struct before letting the task use FP again. So in that case save_fpu() is free to clobber f0 because the FP regs no longer hold live values for the task. There is another case though, which is the path via: sys_clone() ... copy_process() dup_task_struct() arch_dup_task_struct() flush_all_to_thread() save_all() That path saves the FP regs but leaves them live. That's meant as an optimisation for a process that's using FP/VSX and then calls fork(), leaving the regs live means the parent process doesn't have to take a fault after the fork to get its FP regs back. The optimisation was added in commit 8792468da5e1 ("powerpc: Add the ability to save FPU without giving it up"). That path does clobber f0, but f0 is volatile across function calls, and typically programs reach copy_process() from userspace via a syscall wrapper function. So in normal usage f0 being clobbered across a syscall doesn't cause visible data corruption. But there is now a new path, because io-uring can call copy_process() via create_io_thread() from the signal handling path. That's OK if the signal is handled as part of syscall return, but it's not OK if the signal is handled due to some other interrupt. That path is: interrupt_return_srr_user() interrupt_exit_user_prepare() interrupt_exit_user_prepare_main() do_notify_resume() get_signal() task_work_run() create_worker_cb() create_io_worker() copy_process() dup_task_struct() arch_dup_task_struct() flush_all_to_thread() save_all() if (tsk->thread.regs->msr & MSR_FP) save_fpu() # f0 is clobbered and potentially live in userspace Note the above discussion applies equally to save_altivec(). Fixes: 8792468da5e1 ("powerpc: Add the ability to save FPU without giving it up") Cc: stable@vger.kernel.org # v4.6+ Closes: https://lore.kernel.org/all/480932026.45576726.1699374859845.JavaMail.zimbra@raptorengineeringinc.com/ Closes: https://lore.kernel.org/linuxppc-dev/480221078.47953493.1700206777956.JavaMail.zimbra@raptorengineeringinc.com/ Tested-by: Timothy Pearson <tpearson@raptorengineering.com> Tested-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> [mpe: Reword change log to describe exact path of corruption & other minor tweaks] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/1921539696.48534988.1700407082933.JavaMail.zimbra@raptorengineeringinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-08KVM: PPC: Book3S HV: Fix KVM_RUN clobbering FP/VEC user registersNicholas Piggin1-3/+3
commit dc158d23b33df9033bcc8e7117e8591dd2f9d125 upstream. Before running a guest, the host process (e.g., QEMU) FP/VEC registers are saved if they were being used, similarly to when the kernel uses FP registers. The guest values are then loaded into regs, and the host process registers will be restored lazily when it uses FP/VEC. KVM HV has a bug here: the host process registers do get saved, but the user MSR bits remain enabled, which indicates the registers are valid for the process. After they are clobbered by running the guest, this valid indication causes the host process to take on the FP/VEC register values of the guest. Fixes: 34e119c96b2b ("KVM: PPC: Book3S HV P9: Reduce mtmsrd instructions required to save host SPRs") Cc: stable@vger.kernel.org # v5.17+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231122025811.2973-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28powerpc/perf: Fix disabling BHRB and instruction samplingNicholas Piggin1-3/+2
commit ea142e590aec55ba40c5affb4d49e68c713c63dc upstream. When the PMU is disabled, MMCRA is not updated to disable BHRB and instruction sampling. This can lead to those features remaining enabled, which can slow down a real or emulated CPU. Fixes: 1cade527f6e9 ("powerpc/perf: BHRB control to disable BHRB logic when not used") Cc: stable@vger.kernel.org # v5.9+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231018153423.298373-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-20powerpc/vmcore: Add MMU information to vmcoreinfoAditya Gupta1-0/+3
[ Upstream commit 36e826b568e412f61d68fedc02a67b4d8b7583cc ] Since below commit, address mapping for vmemmap has changed for Radix MMU, where address mapping is stored in kernel page table itself, instead of earlier used 'vmemmap_list'. commit 368a0590d954 ("powerpc/book3s64/vmemmap: switch radix to use a different vmemmap handling function") Hence with upstream kernel, in case of Radix MMU, makedumpfile fails to do address translation for vmemmap addresses, as it depended on vmemmap_list, which can now be empty. While fixing the address translation in makedumpfile, it was identified that currently makedumpfile cannot distinguish between Hash MMU and Radix MMU, unless VMLINUX is passed with -x flag to makedumpfile. And hence fails to assign offsets and shifts correctly (such as in L4 to PGDIR offset calculation in makedumpfile). For getting the MMU, makedumpfile uses `cur_cpu_spec.mmu_features`. Add `cur_cpu_spec` symbol and offset of `mmu_features` in the `cpu_spec` struct, to VMCOREINFO, so that makedumpfile can assign the offsets correctly, without needing a VMLINUX. Also, even along with `cur_cpu_spec->mmu_features` makedumpfile has to depend on the 'MMU_FTR_TYPE_RADIX' flag in mmu_features, implying kernel developers need to be cautious of changes to 'MMU_FTR_*' defines. A more stable approach was suggested in the below thread by contributors: https://lore.kernel.org/linuxppc-dev/20230920105706.853626-1-adityag@linux.ibm.com/ The suggestion was to add whether 'RADIX_MMU' is enabled in vmcoreinfo This patch also implements the suggestion, by adding 'RADIX_MMU' in vmcoreinfo, which makedumpfile can use to get whether the crashed system had RADIX MMU (in which case 'NUMBER(RADIX_MMU)=1') or not (in which case 'NUMBER(RADIX_MMU)=0') Fixes: 368a0590d954 ("powerpc/book3s64/vmemmap: switch radix to use a different vmemmap handling function") Reported-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Aditya Gupta <adityag@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231023072612.50874-1-adityag@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20powerpc/pseries: fix potential memory leak in init_cpu_associativity()Wang Yufen1-1/+3
[ Upstream commit 95f1a128cd728a7257d78e868f1f5a145fc43736 ] If the vcpu_associativity alloc memory successfully but the pcpu_associativity fails to alloc memory, the vcpu_associativity memory leaks. Fixes: d62c8deeb6e6 ("powerpc/pseries: Provide vcpu dispatch statistics") Signed-off-by: Wang Yufen <wangyufen@huawei.com> Reviewed-by: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/1671003983-10794-1-git-send-email-wangyufen@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20powerpc/imc-pmu: Use the correct spinlock initializer.Sebastian Andrzej Siewior1-1/+1
[ Upstream commit 007240d59c11f87ac4f6cfc6a1d116630b6b634c ] The macro __SPIN_LOCK_INITIALIZER() is implementation specific. Users that desire to initialize a spinlock in a struct must use __SPIN_LOCK_UNLOCKED(). Use __SPIN_LOCK_UNLOCKED() for the spinlock_t in imc_global_refc. Fixes: 76d588dddc459 ("powerpc/imc-pmu: Fix use of mutex in IRQs disabled section") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230309134831.Nz12nqsU@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20powerpc/vas: Limit open window failure messages in log buffferHaren Myneni2-20/+18
[ Upstream commit 73b25505ce043b561028e5571d84dc82aa53c2b4 ] The VAS open window call prints error message and returns -EBUSY after the migration suspend event initiated and until the resume event completed on the destination system. It can cause the log buffer filled with these error messages if the user space issues continuous open window calls. Similar case even for DLPAR CPU remove event when no credits are available until the credits are freed or with the other DLPAR CPU add event. So changes in the patch to use pr_err_ratelimited() instead of pr_err() to display open window failure and not-available credits error messages. Use pr_fmt() and make the corresponding changes to have the consistencein prefix all pr_*() messages (vas-api.c). Fixes: 37e6764895ef ("powerpc/pseries/vas: Add VAS migration handler") Signed-off-by: Haren Myneni <haren@linux.ibm.com> [mpe: Use "vas-api" as the prefix to match the file name.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231019215033.1335251-1-haren@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20powerpc: Hide empty pt_regs at base of the stackMichael Ellerman1-3/+23
[ Upstream commit d45c4b48dafb5820e5cc267ff9a6d7784d13a43c ] A thread started via eg. user_mode_thread() runs in the kernel to begin with and then may later return to userspace. While it's running in the kernel it has a pt_regs at the base of its kernel stack, but that pt_regs is all zeroes. If the thread oopses in that state, it leads to an ugly stack trace with a big block of zero GPRs, as reported by Joel: Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc7-00004-gf7757129e3de-dirty #3 Hardware name: IBM PowerNV (emulated by qemu) POWER9 0x4e1200 opal:v7.0 PowerNV Call Trace: [c0000000036afb00] [c0000000010dd058] dump_stack_lvl+0x6c/0x9c (unreliable) [c0000000036afb30] [c00000000013c524] panic+0x178/0x424 [c0000000036afbd0] [c000000002005100] mount_root_generic+0x250/0x324 [c0000000036afca0] [c0000000020057d0] prepare_namespace+0x2d4/0x344 [c0000000036afd20] [c0000000020049c0] kernel_init_freeable+0x358/0x3ac [c0000000036afdf0] [c0000000000111b0] kernel_init+0x30/0x1a0 [c0000000036afe50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c --- interrupt: 0 at 0x0 NIP: 0000000000000000 LR: 0000000000000000 CTR: 0000000000000000 REGS: c0000000036afe80 TRAP: 0000 Not tainted (6.5.0-rc7-00004-gf7757129e3de-dirty) MSR: 0000000000000000 <> CR: 00000000 XER: 00000000 CFAR: 0000000000000000 IRQMASK: 0 GPR00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR12: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR28: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 NIP [0000000000000000] 0x0 LR [0000000000000000] 0x0 --- interrupt: 0 The all-zero pt_regs looks ugly and conveys no useful information, other than its presence. So detect that case and just show the presence of the frame by printing the interrupt marker, eg: Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc3-00126-g18e9506562a0-dirty #301 Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries Call Trace: [c000000003aabb00] [c000000001143db8] dump_stack_lvl+0x6c/0x9c (unreliable) [c000000003aabb30] [c00000000014c624] panic+0x178/0x424 [c000000003aabbd0] [c0000000020050fc] mount_root_generic+0x250/0x324 [c000000003aabca0] [c0000000020057cc] prepare_namespace+0x2d4/0x344 [c000000003aabd20] [c0000000020049bc] kernel_init_freeable+0x358/0x3ac [c000000003aabdf0] [c0000000000111b0] kernel_init+0x30/0x1a0 [c000000003aabe50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c --- interrupt: 0 at 0x0 To avoid ever suppressing a valid pt_regs make sure the pt_regs has a zero MSR and TRAP value, and is located at the very base of the stack. Fixes: 6895dfc04741 ("powerpc: copy_thread fill in interrupt frame marker and back chain") Reported-by: Joel Stanley <joel@jms.id.au> Reported-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230824064210.907266-1-mpe@ellerman.id.au Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20powerpc/xive: Fix endian conversion sizeBenjamin Gray1-1/+1
[ Upstream commit ff7a60ab1e065257a0e467c13b519f4debcd7fcf ] Sparse reports a size mismatch in the endian swap. The Opal implementation[1] passes the value as a __be64, and the receiving variable out_qsize is a u64, so the use of be32_to_cpu() appears to be an error. [1]: https://github.com/open-power/skiboot/blob/80e2b1dc73/hw/xive.c#L3854 Fixes: 88ec6b93c8e7 ("powerpc/xive: add OPAL extensions for the XIVE native exploitation support") Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231011053711.93427-2-bgray@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20powerpc/40x: Remove stale PTE_ATOMIC_UPDATES macroChristophe Leroy1-3/+0
[ Upstream commit cc8ee288f484a2a59c01ccd4d8a417d6ed3466e3 ] 40x TLB handlers were reworked by commit 2c74e2586bb9 ("powerpc/40x: Rework 40x PTE access and TLB miss") to not require PTE_ATOMIC_UPDATES anymore. Then commit 4e1df545e2fa ("powerpc/pgtable: Drop PTE_ATOMIC_UPDATES") removed all code related to PTE_ATOMIC_UPDATES. Remove left over PTE_ATOMIC_UPDATES macro. Fixes: 2c74e2586bb9 ("powerpc/40x: Rework 40x PTE access and TLB miss") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/f061db5857fcd748f84a6707aad01754686ce97e.1695659959.git.christophe.leroy@csgroup.eu Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20powerpc: Only define __parse_fpscr() when requiredChristophe Leroy1-0/+2
[ Upstream commit c7e0d9bb9154c6e6b2ac8746faba27b53393f25e ] Clang 17 reports: arch/powerpc/kernel/traps.c:1167:19: error: unused function '__parse_fpscr' [-Werror,-Wunused-function] __parse_fpscr() is called from two sites. First call is guarded by #ifdef CONFIG_PPC_FPU_REGS Second call is guarded by CONFIG_MATH_EMULATION which selects CONFIG_PPC_FPU_REGS. So only define __parse_fpscr() when CONFIG_PPC_FPU_REGS is defined. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202309210327.WkqSd5Bq-lkp@intel.com/ Fixes: b6254ced4da6 ("powerpc/signal: Don't manage floating point regs when no FPU") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/5de2998c57f3983563b27b39228ea9a7229d4110.1695385984.git.christophe.leroy@csgroup.eu Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-27Merge tag 'powerpc-6.6-6' of ↵Linus Torvalds3-11/+24
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix boot crash with FLATMEM since set_ptes() introduction - Avoid calling arch_enter/leave_lazy_mmu() in set_ptes() Thanks to Aneesh Kumar K.V and Erhard Furtner. * tag 'powerpc-6.6-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Avoid calling arch_enter/leave_lazy_mmu() in set_ptes powerpc/mm: Fix boot crash with FLATMEM
2023-10-25powerpc/mm: Avoid calling arch_enter/leave_lazy_mmu() in set_ptesAneesh Kumar K.V1-10/+22
With commit 9fee28baa601 ("powerpc: implement the new page table range API") we added set_ptes to powerpc architecture. The implementation included calling arch_enter/leave_lazy_mmu() calls. The patch removes the usage of arch_enter/leave_lazy_mmu() because set_pte is not supposed to be used when updating a pte entry. Powerpc architecture uses this rule to skip the expensive tlb invalidate which is not needed when you are setting up the pte for the first time. See commit 56eecdb912b5 ("mm: Use ptep/pmdp_set_numa() for updating _PAGE_NUMA bit") for more details The patch also makes sure we are not using the interface to update a valid/present pte entry by adding VM_WARN_ON check all the ptes we are setting up. Furthermore, we add a comment to set_pte_filter to clarify it can only update folio-related flags and cannot filter pfn specific details in pte filtering. Removal of arch_enter/leave_lazy_mmu() also will avoid nesting of these functions that are not supported. For ex: remap_pte_range() -> arch_enter_lazy_mmu() -> set_ptes() -> arch_enter_lazy_mmu() -> arch_leave_lazy_mmu() -> arch_leave_lazy_mmu() Fixes: 9fee28baa601 ("powerpc: implement the new page table range API") Signed-off-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231024143604.16749-1-aneesh.kumar@linux.ibm.com
2023-10-23powerpc/mm: Fix boot crash with FLATMEMMichael Ellerman2-1/+2
Erhard reported that his G5 was crashing with v6.6-rc kernels: mpic: Setting up HT PICs workarounds for U3/U4 BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe Faulting instruction address: 0xc00000000005dc40 Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Tainted: G T 6.6.0-rc3-PMacGS #1 Hardware name: PowerMac11,2 PPC970MP 0x440101 PowerMac NIP: c00000000005dc40 LR: c000000000066660 CTR: c000000000007730 REGS: c0000000022bf510 TRAP: 0380 Tainted: G T (6.6.0-rc3-PMacGS) MSR: 9000000000001032 <SF,HV,ME,IR,DR,RI> CR: 44004242 XER: 00000000 IRQMASK: 3 GPR00: 0000000000000000 c0000000022bf7b0 c0000000010c0b00 00000000000001ac GPR04: 0000000003c80000 0000000000000300 c0000000f20001ae 0000000000000300 GPR08: 0000000000000006 feffbb62ffec65ff 0000000000000001 0000000000000000 GPR12: 9000000000001032 c000000002362000 c000000000f76b80 000000000349ecd8 GPR16: 0000000002367ba8 0000000002367f08 0000000000000006 0000000000000000 GPR20: 00000000000001ac c000000000f6f920 c0000000022cd985 000000000000000c GPR24: 0000000000000300 00000003b0a3691d c0003e008030000e 0000000000000000 GPR28: c00000000000000c c0000000f20001ee feffbb62ffec65fe 00000000000001ac NIP hash_page_do_lazy_icache+0x50/0x100 LR __hash_page_4K+0x420/0x590 Call Trace: hash_page_mm+0x364/0x6f0 do_hash_fault+0x114/0x2b0 data_access_common_virt+0x198/0x1f0 --- interrupt: 300 at mpic_init+0x4bc/0x10c4 NIP: c000000002020a5c LR: c000000002020a04 CTR: 0000000000000000 REGS: c0000000022bf9f0 TRAP: 0300 Tainted: G T (6.6.0-rc3-PMacGS) MSR: 9000000000001032 <SF,HV,ME,IR,DR,RI> CR: 24004248 XER: 00000000 DAR: c0003e008030000e DSISR: 40000000 IRQMASK: 1 ... NIP mpic_init+0x4bc/0x10c4 LR mpic_init+0x464/0x10c4 --- interrupt: 300 pmac_setup_one_mpic+0x258/0x2dc pmac_pic_init+0x28c/0x3d8 init_IRQ+0x90/0x140 start_kernel+0x57c/0x78c start_here_common+0x1c/0x20 A bisect pointed to the breakage beginning with commit 9fee28baa601 ("powerpc: implement the new page table range API"). Analysis of the oops pointed to a struct page with a corrupted compound_head being loaded via page_folio() -> _compound_head() in hash_page_do_lazy_icache(). The access by the mpic code is to an MMIO address, so the expectation is that the struct page for that address would be initialised by init_unavailable_range(), as pointed out by Aneesh. Instrumentation showed that was not the case, which eventually lead to the realisation that pfn_valid() was returning false for that address, causing the struct page to not be initialised. Because the system is using FLATMEM, the version of pfn_valid() in memory_model.h is used: static inline int pfn_valid(unsigned long pfn) { ... return pfn >= pfn_offset && (pfn - pfn_offset) < max_mapnr; } Which relies on max_mapnr being initialised. Early in boot max_mapnr is zero meaning no PFNs are valid. max_mapnr is initialised in mem_init() called via: start_kernel() mm_core_init() # init/main.c:928 mem_init() But that is too late for the usage in init_unavailable_range() called via: start_kernel() setup_arch() # init/main.c:893 paging_init() free_area_init() init_unavailable_range() Although max_mapnr is currently set in mem_init(), the value is actually already available much earlier, as soon as mem_topology_setup() has completed, which is also before paging_init() is called. So move the initialisation there, which causes paging_init() to correctly initialise the struct page and fixes the bug. This bug seems to have been lurking for years, but went unnoticed because the pre-folio code was inspecting the uninitialised page->flags but not dereferencing it. Thanks to Erhard and Aneesh for help debugging. Reported-by: Erhard Furtner <erhard_f@mailbox.org> Closes: https://lore.kernel.org/all/20230929132750.3cd98452@yea/ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231023112500.1550208-1-mpe@ellerman.id.au
2023-10-22Merge tag 'powerpc-6.6-5' of ↵Linus Torvalds3-9/+5
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix stale propagated yield_cpu in qspinlocks leading to lockups - Fix broken hugepages on some configs due to ARCH_FORCE_MAX_ORDER - Fix a spurious warning when copros are in use at exit time Thanks to Nicholas Piggin, Christophe Leroy, Nysal Jan K.A Sachin Sant, and Shrikanth Hegde. * tag 'powerpc-6.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/qspinlock: Fix stale propagated yield_cpu powerpc/64s/radix: Don't warn on copros in radix__tlb_flush() powerpc/mm: Allow ARCH_FORCE_MAX_ORDER up to 12
2023-10-18powerpc/qspinlock: Fix stale propagated yield_cpuNicholas Piggin1-0/+3
yield_cpu is a sample of a preempted lock holder that gets propagated back through the queue. Queued waiters use this to yield to the preempted lock holder without continually sampling the lock word (which would defeat the purpose of MCS queueing by bouncing the cache line). The problem is that yield_cpu can become stale. It can take some time to be passed down the chain, and if any queued waiter gets preempted then it will cease to propagate the yield_cpu to later waiters. This can result in yielding to a CPU that no longer holds the lock, which is bad, but particularly if it is currently in H_CEDE (idle), then it appears to be preempted and some hypervisors (PowerVM) can cause very long H_CONFER latencies waiting for H_CEDE wakeup. This results in latency spikes and hard lockups on oversubscribed partitions with lock contention. This is a minimal fix. Before yielding to yield_cpu, sample the lock word to confirm yield_cpu is still the owner, and bail out of it is not. Thanks to a bunch of people who reported this and tracked down the exact problem using tracepoints and dispatch trace logs. Fixes: 28db61e207ea ("powerpc/qspinlock: allow propagation of yield CPU down the queue") Cc: stable@vger.kernel.org # v6.2+ Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reported-by: Laurent Dufour <ldufour@linux.ibm.com> Reported-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com> Debugged-by: "Nysal Jan K.A" <nysal@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231016124305.139923-2-npiggin@gmail.com
2023-10-18powerpc/64s/radix: Don't warn on copros in radix__tlb_flush()Michael Ellerman1-8/+1
Sachin reported a warning when running the inject-ra-err selftest: # selftests: powerpc/mce: inject-ra-err Disabling lock debugging due to kernel taint MCE: CPU19: machine check (Severe) Real address Load/Store (foreign/control memory) [Not recovered] MCE: CPU19: PID: 5254 Comm: inject-ra-err NIP: [0000000010000e48] MCE: CPU19: Initiator CPU MCE: CPU19: Unknown ------------[ cut here ]------------ WARNING: CPU: 19 PID: 5254 at arch/powerpc/mm/book3s64/radix_tlb.c:1221 radix__tlb_flush+0x160/0x180 CPU: 19 PID: 5254 Comm: inject-ra-err Kdump: loaded Tainted: G M E 6.6.0-rc3-00055-g9ed22ae6be81 #4 Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries ... NIP radix__tlb_flush+0x160/0x180 LR radix__tlb_flush+0x104/0x180 Call Trace: radix__tlb_flush+0xf4/0x180 (unreliable) tlb_finish_mmu+0x15c/0x1e0 exit_mmap+0x1a0/0x510 __mmput+0x60/0x1e0 exit_mm+0xdc/0x170 do_exit+0x2bc/0x5a0 do_group_exit+0x4c/0xc0 sys_exit_group+0x28/0x30 system_call_exception+0x138/0x330 system_call_vectored_common+0x15c/0x2ec And bisected it to commit e43c0a0c3c28 ("powerpc/64s/radix: combine final TLB flush and lazy tlb mm shootdown IPIs"), which added a warning in radix__tlb_flush() if mm->context.copros is still elevated. However it's possible for the copros count to be elevated if a process exits without first closing file descriptors that are associated with a copro, eg. VAS. If the process exits with a VAS file still open, the release callback is queued up for exit_task_work() via: exit_files() put_files_struct() close_files() filp_close() fput() And called via: exit_task_work() ____fput() __fput() file->f_op->release(inode, file) coproc_release() vas_user_win_ops->close_win() vas_deallocate_window() mm_context_remove_vas_window() mm_context_remove_copro() But that is after exit_mm() has been called from do_exit() and triggered the warning. Fix it by dropping the warning, and always calling __flush_all_mm(). In the normal case of no copros, that will result in a call to _tlbiel_pid(mm->context.id, RIC_FLUSH_ALL) just as the current code does. If the copros count is elevated then it will cause a global flush, which should flush translations from any copros. Note that the process table entry was cleared in arch_exit_mmap(), so copros should not be able to fetch any new translations. Fixes: e43c0a0c3c28 ("powerpc/64s/radix: combine final TLB flush and lazy tlb mm shootdown IPIs") Reported-by: Sachin Sant <sachinp@linux.ibm.com> Closes: https://lore.kernel.org/all/A8E52547-4BF1-47CE-8AEA-BC5A9D7E3567@linux.ibm.com/ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Link: https://msgid.link/20231017121527.1574104-1-mpe@ellerman.id.au
2023-10-15Merge tag 'powerpc-6.6-4' of ↵Linus Torvalds6-12/+17
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix softlockup/crash when using hcall tracing - Fix pte_access_permitted() for PAGE_NONE on 8xx - Fix inverted pte_young() test in __ptep_test_and_clear_young() on 64-bit BookE - Fix unhandled math emulation exception on 85xx - Fix kernel crash on syscall return on 476 Thanks to Athira Rajeev, Christophe Leroy, Eddie James, and Naveen N Rao. * tag 'powerpc-6.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/47x: Fix 47x syscall return crash powerpc/85xx: Fix math emulation exception powerpc/64e: Fix wrong test in __ptep_test_and_clear_young() powerpc/8xx: Fix pte_access_permitted() for PAGE_NONE powerpc/pseries: Remove unused r0 in the hcall tracing code powerpc/pseries: Fix STK_PARAM access in the hcall tracing code
2023-10-15powerpc/mm: Allow ARCH_FORCE_MAX_ORDER up to 12Michael Ellerman1-1/+1
Christophe reported that the change to ARCH_FORCE_MAX_ORDER to limit the range to 10 had broken his ability to configure hugepages: # echo 1 > /sys/kernel/mm/hugepages/hugepages-8192kB/nr_hugepages sh: write error: Invalid argument Several of the powerpc defconfigs previously set the ARCH_FORCE_MAX_ORDER value to 12, via the definition in arch/powerpc/configs/fsl-emb-nonhw.config, used by: mpc85xx_defconfig mpc85xx_smp_defconfig corenet32_smp_defconfig corenet64_smp_defconfig mpc86xx_defconfig mpc86xx_smp_defconfig Fix it by increasing the allowed range to 12 to restore the previous behaviour. Fixes: 358e526a1648 ("powerpc/mm: Reinstate ARCH_FORCE_MAX_ORDER ranges") Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu> Closes: https://lore.kernel.org/all/8011d806-5b30-bf26-2bfe-a08c39d57e20@csgroup.eu/ Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230824122849.942072-1-mpe@ellerman.id.au
2023-10-11powerpc/47x: Fix 47x syscall return crashMichael Ellerman1-3/+5
Eddie reported that newer kernels were crashing during boot on his 476 FSP2 system: kernel tried to execute user page (b7ee2000) - exploit attempt? (uid: 0) BUG: Unable to handle kernel instruction fetch Faulting instruction address: 0xb7ee2000 Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=4K FSP-2 Modules linked in: CPU: 0 PID: 61 Comm: mount Not tainted 6.1.55-d23900f.ppcnf-fsp2 #1 Hardware name: ibm,fsp2 476fpe 0x7ff520c0 FSP-2 NIP:  b7ee2000 LR: 8c008000 CTR: 00000000 REGS: bffebd83 TRAP: 0400   Not tainted (6.1.55-d23900f.ppcnf-fs p2) MSR:  00000030 <IR,DR>  CR: 00001000  XER: 20000000 GPR00: c00110ac bffebe63 bffebe7e bffebe88 8c008000 00001000 00000d12 b7ee2000 GPR08: 00000033 00000000 00000000 c139df10 48224824 1016c314 10160000 00000000 GPR16: 10160000 10160000 00000008 00000000 10160000 00000000 10160000 1017f5b0 GPR24: 1017fa50 1017f4f0 1017fa50 1017f740 1017f630 00000000 00000000 1017f4f0 NIP [b7ee2000] 0xb7ee2000 LR [8c008000] 0x8c008000 Call Trace: Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX ---[ end trace 0000000000000000 ]--- The problem is in ret_from_syscall where the check for icache_44x_need_flush is done. When the flush is needed the code jumps out-of-line to do the flush, and then intends to jump back to continue the syscall return. However the branch back to label 1b doesn't return to the correct location, instead branching back just prior to the return to userspace, causing bogus register values to be used by the rfi. The breakage was introduced by commit 6f76a01173cc ("powerpc/syscall: implement system call entry/exit logic in C for PPC32") which inadvertently removed the "1" label and reused it elsewhere. Fix it by adding named local labels in the correct locations. Note that the return label needs to be outside the ifdef so that CONFIG_PPC_47x=n compiles. Fixes: 6f76a01173cc ("powerpc/syscall: implement system call entry/exit logic in C for PPC32") Cc: stable@vger.kernel.org # v5.12+ Reported-by: Eddie James <eajames@linux.ibm.com> Tested-by: Eddie James <eajames@linux.ibm.com> Link: https://lore.kernel.org/linuxppc-dev/fdaadc46-7476-9237-e104-1d2168526e72@linux.ibm.com/ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://msgid.link/20231010114750.847794-1-mpe@ellerman.id.au
2023-10-10powerpc/85xx: Fix math emulation exceptionChristophe Leroy1-1/+1
Booting mpc85xx_defconfig kernel on QEMU leads to: Bad trap at PC: fe9bab0, SR: 2d000, vector=800 awk[82]: unhandled trap (5) at 0 nip fe9bab0 lr fe9e01c code 5 in libc-2.27.so[fe5a000+17a000] awk[82]: code: 3aa00000 3a800010 4bffe03c 9421fff0 7ca62b78 38a00000 93c10008 83c10008 awk[82]: code: 38210010 4bffdec8 9421ffc0 7c0802a6 <fc00048e> d8010008 4815190d 93810030 Trace/breakpoint trap WARNING: no useful console This is because allthough CONFIG_MATH_EMULATION is selected, Exception 800 calls unknown_exception(). Call emulation_assist_interrupt() instead. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/066caa6d9480365da9b8ed83692d7101e10ac5f8.1695657339.git.christophe.leroy@csgroup.eu
2023-10-09powerpc/64e: Fix wrong test in __ptep_test_and_clear_young()Christophe Leroy1-1/+1
Commit 45201c879469 ("powerpc/nohash: Remove hash related code from nohash headers.") replaced: if ((pte_val(*ptep) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0) return 0; By: if (pte_young(*ptep)) return 0; But it should be: if (!pte_young(*ptep)) return 0; Fix it. Fixes: 45201c879469 ("powerpc/nohash: Remove hash related code from nohash headers.") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/8bb7f06494e21adada724ede47a4c3d97e879d40.1695659959.git.christophe.leroy@csgroup.eu
2023-10-09powerpc/8xx: Fix pte_access_permitted() for PAGE_NONEChristophe Leroy2-0/+9
On 8xx, PAGE_NONE is handled by setting _PAGE_NA instead of clearing _PAGE_USER. But then pte_user() returns 1 also for PAGE_NONE. As _PAGE_NA prevent reads, add a specific version of pte_read() that returns 0 when _PAGE_NA is set instead of always returning 1. Fixes: 351750331fc1 ("powerpc/mm: Introduce _PAGE_NA") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/57bcfbe578e43123f9ed73e040229b80f1ad56ec.1695659959.git.christophe.leroy@csgroup.eu
2023-10-01Merge tag 'mm-hotfixes-stable-2023-10-01-08-34' of ↵Linus Torvalds5-5/+12
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Fourteen hotfixes, eleven of which are cc:stable. The remainder pertain to issues which were introduced after 6.5" * tag 'mm-hotfixes-stable-2023-10-01-08-34' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: Crash: add lock to serialize crash hotplug handling selftests/mm: fix awk usage in charge_reserved_hugetlb.sh and hugetlb_reparenting_test.sh that may cause error mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified mm/damon/vaddr-test: fix memory leak in damon_do_test_apply_three_regions() mm, memcg: reconsider kmem.limit_in_bytes deprecation mm: zswap: fix potential memory corruption on duplicate store arm64: hugetlb: fix set_huge_pte_at() to work with all swap entries mm: hugetlb: add huge page size param to set_huge_pte_at() maple_tree: add MAS_UNDERFLOW and MAS_OVERFLOW states maple_tree: add mas_is_active() to detect in-tree walks nilfs2: fix potential use after free in nilfs_gccache_submit_read_data() mm: abstract moving to the next PFN mm: report success more often from filemap_map_folio_range() fs: binfmt_elf_efpic: fix personality for ELF-FDPIC
2023-09-30powerpc/pseries: Remove unused r0 in the hcall tracing codeAthira Rajeev1-4/+0
In the plpar_hcall trace code, currently we use r0 to store the value of r4. But this value is not used subsequently in the code. Hence remove this unused save to r0 in plpar_hcall and plpar_hcall9 Suggested-by: Naveen N Rao <naveen@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230929172337.7906-2-atrajeev@linux.vnet.ibm.com
2023-09-30powerpc/pseries: Fix STK_PARAM access in the hcall tracing codeAthira Rajeev1-3/+1
In powerpc pseries system, below behaviour is observed while enabling tracing on hcall: # cd /sys/kernel/debug/tracing/ # cat events/powerpc/hcall_exit/enable 0 # echo 1 > events/powerpc/hcall_exit/enable # ls -bash: fork: Bad address Above is from power9 lpar with latest kernel. Past this, softlockup is observed. Initially while attempting via perf_event_open to use "PERF_TYPE_TRACEPOINT", kernel panic was observed. perf config used: ================ memset(&pe[1],0,sizeof(struct perf_event_attr)); pe[1].type=PERF_TYPE_TRACEPOINT; pe[1].size=96; pe[1].config=0x26ULL; /* 38 raw_syscalls/sys_exit */ pe[1].sample_type=0; /* 0 */ pe[1].read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING|PERF_FORMAT_ID|PERF_FORMAT_GROUP|0x10ULL; /* 1f */ pe[1].inherit=1; pe[1].precise_ip=0; /* arbitrary skid */ pe[1].wakeup_events=0; pe[1].bp_type=HW_BREAKPOINT_EMPTY; pe[1].config1=0x1ULL; Kernel panic logs: ================== Kernel attempted to read user page (8) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000008 Faulting instruction address: 0xc0000000004c2814 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: nfnetlink bonding tls rfkill sunrpc dm_service_time dm_multipath pseries_rng xts vmx_crypto xfs libcrc32c sd_mod t10_pi crc64_rocksoft crc64 sg ibmvfc scsi_transport_fc ibmveth dm_mirror dm_region_hash dm_log dm_mod fuse CPU: 0 PID: 1431 Comm: login Not tainted 6.4.0+ #1 Hardware name: IBM,8375-42A POWER9 (raw) 0x4e0202 0xf000005 of:IBM,FW950.30 (VL950_892) hv:phyp pSeries NIP page_remove_rmap+0x44/0x320 LR wp_page_copy+0x384/0xec0 Call Trace: 0xc00000001416e400 (unreliable) wp_page_copy+0x384/0xec0 __handle_mm_fault+0x9d4/0xfb0 handle_mm_fault+0xf0/0x350 ___do_page_fault+0x48c/0xc90 hash__do_page_fault+0x30/0x70 do_hash_fault+0x1a4/0x330 data_access_common_virt+0x198/0x1f0 --- interrupt: 300 at 0x7fffae971abc git bisect tracked this down to below commit: 'commit baa49d81a94b ("powerpc/pseries: hvcall stack frame overhead")' This commit changed STACK_FRAME_OVERHEAD (112 ) to STACK_FRAME_MIN_SIZE (32 ) since 32 bytes is the minimum size for ELFv2 stack. With the latest kernel, when running on ELFv2, STACK_FRAME_MIN_SIZE is used to allocate stack size. During plpar_hcall_trace, first call is made to HCALL_INST_PRECALL which saves the registers and allocates new stack frame. In the plpar_hcall_trace code, STK_PARAM is accessed at two places. 1. To save r4: std r4,STK_PARAM(R4)(r1) 2. To access r4 back: ld r12,STK_PARAM(R4)(r1) HCALL_INST_PRECALL precall allocates a new stack frame. So all the stack parameter access after the precall, needs to be accessed with +STACK_FRAME_MIN_SIZE. So the store instruction should be: std r4,STACK_FRAME_MIN_SIZE+STK_PARAM(R4)(r1) If the "std" is not updated with STACK_FRAME_MIN_SIZE, we will end up with overwriting stack contents and cause corruption. But instead of updating 'std', we can instead remove it since HCALL_INST_PRECALL already saves it to the correct location. similarly load instruction should be: ld r12,STACK_FRAME_MIN_SIZE+STK_PARAM(R4)(r1) Fix the load instruction to correctly access the stack parameter with +STACK_FRAME_MIN_SIZE and remove the store of r4 since the precall saves it correctly. Cc: stable@vger.kernel.org # v6.2+ Fixes: baa49d81a94b ("powerpc/pseries: hvcall stack frame overhead") Co-developed-by: Naveen N Rao <naveen@kernel.org> Signed-off-by: Naveen N Rao <naveen@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230929172337.7906-1-atrajeev@linux.vnet.ibm.com
2023-09-30mm: hugetlb: add huge page size param to set_huge_pte_at()Ryan Roberts5-5/+12
Patch series "Fix set_huge_pte_at() panic on arm64", v2. This series fixes a bug in arm64's implementation of set_huge_pte_at(), which can result in an unprivileged user causing a kernel panic. The problem was triggered when running the new uffd poison mm selftest for HUGETLB memory. This test (and the uffd poison feature) was merged for v6.5-rc7. Ideally, I'd like to get this fix in for v6.6 and I've cc'ed stable (correctly this time) to get it backported to v6.5, where the issue first showed up. Description of Bug ================== arm64's huge pte implementation supports multiple huge page sizes, some of which are implemented in the page table with multiple contiguous entries. So set_huge_pte_at() needs to work out how big the logical pte is, so that it can also work out how many physical ptes (or pmds) need to be written. It previously did this by grabbing the folio out of the pte and querying its size. However, there are cases when the pte being set is actually a swap entry. But this also used to work fine, because for huge ptes, we only ever saw migration entries and hwpoison entries. And both of these types of swap entries have a PFN embedded, so the code would grab that and everything still worked out. But over time, more calls to set_huge_pte_at() have been added that set swap entry types that do not embed a PFN. And this causes the code to go bang. The triggering case is for the uffd poison test, commit 99aa77215ad0 ("selftests/mm: add uffd unit test for UFFDIO_POISON"), which causes a PTE_MARKER_POISONED swap entry to be set, coutesey of commit 8a13897fb0da ("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs") - added in v6.5-rc7. Although review shows that there are other call sites that set PTE_MARKER_UFFD_WP (which also has no PFN), these don't trigger on arm64 because arm64 doesn't support UFFD WP. If CONFIG_DEBUG_VM is enabled, we do at least get a BUG(), but otherwise, it will dereference a bad pointer in page_folio(): static inline struct folio *hugetlb_swap_entry_to_folio(swp_entry_t entry) { VM_BUG_ON(!is_migration_entry(entry) && !is_hwpoison_entry(entry)); return page_folio(pfn_to_page(swp_offset_pfn(entry))); } Fix === The simplest fix would have been to revert the dodgy cleanup commit 18f3962953e4 ("mm: hugetlb: kill set_huge_swap_pte_at()"), but since things have moved on, this would have required an audit of all the new set_huge_pte_at() call sites to see if they should be converted to set_huge_swap_pte_at(). As per the original intent of the change, it would also leave us open to future bugs when people invariably get it wrong and call the wrong helper. So instead, I've added a huge page size parameter to set_huge_pte_at(). This means that the arm64 code has the size in all cases. It's a bigger change, due to needing to touch the arches that implement the function, but it is entirely mechanical, so in my view, low risk. I've compile-tested all touched arches; arm64, parisc, powerpc, riscv, s390, sparc (and additionally x86_64). I've additionally booted and run mm selftests against arm64, where I observe the uffd poison test is fixed, and there are no other regressions. This patch (of 2): In order to fix a bug, arm64 needs to be told the size of the huge page for which the pte is being set in set_huge_pte_at(). Provide for this by adding an `unsigned long sz` parameter to the function. This follows the same pattern as huge_pte_clear(). This commit makes the required interface modifications to the core mm as well as all arches that implement this function (arm64, parisc, powerpc, riscv, s390, sparc). The actual arm64 bug will be fixed in a separate commit. No behavioral changes intended. Link: https://lkml.kernel.org/r/20230922115804.2043771-1-ryan.roberts@arm.com Link: https://lkml.kernel.org/r/20230922115804.2043771-2-ryan.roberts@arm.com Fixes: 8a13897fb0da ("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs") Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> [powerpc 8xx] Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> [vmalloc change] Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Xu <peterx@redhat.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> [6.5+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-22powerpc/stacktrace: Fix arch_stack_walk_reliable()Michael Ellerman1-22/+5
The changes to copy_thread() made in commit eed7c420aac7 ("powerpc: copy_thread differentiate kthreads and user mode threads") inadvertently broke arch_stack_walk_reliable() because it has knowledge of the stack layout. Fix it by changing the condition to match the new logic in copy_thread(). The changes make the comments about the stack layout incorrect, rather than rephrasing them just refer the reader to copy_thread(). Also the comment about the stack backchain is no longer true, since commit edbd0387f324 ("powerpc: copy_thread add a back chain to the switch stack frame"), so remove that as well. Fixes: eed7c420aac7 ("powerpc: copy_thread differentiate kthreads and user mode threads") Reported-by: Joe Lawrence <joe.lawrence@redhat.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230921232441.1181843-1-mpe@ellerman.id.au
2023-09-18powerpc/dexcr: Move HASHCHK trap handlerBenjamin Gray1-20/+36
Syzkaller reported a sleep in atomic context bug relating to the HASHCHK handler logic: BUG: sleeping function called from invalid context at arch/powerpc/kernel/traps.c:1518 in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 25040, name: syz-executor preempt_count: 0, expected: 0 RCU nest depth: 0, expected: 0 no locks held by syz-executor/25040. irq event stamp: 34 hardirqs last enabled at (33): [<c000000000048b38>] prep_irq_for_enabled_exit arch/powerpc/kernel/interrupt.c:56 [inline] hardirqs last enabled at (33): [<c000000000048b38>] interrupt_exit_user_prepare_main+0x148/0x600 arch/powerpc/kernel/interrupt.c:230 hardirqs last disabled at (34): [<c00000000003e6a4>] interrupt_enter_prepare+0x144/0x4f0 arch/powerpc/include/asm/interrupt.h:176 softirqs last enabled at (0): [<c000000000281954>] copy_process+0x16e4/0x4750 kernel/fork.c:2436 softirqs last disabled at (0): [<0000000000000000>] 0x0 CPU: 15 PID: 25040 Comm: syz-executor Not tainted 6.5.0-rc5-00001-g3ccdff6bb06d #3 Hardware name: IBM,9105-22A POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1040.00 (NL1040_021) hv:phyp pSeries Call Trace: [c0000000a8247ce0] [c00000000032b0e4] __might_resched+0x3b4/0x400 kernel/sched/core.c:10189 [c0000000a8247d80] [c0000000008c7dc8] __might_fault+0xa8/0x170 mm/memory.c:5853 [c0000000a8247dc0] [c00000000004160c] do_program_check+0x32c/0xb20 arch/powerpc/kernel/traps.c:1518 [c0000000a8247e50] [c000000000009b2c] program_check_common_virt+0x3bc/0x3c0 To determine if a trap was caused by a HASHCHK instruction, we inspect the user instruction that triggered the trap. However this may sleep if the page needs to be faulted in (get_user_instr() reaches __get_user(), which calls might_fault() and triggers the bug message). Move the HASHCHK handler logic to after we allow IRQs, which is fine because we are only interested in HASHCHK if it's a user space trap. Fixes: 5bcba4e6c13f ("powerpc/dexcr: Handle hashchk exception") Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230915034604.45393-1-bgray@linux.ibm.com
2023-09-18powerpc/82xx: Select FSL_SOCChristophe Leroy1-2/+1
It used to be impossible to select CONFIG_CPM2 without selecting CONFIG_FSL_SOC at the same time because CONFIG_CPM2 was dependent on CONFIG_8260 and CONFIG_8260 was selecting CONFIG_FSL_SOC. But after commit eb5aa2137275 ("powerpc/82xx: Remove CONFIG_8260 and CONFIG_8272") CONFIG_CPM2 depends on CONFIG_PPC_82xx instead but CONFIG_PPC_82xx doesn't directly selects CONFIG_FSL_SOC. Fix it by forcing CONFIG_PPC_82xx to select CONFIG_FSL_SOC just like already done by PPC_8xx, PPC_MPC512x, PPC_83xx, PPC_86xx. Reported-by: Randy Dunlap <rdunlap@infradead.org> Fixes: eb5aa2137275 ("powerpc/82xx: Remove CONFIG_8260 and CONFIG_8272") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/7ab513546148ebe33ddd4b0ea92c7bfd3cce3ad7.1694705016.git.christophe.leroy@csgroup.eu
2023-09-18powerpc: Fix build issue with LD_DEAD_CODE_DATA_ELIMINATION and ↵Naveen N Rao1-1/+1
FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY We recently added support for -fpatchable-function-entry and it is enabled by default on ppc32 (ppc64 needs gcc v13.1.0). When building the kernel for ppc32 and also enabling CONFIG_LD_DEAD_CODE_DATA_ELIMINATION, we see the below build error with older gcc versions: powerpc-linux-gnu-ld: init/main.o(__patchable_function_entries): error: need linked-to section for --gc-sections This error is thrown since __patchable_function_entries section would be garbage collected with --gc-sections since it does not reference any other kept sections. This has subsequently been fixed with: https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=b7d072167715829eed0622616f6ae0182900de3e Disable LD_DEAD_CODE_DATA_ELIMINATION for gcc versions before v11.1.0 if using -fpatchable-function-entry to avoid this bug. Fixes: 0f71dcfb4aef ("powerpc/ftrace: Add support for -fpatchable-function-entry") Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Naveen N Rao <naveen@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230913134129.2782088-1-naveen@kernel.org
2023-09-18powerpc/watchpoints: Annotate atomic context in more placesBenjamin Gray1-0/+9
It can be easy to miss that the notifier mechanism invokes the callbacks in an atomic context, so add some comments to that effect on the two handlers we register here. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230829063457.54157-4-bgray@linux.ibm.com
2023-09-18powerpc/watchpoint: Disable pagefaults when getting user instructionBenjamin Gray1-1/+6
This is called in an atomic context, so is not allowed to sleep if a user page needs to be faulted in and has nowhere it can be deferred to. The pagefault_disabled() function is documented as preventing user access methods from sleeping. In practice the page will be mapped in nearly always because we are reading the instruction that just triggered the watchpoint trap. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230829063457.54157-3-bgray@linux.ibm.com
2023-09-18powerpc/watchpoints: Disable preemption in thread_change_pc()Benjamin Gray1-1/+6
thread_change_pc() uses CPU local data, so must be protected from swapping CPUs while it is reading the breakpoint struct. The error is more noticeable after 1e60f3564bad ("powerpc/watchpoints: Track perf single step directly on the breakpoint"), which added an unconditional __this_cpu_read() call in thread_change_pc(). However the existing __this_cpu_read() that runs if a breakpoint does need to be re-inserted has the same issue. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230829063457.54157-2-bgray@linux.ibm.com
2023-09-18powerpc/perf/hv-24x7: Update domain value checkKajol Jain1-1/+1
Valid domain value is in range 1 to HV_PERF_DOMAIN_MAX. Current code has check for domain value greater than or equal to HV_PERF_DOMAIN_MAX. But the check for domain value 0 is missing. Fix this issue by adding check for domain value 0. Before: # ./perf stat -v -e hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ sleep 1 Using CPUID 00800200 Control descriptor is not initialized Error: The sys_perf_event_open() syscall returned with 5 (Input/output error) for event (hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/). /bin/dmesg | grep -i perf may provide additional information. Result from dmesg: [ 37.819387] hv-24x7: hcall failed: [0 0x60040000 0x100 0] => ret 0xfffffffffffffffc (-4) detail=0x2000000 failing ix=0 After: # ./perf stat -v -e hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ sleep 1 Using CPUID 00800200 Control descriptor is not initialized Warning: hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ event is not supported by the kernel. failed to read counter hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ Fixes: ebd4a5a3ebd9 ("powerpc/perf/hv-24x7: Minor improvements") Reported-by: Krishan Gopal Sarawast <krishang@linux.vnet.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Tested-by: Disha Goel <disgoel@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230825055601.360083-1-kjain@linux.ibm.com
2023-09-05Merge tag 'ata-6.6-rc1' of ↵Linus Torvalds1-18/+0
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata updates from Damien Le Moal: - Fix OF include file for ata platform drivers (Rob) - Simplify various ahci, sata and pata platform drivers using the function devm_platform_ioremap_resource() (Yangtao) - Cleanup libata time related argument types (e.g. timeouts values) (Sergey) - Cleanup libata code around error handling as all ata drivers now define a error_handler operation (Hannes and Niklas) - Remove functions intended for libsas that are in fact unused (Niklas) - Change the remove device callback of platform drivers to a null function (Uwe) - Simplify the pata_imx driver using devm_clk_get_enabled() (Li) - Remove old and uinused remnants of the ide code in arm, parisc, powerpc, sparc and m68k architectures and associated drivers (pata_buddha, pata_falcon and pata_gayle) (Geert) - Add missing MODULE_DESCRIPTION() in the sata_gemini and pata_ftide010 drivers (me) - Several fixes for the pata_ep93xx and pata_falcon drivers (Nikita, Michael) - Add Elkhart Lake AHCI controller support to the ahci driver (Werner) - Disable NCQ trim on Micron 1100 drives (Pawel) * tag 'ata-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: (60 commits) ata: libata-core: Disable NCQ_TRIM on Micron 1100 drives ata: ahci: Add Elkhart Lake AHCI controller ata: pata_falcon: add data_swab option to byte-swap disk data ata: pata_falcon: fix IO base selection for Q40 ata: pata_ep93xx: use soc_device_match for UDMA modes ata: pata_ep93xx: fix error return code in probe ata: sata_gemini: Add missing MODULE_DESCRIPTION ata: pata_ftide010: Add missing MODULE_DESCRIPTION m68k: Remove <asm/ide.h> ata: pata_gayle: Remove #include <asm/ide.h> ata: pata_falcon: Remove #include <asm/ide.h> ata: pata_buddha: Remove #include <asm/ide.h> asm-generic: Remove ide_iops.h sparc: Remove <asm/ide.h> powerpc: Remove <asm/ide.h> parisc: Remove <asm/ide.h> ARM: Remove <asm/ide.h> ata: pata_imx: Use helper function devm_clk_get_enabled() ata: sata_rcar: Convert to platform remove callback returning void ata: sata_mv: Convert to platform remove callback returning void ...
2023-09-05Merge tag 'kbuild-v6.6' of ↵Linus Torvalds2-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Enable -Wenum-conversion warning option - Refactor the rpm-pkg target - Fix scripts/setlocalversion to consider annotated tags for rt-kernel - Add a jump key feature for the search menu of 'make nconfig' - Support Qt6 for 'make xconfig' - Enable -Wformat-overflow, -Wformat-truncation, -Wstringop-overflow, and -Wrestrict warnings for W=1 builds - Replace <asm/export.h> with <linux/export.h> for alpha, ia64, and sparc - Support DEB_BUILD_OPTIONS=parallel=N for the debian source package - Refactor scripts/Makefile.modinst and fix some modules_sign issues - Add a new Kconfig env variable to warn symbols that are not defined anywhere - Show help messages of config fragments in 'make help' * tag 'kbuild-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (62 commits) kconfig: fix possible buffer overflow kbuild: Show marked Kconfig fragments in "help" kconfig: add warn-unknown-symbols sanity check kbuild: dummy-tools: make MPROFILE_KERNEL checks work on BE Documentation/llvm: refresh docs modpost: Skip .llvm.call-graph-profile section check kbuild: support modules_sign for external modules as well kbuild: support 'make modules_sign' with CONFIG_MODULE_SIG_ALL=n kbuild: move more module installation code to scripts/Makefile.modinst kbuild: reduce the number of mkdir calls during modules_install kbuild: remove $(MODLIB)/source symlink kbuild: move depmod rule to scripts/Makefile.modinst kbuild: add modules_sign to no-{compiler,sync-config}-targets kbuild: do not run depmod for 'make modules_sign' kbuild: deb-pkg: support DEB_BUILD_OPTIONS=parallel=N in debian/rules alpha: remove <asm/export.h> alpha: replace #include <asm/export.h> with #include <linux/export.h> ia64: remove <asm/export.h> ia64: replace #include <asm/export.h> with #include <linux/export.h> sparc: remove <asm/export.h> ...
2023-09-03kbuild: Show marked Kconfig fragments in "help"Kees Cook2-1/+4
Currently the Kconfig fragments in kernel/configs and arch/*/configs that aren't used internally aren't discoverable through "make help", which consists of hard-coded lists of config fragments. Instead, list all the fragment targets that have a "# Help: " comment prefix so the targets can be generated dynamically. Add logic to the Makefile to search for and display the fragment and comment. Add comments to fragments that are intended to be direct targets. Signed-off-by: Kees Cook <keescook@chromium.org> Co-developed-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-09-02Merge tag 'v6.6-p2' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This fixes a random config build failure on powerpc" * tag 'v6.6-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: powerpc/chacha20,poly1305-p10 - Add dependency on VSX
2023-09-01Merge tag 'tty-6.6-rc1' of ↵Linus Torvalds5-32/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver updates from Greg KH: "Here is the big set of tty and serial driver changes for 6.6-rc1. Lots of cleanups in here this cycle, and some driver updates. Short summary is: - Jiri's continued work to make the tty code and apis be a bit more sane with regards to modern kernel coding style and types - cpm_uart driver updates - n_gsm updates and fixes - meson driver updates - sc16is7xx driver updates - 8250 driver updates for different hardware types - qcom-geni driver fixes - tegra serial driver change - stm32 driver updates - synclink_gt driver cleanups - tty structure size reduction All of these have been in linux-next this week with no reported issues. The last bit of cleanups from Jiri and the tty structure size reduction came in last week, a bit late but as they were just style changes and size reductions, I figured they should get into this merge cycle so that others can work on top of them with no merge conflicts" * tag 'tty-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (199 commits) tty: shrink the size of struct tty_struct by 40 bytes tty: n_tty: deduplicate copy code in n_tty_receive_buf_real_raw() tty: n_tty: extract ECHO_OP processing to a separate function tty: n_tty: unify counts to size_t tty: n_tty: use u8 for chars and flags tty: n_tty: simplify chars_in_buffer() tty: n_tty: remove unsigned char casts from character constants tty: n_tty: move newline handling to a separate function tty: n_tty: move canon handling to a separate function tty: n_tty: use MASK() for masking out size bits tty: n_tty: make n_tty_data::num_overrun unsigned tty: n_tty: use time_is_before_jiffies() in n_tty_receive_overrun() tty: n_tty: use 'num' for writes' counts tty: n_tty: use output character directly tty: n_tty: make flow of n_tty_receive_buf_common() a bool Revert "tty: serial: meson: Add a earlycon for the T7 SoC" Documentation: devices.txt: Fix minors for ttyCPM* Documentation: devices.txt: Remove ttySIOC* Documentation: devices.txt: Remove ttyIOC* serial: 8250_bcm7271: improve bcm7271 8250 port ...
2023-09-01powerpc: Fix pud_mkwrite() definition after pte_mkwrite() API changesIngo Molnar1-1/+1
Fix up missed semantic mis-merge between commits 161e393c0f63 ("mm: Make pte_mkwrite() take a VMA") 27af67f35631 ("powerpc/book3s64/mm: enable transparent pud hugepage") where the newly introduced powerpc use of 'pte_mkwrite()' needs to use the 'novma()' versions as per commit 2f0584f3f4bd ("mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()"). Fixes: df57721f9a63 ("Merge tag 'x86_shstk_for_6.6-rc1' of [...]") Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-08-31Merge tag 'powerpc-6.6-1' of ↵Linus Torvalds267-2621/+3366
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Add HOTPLUG_SMT support (/sys/devices/system/cpu/smt) and honour the configured SMT state when hotplugging CPUs into the system - Combine final TLB flush and lazy TLB mm shootdown IPIs when using the Radix MMU to avoid a broadcast TLBIE flush on exit - Drop the exclusion between ptrace/perf watchpoints, and drop the now unused associated arch hooks - Add support for the "nohlt" command line option to disable CPU idle - Add support for -fpatchable-function-entry for ftrace, with GCC >= 13.1 - Rework memory block size determination, and support 256MB size on systems with GPUs that have hotpluggable memory - Various other small features and fixes Thanks to Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Benjamin Gray, Christophe Leroy, Frederic Barrat, Gautam Menghani, Geoff Levand, Hari Bathini, Immad Mir, Jialin Zhang, Joel Stanley, Jordan Niethe, Justin Stitt, Kajol Jain, Kees Cook, Krzysztof Kozlowski, Laurent Dufour, Liang He, Linus Walleij, Mahesh Salgaonkar, Masahiro Yamada, Michal Suchanek, Nageswara R Sastry, Nathan Chancellor, Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nick Desaulniers, Omar Sandoval, Randy Dunlap, Reza Arbab, Rob Herring, Russell Currey, Sourabh Jain, Thomas Gleixner, Trevor Woerner, Uwe Kleine-König, Vaibhav Jain, Xiongfeng Wang, Yuan Tan, Zhang Rui, and Zheng Zengkai. * tag 'powerpc-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (135 commits) macintosh/ams: linux/platform_device.h is needed powerpc/xmon: Reapply "Relax frame size for clang" powerpc/mm/book3s64: Use 256M as the upper limit with coherent device memory attached powerpc/mm/book3s64: Fix build error with SPARSEMEM disabled powerpc/iommu: Fix notifiers being shared by PCI and VIO buses powerpc/mpc5xxx: Add missing fwnode_handle_put() powerpc/config: Disable SLAB_DEBUG_ON in skiroot powerpc/pseries: Remove unused hcall tracing instruction powerpc/pseries: Fix hcall tracepoints with JUMP_LABEL=n powerpc: dts: add missing space before { powerpc/eeh: Use pci_dev_id() to simplify the code powerpc/64s: Move CPU -mtune options into Kconfig powerpc/powermac: Fix unused function warning powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPT powerpc: Don't include lppaca.h in paca.h powerpc/pseries: Move hcall_vphn() prototype into vphn.h powerpc/pseries: Move VPHN constants into vphn.h cxl: Drop unused detach_spa() powerpc: Drop zalloc_maybe_bootmem() powerpc/powernv: Use struct opal_prd_msg in more places ...
2023-08-31Merge tag 'x86_shstk_for_6.6-rc1' of ↵Linus Torvalds5-8/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 shadow stack support from Dave Hansen: "This is the long awaited x86 shadow stack support, part of Intel's Control-flow Enforcement Technology (CET). CET consists of two related security features: shadow stacks and indirect branch tracking. This series implements just the shadow stack part of this feature, and just for userspace. The main use case for shadow stack is providing protection against return oriented programming attacks. It works by maintaining a secondary (shadow) stack using a special memory type that has protections against modification. When executing a CALL instruction, the processor pushes the return address to both the normal stack and to the special permission shadow stack. Upon RET, the processor pops the shadow stack copy and compares it to the normal stack copy. For more information, refer to the links below for the earlier versions of this patch set" Link: https://lore.kernel.org/lkml/20220130211838.8382-1-rick.p.edgecombe@intel.com/ Link: https://lore.kernel.org/lkml/20230613001108.3040476-1-rick.p.edgecombe@intel.com/ * tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits) x86/shstk: Change order of __user in type x86/ibt: Convert IBT selftest to asm x86/shstk: Don't retry vm_munmap() on -EINTR x86/kbuild: Fix Documentation/ reference x86/shstk: Move arch detail comment out of core mm x86/shstk: Add ARCH_SHSTK_STATUS x86/shstk: Add ARCH_SHSTK_UNLOCK x86: Add PTRACE interface for shadow stack selftests/x86: Add shadow stack test x86/cpufeatures: Enable CET CR4 bit for shadow stack x86/shstk: Wire in shadow stack interface x86: Expose thread features in /proc/$PID/status x86/shstk: Support WRSS for userspace x86/shstk: Introduce map_shadow_stack syscall x86/shstk: Check that signal frame is shadow stack mem x86/shstk: Check that SSP is aligned on sigreturn x86/shstk: Handle signals for shadow stack x86/shstk: Introduce routines modifying shstk x86/shstk: Handle thread shadow stack x86/shstk: Add user-mode shadow stack support ...
2023-08-30Merge tag 'integrity-v6.6' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity subsystem updates from Mimi Zohar: - With commit 099f26f22f58 ("integrity: machine keyring CA configuration") certificates may be loaded onto the IMA keyring, directly or indirectly signed by keys on either the "builtin" or the "machine" keyrings. With the ability for the system/machine owner to sign the IMA policy itself without needing to recompile the kernel, update the IMA architecture specific policy rules to require the IMA policy itself be signed. [ As commit 099f26f22f58 was upstreamed in linux-6.4, updating the IMA architecture specific policy now to require signed IMA policies may break userspace expectations. ] - IMA only checked the file data hash was not on the system blacklist keyring for files with an appended signature (e.g. kernel modules, Power kernel image). Check all file data hashes regardless of how it was signed - Code cleanup, and a kernel-doc update * tag 'integrity-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: kexec_lock: Replace kexec_mutex() by kexec_lock() in two comments ima: require signed IMA policy when UEFI secure boot is enabled integrity: Always reference the blacklist keyring with appraisal ima: Remove deprecated IMA_TRUSTED_KEYRING Kconfig
2023-08-30crypto: powerpc/chacha20,poly1305-p10 - Add dependency on VSXHerbert Xu1-2/+2
Add dependency on VSX as otherwise the build will fail without it. Fixes: 161fca7e3e90 ("crypto: powerpc - Add chacha20/poly1305-p10 to Kconfig and Makefile") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202308251906.SYawej6g-lkp@intel.com/ Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-30Merge tag 'dma-mapping-6.6-2023-08-29' of ↵Linus Torvalds1-0/+1
git://git.infradead.org/users/hch/dma-mapping Pull dma-maping updates from Christoph Hellwig: - allow dynamic sizing of the swiotlb buffer, to cater for secure virtualization workloads that require all I/O to be bounce buffered (Petr Tesarik) - move a declaration to a header (Arnd Bergmann) - check for memory region overlap in dma-contiguous (Binglei Wang) - remove the somewhat dangerous runtime swiotlb-xen enablement and unexport is_swiotlb_active (Christoph Hellwig, Juergen Gross) - per-node CMA improvements (Yajun Deng) * tag 'dma-mapping-6.6-2023-08-29' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: optimize get_max_slots() swiotlb: move slot allocation explanation comment where it belongs swiotlb: search the software IO TLB only if the device makes use of it swiotlb: allocate a new memory pool when existing pools are full swiotlb: determine potential physical address limit swiotlb: if swiotlb is full, fall back to a transient memory pool swiotlb: add a flag whether SWIOTLB is allowed to grow swiotlb: separate memory pool data from other allocator data swiotlb: add documentation and rename swiotlb_do_find_slots() swiotlb: make io_tlb_default_mem local to swiotlb.c swiotlb: bail out of swiotlb_init_late() if swiotlb is already allocated dma-contiguous: check for memory region overlap dma-contiguous: support numa CMA for specified node dma-contiguous: support per-numa CMA for all architectures dma-mapping: move arch_dma_set_mask() declaration to header swiotlb: unexport is_swiotlb_active x86: always initialize xen-swiotlb when xen-pcifront is enabling xen/pci: add flag for PCI passthrough being possible