summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2026-01-26m68k: sun3: Replace vsprintf() with bounded vsnprintf()Thorsten Blum1-2/+2
vsprintf() performs no bounds checking and can overflow - replace it with the safer vsnprintf(). Also remove the useless '+ 1' that is a leftover of commit 66ed28ea096c ("m68k: sun3: Remove unused vsprintf() return value in prom_printf()"). Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://patch.msgid.link/20260117202152.1036278-2-thorsten.blum@linux.dev Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2026-01-26riscv/mm: update write protect to work on shadow stacksDeepak Gupta1-2/+10
'fork' implements copy-on-write (COW) by making pages readonly in both child and parent. ptep_set_wrprotect() and pte_wrprotect() clear _PAGE_WRITE in PTE. The assumption is that the page is readable and, on a fault, copy-on-write happens. To implement COW on shadow stack pages, clearing the W bit makes them XWR = 000. This will result in the wrong PTE setting, which allows no permissions, but with V=1 and the PFN field pointing to the final page. Instead, the desired behavior is to turn it into a readable page, take an access (load/store) fault on sspush/sspop (shadow stack) and then perform COW on such pages. This way regular reads would still be allowed and not lead to COW maintaining current behavior of COW on non-shadow stack but writeable memory. On the other hand, this doesn't interfere with existing COW for read-write memory. The assumption is always that _PAGE_READ must have been set, and thus, setting _PAGE_READ is harmless. Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-9-b55691eacf4f@rivosinc.com [pjw@kernel.org: clarify patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-26riscv/mm: teach pte_mkwrite to manufacture shadow stack PTEsDeepak Gupta2-0/+23
pte_mkwrite() creates PTEs with WRITE encodings for the underlying architecture. The underlying architecture can have two types of writeable mappings: one that can be written using regular store instructions, and another one that can only be written using specialized store instructions (like shadow stack stores). pte_mkwrite can select write PTE encoding based on VMA range (i.e. VM_SHADOW_STACK) Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-8-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-26riscv/mm: manufacture shadow stack ptesDeepak Gupta1-0/+10
This patch implements the creation of a shadow stack pte on riscv. Creating shadow stack PTE on riscv means clearing RWX and then setting W=1. Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-7-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-26riscv/mm: ensure PROT_WRITE leads to VM_READ | VM_WRITEDeepak Gupta4-1/+38
'arch_calc_vm_prot_bits' is implemented on risc-v to return VM_READ | VM_WRITE if PROT_WRITE is specified. Similarly 'riscv_sys_mmap' is updated to convert all incoming PROT_WRITE to (PROT_WRITE | PROT_READ). This is to make sure that any existing apps using PROT_WRITE still work. Earlier 'protection_map[VM_WRITE]' used to pick read-write PTE encodings. Now 'protection_map[VM_WRITE]' will always pick PAGE_SHADOWSTACK PTE encodings for shadow stack. The above changes ensure that existing apps continue to work because underneath, the kernel will be picking 'protection_map[VM_WRITE|VM_READ]' PTE encodings. Reviewed-by: Zong Li <zong.li@sifive.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-6-b55691eacf4f@rivosinc.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-26riscv: Add usercfi state for task and save/restore of CSR_SSP on trap entry/exitDeepak Gupta5-0/+62
Carve out space in the RISC-V architecture-specific thread struct for cfi status and shadow stack in usermode. This patch: - defines a new structure cfi_status with status bit for cfi feature - defines shadow stack pointer, base and size in cfi_status structure - defines offsets to new member fields in thread in asm-offsets.c - saves and restores shadow stack pointer on trap entry (U --> S) and exit (S --> U) Shadow stack save/restore is gated on feature availability and is implemented using alternatives. CSR_SSP can be context-switched in 'switch_to' as well, but as soon as kernel shadow stack support gets rolled in, the shadow stack pointer will need to be switched at trap entry/exit point (much like 'sp'). It can be argued that a kernel using a shadow stack deployment scenario may not be as prevalent as user mode using this feature. But even if there is some minimal deployment of kernel shadow stack, that means that it needs to be supported. Thus save/restore of shadow stack pointer is implemented in entry.S instead of in 'switch_to.h'. Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Zong Li <zong.li@sifive.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-5-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-26riscv: add Zicfiss / Zicfilp extension CSR and bit definitionsDeepak Gupta1-0/+14
The Zicfiss and Zicfilp extensions are enabled via b3 and b2 in *envcfg CSRs. menvcfg controls enabling for S/HS mode. henvcfg controls enabling for VS. senvcfg controls enabling for U/VU mode. The Zicfilp extension extends *status CSRs to hold an 'expected landing pad' bit. A trap or interrupt can occur between an indirect jmp/call and target instruction. The 'expected landing pad' bit from the CPU is recorded into the xstatus CSR so that when the supervisor performs xret, the 'expected landing pad' state of the CPU can be restored. Zicfiss adds one new CSR, CSR_SSP, which contains the current shadow stack pointer. Signed-off-by: Deepak Gupta <debug@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-4-b55691eacf4f@rivosinc.com [pjw@kernel.org: grouped CSR_SSP macro with the other CSR macros; clarified patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-26riscv: zicfiss / zicfilp enumerationDeepak Gupta3-0/+36
This patch adds support for detecting the RISC-V ISA extensions Zicfiss and Zicfilp. Zicfiss and Zicfilp stand for the unprivileged integer spec extensions for shadow stack and indirect branch tracking, respectively. This patch looks for Zicfiss and Zicfilp in the device tree and accordingly lights up the corresponding bits in the cpu feature bitmap. Furthermore this patch adds detection utility functions to return whether shadow stack or landing pads are supported by the cpu. Reviewed-by: Zong Li <zong.li@sifive.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-3-b55691eacf4f@rivosinc.com [pjw@kernel.org: updated to apply; cleaned up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-26riscv: defconfig: enable NLS_ISO8859_1Javier Carrasco1-1/+1
NLS_ISO8859_1 was enabled as a module with commit efe1e08bca9a ("riscv: defconfig: enable NLS_CODEPAGE_437, NLS_ISO8859_1"), but the NLS_CODEPAGE_437 counterpart is selected as built-in. The commit does not explain the reason behind, and it is not consistent with the defconfig for ARM64 that also enables these modules to mount EFI system partitions. Select NLS_ISO8859_1 as built-in to provide both requirements within the kernel image. Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20260111-nls_iso8859_1_y_riscv-v1-1-2c992bb2c00d@gmail.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-26riscv: mm: define copy_user_page() as copy_page()Florian Schmaus1-2/+1
Currently, the implementation of copy_user_page() is identical to copy_page(). Align riscv with other architectures (alpha, arc, arm64, hexagon, longarch, m68k, openrisc, s390, um, xtensa) and map copy_user_page() to copy_page() given that their implementation is identical. In addition to following a common pattern, this centralizes the implementation. Any changes to the underlying page copy logic (e.g., for CHERI) will now automatically propagate to copy_user_page(). Signed-off-by: Florian Schmaus <florian.schmaus@codasip.com> Link: https://patch.msgid.link/20260113134025.905627-1-florian.schmaus@codasip.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-26errata/sifive: remove unreliable warn_miss_errataAndreas Schwab1-18/+0
When both the SiFive and MIPS errata are enabled then sifive_errata_patch_func emits a wrong and misleading warning claiming that the SiFive errata haven't been applied. This happens because sifive_errata_patch_func is being called twice, once for the kernel image and once for the vdso image. The vdso image has alternative entries for the MIPS errata, but none for the SiFive errata. Signed-off-by: Andreas Schwab <schwab@suse.de> Link: https://patch.msgid.link/mvmv7i8q8gg.fsf@suse.de Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-26riscv: fix minor typo in syscall.h commentAustin Kim1-1/+1
Some developers may be confused because RISC-V does not have a register named r0. Also, orig_r0 is not available in pt_regs structure, which is specific to riscv. So we had better fix this minor typo. Signed-off-by: Austin Kim <austin.kim@lge.com> Link: https://patch.msgid.link/aW3Z4zTBvGJpk7a7@adminpc-PowerEdge-R7525 Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-26riscv: signal: fix some warnings reported by sparsePaul Walmsley1-3/+3
Clean up a few warnings reported by sparse in arch/riscv/kernel/signal.c. These come from code that was added recently; they were missed when I initially reviewed the patch. Fixes: 818d78ba1b3f ("riscv: signal: abstract header saving for setup_sigcontext") Cc: Andy Chiu <andybnac@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202601171848.ydLTJYrz-lkp@intel.com/ [pjw@kernel.org: updated to apply] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-25KVM: arm64: Simplify PAGE_S2_MEMATTRMarc Zyngier2-5/+4
Restore PAGE_S2_MEMATTR() to its former glory, keeping the use of FWB as an implementation detail. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260123191637.715429-6-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-25KVM: arm64: Kill KVM_PGTABLE_S2_NOFWBMarc Zyngier2-20/+8
Nobody is using this flag anymore, so remove it. This allows some cleanup by removing stage2_has_fwb(), which is can be replaced by a direct check on the capability. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260123191637.715429-5-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-25KVM: arm64: Switch pKVM host S2 over to KVM_PGTABLE_S2_AS_S1Marc Zyngier1-1/+3
Since we have the basics to use the S1 memory attributes as the final ones with FWB, flip the host over to that when FWB is present. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Tested-by: Fuad Tabba <tabba@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260123191637.715429-4-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-25KVM: arm64: Add KVM_PGTABLE_S2_AS_S1 flagMarc Zyngier2-1/+15
Plumb the MT_S2{,_FWB}_AS_S1 memory types into the KVM_S2_MEMATTR() macro with a new KVM_PGTABLE_S2_AS_S1 flag. Nobody selects it yet. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260123191637.715429-3-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-25arm64: Add MT_S2{,_FWB}_AS_S1 encodingsMarc Zyngier1-3/+8
pKVM usage of S2 translation on the host is purely for isolation purposes, not translation. To that effect, the memory attributes being used must be that of S1. With FWB=0, this is easily achieved by using the Normal Cacheable type (which is the weakest possible memory type) at S2, and let S1 pick something stronger as required. With FWB=1, the attributes are combined in a different way, and we cannot arbitrarily use Normal Cacheable. We can, however, use a memattr encoding that indicates that the final attributes are that of Stage-1. Add these encoding and a few pointers to the relevant parts of the specification. It might come handy some day. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260123191637.715429-2-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-25Merge tag 'riscv-for-linus-6.19-rc7' of ↵Linus Torvalds4-5/+19
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: "The notable changes here are the three RISC-V timer compare register update sequence patches. These only apply to RV32 systems and are related to the 64-bit timer compare value being split across two separate 32-bit registers. We weren't using the appropriate three-write sequence, documented in the RISC-V ISA specifications, to avoid spurious timer interrupts during the update sequence; so, these patches now use the recommended sequence. This doesn't affect 64-bit RISC-V systems, since the timer compare value fits inside a single register and can be updated with a single write. - Fix the RISC-V timer compare register update sequence on RV32 systems to use the recommended sequence in the RISC-V ISA manual This avoids spurious interrupts during updates - Add a dependence on the new CONFIG_CACHEMAINT_FOR_DMA Kconfig symbol for Renesas and StarFive RISC-V SoCs - Add a temporary workaround for a Clang compiler bug caused by using asm_goto_output for get_user() - Clarify our documentation to specifically state a particular ISA specification version for a chapter number reference" * tag 'riscv-for-linus-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Add intermediate cast to 'unsigned long' in __get_user_asm riscv: Use 64-bit variable for output in __get_user_asm soc: renesas: Fix missing dependency on new CONFIG_CACHEMAINT_FOR_DMA riscv: ERRATA_STARFIVE_JH7100: Fix missing dependency on new CONFIG_CACHEMAINT_FOR_DMA riscv: suspend: Fix stimecmp update hazard on RV32 riscv: kvm: Fix vstimecmp update hazard on RV32 riscv: clocksource: Fix stimecmp update hazard on RV32 Documentation: riscv: uabi: Clarify ISA spec version for canonical order
2026-01-25bpf,x86: add fsession support for x86_64Menglong Dong1-12/+40
Add BPF_TRACE_FSESSION supporting to x86_64, including: 1. clear the return value in the stack before fentry to make the fentry of the fsession can only get 0 with bpf_get_func_ret(). 2. clear all the session cookies' value in the stack. 2. store the index of the cookie to ctx[-1] before the calling to fsession 3. store the "is_return" flag to ctx[-1] before the calling to fexit of the fsession. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Co-developed-by: Leon Hwang <leon.hwang@linux.dev> Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Link: https://lore.kernel.org/r/20260124062008.8657-8-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-25bpf,x86: introduce emit_store_stack_imm64() for trampolineMenglong Dong1-12/+14
Introduce the helper emit_store_stack_imm64(), which is used to store a imm64 to the stack with the help of a register. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20260124062008.8657-7-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-24Merge tag 'perf-urgent-2026-01-24' of ↵Linus Torvalds1-2/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events fixes from Ingo Molnar: - Fix mmap_count warning & bug when creating a group member event with the PERF_FLAG_FD_OUTPUT flag - Disable the sample period == 1 branch events BTS optimization on guests, because BTS is not virtualized * tag 'perf-urgent-2026-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Do not enable BTS for guests perf: Fix refcount warning on event->mmap_count increment
2026-01-24Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds17-38/+73
Pull arm64 kvm fixes from Paolo Bonzini: - Ensure early return semantics are preserved for pKVM fault handlers - Fix case where the kernel runs with the guest's PAN value when CONFIG_ARM64_PAN is not set - Make stage-1 walks to set the access flag respect the access permission of the underlying stage-2, when enabled - Propagate computed FGT values to the pKVM view of the vCPU at vcpu_load() - Correctly program PXN and UXN privilege bits for hVHE's stage-1 page tables - Check that the VM is actually using VGICv3 before accessing the GICv3 CPU interface - Delete some unused code * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: arm64: Invert KVM_PGTABLE_WALK_HANDLE_FAULT to fix pKVM walkers KVM: arm64: Don't blindly set set PSTATE.PAN on guest exit KVM: arm64: nv: Respect stage-2 write permssion when setting stage-1 AF KVM: arm64: Remove unused vcpu_{clear,set}_wfx_traps() KVM: arm64: Remove unused parameter in synchronize_vcpu_pstate() KVM: arm64: Remove extra argument for __pvkm_host_{share,unshare}_hyp() KVM: arm64: Inject UNDEF for a register trap without accessor KVM: arm64: Copy FGT traps to unprotected pKVM VCPU on VCPU load KVM: arm64: Fix EL2 S1 XN handling for hVHE setups KVM: arm64: gic: Check for vGICv3 when clearing TWI
2026-01-24alpha: fix user-space corruption during memory compactionMagnus Lindholm4-3/+148
Alpha systems can suffer sporadic user-space crashes and heap corruption when memory compaction is enabled. Symptoms include SIGSEGV, glibc allocator failures (e.g. "unaligned tcache chunk"), and compiler internal errors. The failures disappear when compaction is disabled or when using global TLB invalidation. The root cause is insufficient TLB shootdown during page migration. Alpha relies on ASN-based MM context rollover for instruction cache coherency, but this alone is not sufficient to prevent stale data or instruction translations from surviving migration. Fix this by introducing a migration-specific helper that combines: - MM context invalidation (ASN rollover), - immediate per-CPU TLB invalidation (TBI), - synchronous cross-CPU shootdown when required. The helper is used only by migration/compaction paths to avoid changing global TLB semantics. Additionally, update flush_tlb_other(), pte_clear(), to use READ_ONCE()/WRITE_ONCE() for correct SMP memory ordering. This fixes observed crashes on both UP and SMP Alpha systems. Reviewed-by: Ivan Kokshaysky <ink@unseen.parts> Tested-by: Matoro Mahri <matoro_mailinglist_kernel@matoro.tk> Tested-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Magnus Lindholm <linmag7@gmail.com> Link: https://lore.kernel.org/r/20260102173603.18247-2-linmag7@gmail.com Signed-off-by: Magnus Lindholm <linmag7@gmail.com>
2026-01-24x86/entry/vdso32: Omit '.cfi_offset eflags' for LLVM < 16Nathan Chancellor1-0/+10
After commit: 884961618ee5 ("x86/entry/vdso32: Remove open-coded DWARF in sigreturn.S") building arch/x86/entry/vdso/vdso32/sigreturn.S with LLVM 15 fails with: <instantiation>:18:20: error: invalid register name .cfi_offset eflags, 64 ^ arch/x86/entry/vdso/vdso32/sigreturn.S:33:2: note: while in macro instantiation STARTPROC_SIGNAL_FRAME 8 ^ Support for eflags as an argument to .cfi_offset was added in the LLVM 16 development cycle: https://github.com/llvm/llvm-project/commit/67bd3c58c0c7389e39c5a2f4d3b1a30459ccf5b7 [1] Only add this .cfi_offset directive if it is supported by the assembler to clear up the error. [ mingo: Tidied up the changelog and the comment a bit ] Fixes: 884961618ee5 ("x86/entry/vdso32: Remove open-coded DWARF in sigreturn.S") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: H. Peter Anvin (Intel) <hpa@zytor.com> Link: https://patch.msgid.link/20260123-x86-vdso32-wa-llvm-15-cfi-offset-eflags-v1-1-0f412e3516a4@kernel.org
2026-01-24Merge tag 'kvmarm-fixes-6.19-1' of ↵Paolo Bonzini17-38/+73
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.19 - Ensure early return semantics are preserved for pKVM fault handlers - Fix case where the kernel runs with the guest's PAN value when CONFIG_ARM64_PAN is not set - Make stage-1 walks to set the access flag respect the access permission of the underlying stage-2, when enabled - Propagate computed FGT values to the pKVM view of the vCPU at vcpu_load() - Correctly program PXN and UXN privilege bits for hVHE's stage-1 page tables - Check that the VM is actually using VGICv3 before accessing the GICv3 CPU interface - Delete some unused code
2026-01-24KVM: x86: Add SRCU protection for reading PDPTRs in __get_sregs2()Vasiliy Kovalev1-0/+2
Add SRCU read-side protection when reading PDPTR registers in __get_sregs2(). Reading PDPTRs may trigger access to guest memory: kvm_pdptr_read() -> svm_cache_reg() -> load_pdptrs() -> kvm_vcpu_read_guest_page() -> kvm_vcpu_gfn_to_memslot() kvm_vcpu_gfn_to_memslot() dereferences memslots via __kvm_memslots(), which uses srcu_dereference_check() and requires either kvm->srcu or kvm->slots_lock to be held. Currently only vcpu->mutex is held, triggering lockdep warning: ============================= WARNING: suspicious RCU usage in kvm_vcpu_gfn_to_memslot 6.12.59+ #3 Not tainted include/linux/kvm_host.h:1062 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by syz.5.1717/15100: #0: ff1100002f4b00b0 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x1d5/0x1590 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0xf0/0x120 lib/dump_stack.c:120 lockdep_rcu_suspicious+0x1e3/0x270 kernel/locking/lockdep.c:6824 __kvm_memslots include/linux/kvm_host.h:1062 [inline] __kvm_memslots include/linux/kvm_host.h:1059 [inline] kvm_vcpu_memslots include/linux/kvm_host.h:1076 [inline] kvm_vcpu_gfn_to_memslot+0x518/0x5e0 virt/kvm/kvm_main.c:2617 kvm_vcpu_read_guest_page+0x27/0x50 virt/kvm/kvm_main.c:3302 load_pdptrs+0xff/0x4b0 arch/x86/kvm/x86.c:1065 svm_cache_reg+0x1c9/0x230 arch/x86/kvm/svm/svm.c:1688 kvm_pdptr_read arch/x86/kvm/kvm_cache_regs.h:141 [inline] __get_sregs2 arch/x86/kvm/x86.c:11784 [inline] kvm_arch_vcpu_ioctl+0x3e20/0x4aa0 arch/x86/kvm/x86.c:6279 kvm_vcpu_ioctl+0x856/0x1590 virt/kvm/kvm_main.c:4663 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl fs/ioctl.c:893 [inline] __x64_sys_ioctl+0x18b/0x210 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xbd/0x1d0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Suggested-by: Sean Christopherson <seanjc@google.com> Cc: stable@vger.kernel.org Fixes: 6dba94035203 ("KVM: x86: Introduce KVM_GET_SREGS2 / KVM_SET_SREGS2") Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org> Link: https://patch.msgid.link/20260123222801.646123-1-kovalev@altlinux.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-24Merge tag 's390-6.19-4' of ↵Linus Torvalds2-9/+10
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Heiko Carstens: - Add $(DISABLE_KSTACK_ERASE) to vdso compile flags to fix compile errors with old gcc versions - Fix path to s390 chacha implementation in vdso selftests, after vdso64 has been renamed to vdso - Fix off-by-one bug in APQN limit calculation - Discard .modinfo section from decompressor image to fix SecureBoot * tag 's390-6.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/boot/vmlinux.lds.S: Ensure bzImage ends with SecureBoot trailer s390/ap: Fix wrong APQN fill calculation selftests: vDSO: getrandom: Fix path to s390 chacha implementation s390/vdso: Disable kstack erase
2026-01-24Merge tag 'arm64-fixes' of ↵Linus Torvalds3-21/+33
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - A set of fixes for FPSIMD/SVE/SME state management (around signal handling and ptrace) where a task can be placed in an invalid state - __nocfi added to swsusp_arch_resume() to avoid a data abort on resuming from hibernate * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Set __nocfi on swsusp_arch_resume() arm64/fpsimd: signal: Fix restoration of SVE context arm64/fpsimd: signal: Allocate SSVE storage when restoring ZA arm64/fpsimd: ptrace: Fix SVE writes on !SME systems
2026-01-24Merge branch 'for-7.0/cxl-init' into cxl-for-nextDave Jiang1-4/+11
Merge in patches to support several patch series such as Soft Reserve handling, type2 accelerator enabling, and LSA 2.1 labeling support. Mainly addition of cxl_memdev_attach() to allow the memdev probe to make a decision of proceed/fail depending success of CXL topology enumeration. dax/hmem, e820, resource: Defer Soft Reserved insertion until hmem is ready cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation cxl/mem: Drop @host argument to devm_cxl_add_memdev() cxl/mem: Convert devm_cxl_add_memdev() to scope-based-cleanup cxl/port: Arrange for always synchronous endpoint attach cxl/mem: Arrange for always-synchronous memdev attach cxl/mem: Fix devm_cxl_memdev_edac_release() confusion
2026-01-23arm64: Set __nocfi on swsusp_arch_resume()Zhaoyang Huang1-1/+1
A DABT is reported[1] on an android based system when resume from hiberate. This happens because swsusp_arch_suspend_exit() is marked with SYM_CODE_*() and does not have a CFI hash, but swsusp_arch_resume() will attempt to verify the CFI hash when calling a copy of swsusp_arch_suspend_exit(). Given that there's an existing requirement that the entrypoint to swsusp_arch_suspend_exit() is the first byte of the .hibernate_exit.text section, we cannot fix this by marking swsusp_arch_suspend_exit() with SYM_FUNC_*(). The simplest fix for now is to disable the CFI check in swsusp_arch_resume(). Mark swsusp_arch_resume() as __nocfi to disable the CFI check. [1] [ 22.991934][ T1] Unable to handle kernel paging request at virtual address 0000000109170ffc [ 22.991934][ T1] Mem abort info: [ 22.991934][ T1] ESR = 0x0000000096000007 [ 22.991934][ T1] EC = 0x25: DABT (current EL), IL = 32 bits [ 22.991934][ T1] SET = 0, FnV = 0 [ 22.991934][ T1] EA = 0, S1PTW = 0 [ 22.991934][ T1] FSC = 0x07: level 3 translation fault [ 22.991934][ T1] Data abort info: [ 22.991934][ T1] ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000 [ 22.991934][ T1] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 22.991934][ T1] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 22.991934][ T1] [0000000109170ffc] user address but active_mm is swapper [ 22.991934][ T1] Internal error: Oops: 0000000096000007 [#1] PREEMPT SMP [ 22.991934][ T1] Dumping ftrace buffer: [ 22.991934][ T1] (ftrace buffer empty) [ 22.991934][ T1] Modules linked in: [ 22.991934][ T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.6.98-android15-8-g0b1d2aee7fc3-dirty-4k #1 688c7060a825a3ac418fe53881730b355915a419 [ 22.991934][ T1] Hardware name: Unisoc UMS9360-base Board (DT) [ 22.991934][ T1] pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 22.991934][ T1] pc : swsusp_arch_resume+0x2ac/0x344 [ 22.991934][ T1] lr : swsusp_arch_resume+0x294/0x344 [ 22.991934][ T1] sp : ffffffc08006b960 [ 22.991934][ T1] x29: ffffffc08006b9c0 x28: 0000000000000000 x27: 0000000000000000 [ 22.991934][ T1] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000820 [ 22.991934][ T1] x23: ffffffd0817e3000 x22: ffffffd0817e3000 x21: 0000000000000000 [ 22.991934][ T1] x20: ffffff8089171000 x19: ffffffd08252c8c8 x18: ffffffc080061058 [ 22.991934][ T1] x17: 00000000529c6ef0 x16: 00000000529c6ef0 x15: 0000000000000004 [ 22.991934][ T1] x14: ffffff8178c88000 x13: 0000000000000006 x12: 0000000000000000 [ 22.991934][ T1] x11: 0000000000000015 x10: 0000000000000001 x9 : ffffffd082533000 [ 22.991934][ T1] x8 : 0000000109171000 x7 : 205b5d3433393139 x6 : 392e32322020205b [ 22.991934][ T1] x5 : 000000010916f000 x4 : 000000008164b000 x3 : ffffff808a4e0530 [ 22.991934][ T1] x2 : ffffffd08058e784 x1 : 0000000082326000 x0 : 000000010a283000 [ 22.991934][ T1] Call trace: [ 22.991934][ T1] swsusp_arch_resume+0x2ac/0x344 [ 22.991934][ T1] hibernation_restore+0x158/0x18c [ 22.991934][ T1] load_image_and_restore+0xb0/0xec [ 22.991934][ T1] software_resume+0xf4/0x19c [ 22.991934][ T1] software_resume_initcall+0x34/0x78 [ 22.991934][ T1] do_one_initcall+0xe8/0x370 [ 22.991934][ T1] do_initcall_level+0xc8/0x19c [ 22.991934][ T1] do_initcalls+0x70/0xc0 [ 22.991934][ T1] do_basic_setup+0x1c/0x28 [ 22.991934][ T1] kernel_init_freeable+0xe0/0x148 [ 22.991934][ T1] kernel_init+0x20/0x1a8 [ 22.991934][ T1] ret_from_fork+0x10/0x20 [ 22.991934][ T1] Code: a9400a61 f94013e0 f9438923 f9400a64 (b85fc110) Co-developed-by: Jeson Gao <jeson.gao@unisoc.com> Signed-off-by: Jeson Gao <jeson.gao@unisoc.com> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Acked-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: <stable@vger.kernel.org> [catalin.marinas@arm.com: commit log updated by Mark Rutland] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-01-23KVM: x86: Advertise AVX10_VNNI_INT CPUID to userspaceZhao Liu3-1/+22
Define and advertise AVX10_VNNI_INT CPUID to userspace when it's supported by the host. AVX10_VNNI_INT (0x24.0x1.ECX[bit 2]) is a discrete feature bit introduced on Intel Diamond Rapids, which enumerates the support for EVEX VPDP* instructions for INT8/INT16 [*]. Since this feature has no actual kernel usages, define it as a KVM-only feature in reverse_cpuid.h. Advertise new CPUID subleaf 0x24.0x1 with AVX10_VNNI_INT bit to userspace for guest use. It's safe since no additional enabling work is needed in the host kernel. [*]: Intel Advanced Vector Extensions 10.2 Architecture Specification (rev 5.0). Tested-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://patch.msgid.link/20251120050720.931449-5-zhao1.liu@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-23KVM: x86: Advertise AVX10.2 CPUID to userspaceZhao Liu1-1/+1
Bump up the maximum supported AVX10 version and enumerate AVX10.2 to userspace when it's supported by the host. Intel AVX10 Version 2 (Intel AVX10.2) includes a suite of new instructions delivering new AI features and performance, accelerated media processing, expanded Web Assembly, and Cryptography support, along with enhancements to existing legacy instructions for completeness and efficiency, and it is enumerated as version 2 in CPUID 0x24.0x0.EBX[bits 0-7] [1]. AVX10.2 has no current kernel usage and requires no additional host kernel enabling work (based on AVX10.1 support) and provides no new VMX controls [2]. Moreover, since AVX10.2 is the superset of AVX10.1, there's no need to worry about AVX10.1 and AVX10.2 compatibility issues in KVM. Therefore, it's safe to advertise AVX10.2 version to userspace directly if host supports AVX10.2. [1]: Intel Advanced Vector Extensions 10.2 Architecture Specification (rev 5.0). [2]: Note: Since AVX10.2 spec (rev 4.0), it has been declared "AVX10/512 will be used in all Intel products, supporting vector lengths of 128, 256, and 512 in all product lines", and the VMX support (in earlier revisions) for AVX10/256 guest on AVX10/512 host has been dropped. Tested-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://patch.msgid.link/20251120050720.931449-4-zhao1.liu@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-23KVM: x86: Advertise AMX CPUIDs in subleaf 0x1E.0x1 to userspaceZhao Liu3-0/+41
Define and advertise AMX CPUIDs (0x1E.0x1) to userspace when the leaf is supported by the host. Intel Diamond Rapids adds new AMX instructions to support new formats and memory operations [*], and introduces the CPUID subleaf 0x1E.0x1 to centralize the discrete AMX feature bits within EAX. Since these AMX features have no actual kernel usages, define them as KVM-only features in reverse_cpuid.h. In addition to the new features, CPUID 0x1E.0x1.EAX[bits 0-3] are aliaseed positions of existing AMX feature bits distributed across the 0x7 leaves. To avoid duplicate feature names, name these aliases with an *_ALIAS suffix, and define them in reverse_cpuid.h as KVM-only features as well. Advertise new CPUID subleaf 0x1E.0x1 with its AMX CPUID feature bits to userspace for guest use. It's safe since no additional enabling work is needed in the host kernel. [*]: Intel Architecture Instruction Set Extensions and Future Features (rev.059). Tested-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://patch.msgid.link/20251120050720.931449-3-zhao1.liu@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-23KVM: x86: Advertise MOVRS CPUID to userspaceZhao Liu2-0/+2
Define the feature flag for MOVRS and advertise support to userspace when the feature is supported by the host. MOVRS is a new set of instructions introduced in the Intel platform Diamond Rapids, to provide load instructions that carry a read-shared hint. Functionally, MOVRS family is equivalent to existing load instructions, but its read-shared hint indicates that the source memory location is likely to become read-shared by multiple processors, i.e., read in the future by at least one other processor before it is written (assuming it is ever written in the future). This hint could optimize the behavior of the caches, especially shared caches, for this data for future reads by multiple processors. Additionally, MOVRS family also includes a software prefetch instruction, PREFETCHRST2, that carries the same read-shared hint. [*] MOVRS family is enumerated by CPUID single-bit (0x7.0x1.EAX[bit 31]). Since it's on a densely-populated CPUID leaf and some other bits on this leaf have kernel usages, define this new feature in cpufeatures.h, but hide it in /proc/cpuinfo due to lack of current kernel usage. Advertise MOVRS bit to userspace directly. It's safe, since there's no new VMX controls or additional host enabling required for guests to use this feature. [*]: Intel Architecture Instruction Set Extensions and Future Features (rev.059). Tested-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Link: https://patch.msgid.link/20251120050720.931449-2-zhao1.liu@intel.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-23KVM: SEV: Add KVM_SEV_SNP_ENABLE_REQ_CERTS commandMichael Roth2-0/+18
Introduce a new command for KVM_MEMORY_ENCRYPT_OP ioctl that can be used to enable fetching of endorsement key certificates from userspace via the new KVM_EXIT_SNP_REQ_CERTS exit type. Also introduce a new KVM_X86_SEV_SNP_REQ_CERTS KVM device attribute so that userspace can query whether the kernel supports the new command/exit. Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Tested-by: Liam Merwick <liam.merwick@oracle.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Link: https://patch.msgid.link/20260109231732.1160759-3-michael.roth@amd.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-23KVM: Introduce KVM_EXIT_SNP_REQ_CERTS for SNP certificate-fetchingMichael Roth2-6/+57
For SEV-SNP, the host can optionally provide a certificate table to the guest when it issues an attestation request to firmware (see GHCB 2.0 specification regarding "SNP Extended Guest Requests"). This certificate table can then be used to verify the endorsement key used by firmware to sign the attestation report. While it is possible for guests to obtain the certificates through other means, handling it via the host provides more flexibility in being able to keep the certificate data in sync with the endorsement key throughout host-side operations that might resulting in the endorsement key changing. In the case of KVM, userspace will be responsible for fetching the certificate table and keeping it in sync with any modifications to the endorsement key by other userspace management tools. Define a new KVM_EXIT_SNP_REQ_CERTS event where userspace is provided with the GPA of the buffer the guest has provided as part of the attestation request so that userspace can write the certificate data into it while relying on filesystem-based locking to keep the certificates up-to-date relative to the endorsement keys installed/utilized by firmware at the time the certificates are fetched. [Melody: Update the documentation scheme about how file locking is expected to happen.] Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Tested-by: Liam Merwick <liam.merwick@oracle.com> Tested-by: Dionna Glaze <dionnaglaze@google.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Melody Wang <huibo.wang@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Link: https://patch.msgid.link/20260109231732.1160759-2-michael.roth@amd.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-23KVM: x86: Drop WARN on INIT/SIPI being blocked when vCPU is in Wait-For-SIPISean Christopherson1-1/+0
Drop the sanity check in kvm_apic_accept_events() that attempts to detect KVM bugs by asserting that a vCPU isn't in Wait-For-SIPI if INIT/SIPI are blocked, because if INIT is blocked, then it should be impossible for a vCPU to get into WFS in the first place. Unfortunately, syzbot is smarter than KVM (and its maintainers), and circumvented the guards put in place by commit 0fe3e8d804fd ("KVM: x86: Move INIT_RECEIVED vs. INIT/SIPI blocked check to KVM_RUN") by swapping the order and stuffing VMXON after INIT, and then triggering kvm_apic_accept_events() by way of KVM_GET_MP_STATE. Simply drop the WARN as it hasn't detected any meaningful KVM bugs in years (if ever?), and preventing userspace from clobbering guest state is generally a non-goal. More importantly, fully closing the hole would likely require enforcing some amount of ordering in KVM's ioctls, which is a much bigger risk than simply deleting the WARN. Reported-by: syzbot+59f2c3a3fc4f6c09b8cd@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6925da1b.a70a0220.d98e3.00b0.GAE@google.com Link: https://patch.msgid.link/20260123022816.2283567-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-23arm64: errata: Workaround for SI L1 downstream coherency issueLucas Wei4-0/+61
When software issues a Cache Maintenance Operation (CMO) targeting a dirty cache line, the CPU and DSU cluster may optimize the operation by combining the CopyBack Write and CMO into a single combined CopyBack Write plus CMO transaction presented to the interconnect (MCN). For these combined transactions, the MCN splits the operation into two separate transactions, one Write and one CMO, and then propagates the write and optionally the CMO to the downstream memory system or external Point of Serialization (PoS). However, the MCN may return an early CompCMO response to the DSU cluster before the corresponding Write and CMO transactions have completed at the external PoS or downstream memory. As a result, stale data may be observed by external observers that are directly connected to the external PoS or downstream memory. This erratum affects any system topology in which the following conditions apply: - The Point of Serialization (PoS) is located downstream of the interconnect. - A downstream observer accesses memory directly, bypassing the interconnect. Conditions: This erratum occurs only when all of the following conditions are met: 1. Software executes a data cache maintenance operation, specifically, a clean or clean&invalidate by virtual address (DC CVAC or DC CIVAC), that hits on unique dirty data in the CPU or DSU cache. This results in a combined CopyBack and CMO being issued to the interconnect. 2. The interconnect splits the combined transaction into separate Write and CMO transactions and returns an early completion response to the CPU or DSU before the write has completed at the downstream memory or PoS. 3. A downstream observer accesses the affected memory address after the early completion response is issued but before the actual memory write has completed. This allows the observer to read stale data that has not yet been updated at the PoS or downstream memory. The implementation of workaround put a second loop of CMOs at the same virtual address whose operation meet erratum conditions to wait until cache data be cleaned to PoC. This way of implementation mitigates performance penalty compared to purely duplicate original CMO. Signed-off-by: Lucas Wei <lucaswei@google.com> Signed-off-by: Will Deacon <will@kernel.org>
2026-01-23KVM: arm64: Use kvm_has_mte() in pKVM trap initializationFuad Tabba1-1/+1
When initializing HCR traps in protected mode, use kvm_has_mte() to check for MTE support rather than kvm_has_feat(kvm, ID_AA64PFR1_EL1, MTE, IMP). kvm_has_mte() provides a more comprehensive check: - kvm_has_feat() only checks if MTE is in the guest's ID register view (i.e., what we advertise to the guest) - kvm_has_mte() checks both system_supports_mte() AND whether KVM_ARCH_FLAG_MTE_ENABLED is set for this VM instance Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260122112218.531948-5-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-23KVM: arm64: Inject UNDEF when accessing MTE sysregs with MTE disabledFuad Tabba1-0/+67
When MTE hardware is present but disabled via software (`arm64.nomte` or `CONFIG_ARM64_MTE=n`), the kernel clears `HCR_EL2.ATA` and sets `HCR_EL2.TID5`, to prevent the use of MTE instructions. Additionally, accesses to certain MTE system registers trap to EL2 with exception class ESR_ELx_EC_SYS64. To emulate hardware without MTE (where such accesses would cause an Undefined Instruction exception), inject UNDEF into the host. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260122112218.531948-4-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-23KVM: arm64: Trap MTE access and discovery when MTE is disabledFuad Tabba3-2/+8
If MTE is not supported by the hardware, or is disabled in the kernel configuration (`CONFIG_ARM64_MTE=n`) or command line (`arm64.nomte`), the kernel stops advertising MTE to userspace and avoids using MTE instructions. However, this is a software-level disable only. When MTE hardware is present and enabled by EL3 firmware, leaving `HCR_EL2.ATA` set allows the host to execute MTE instructions (STG, LDG, etc.) and access allocation tags in physical memory. Prevent this by clearing `HCR_EL2.ATA` when MTE is disabled. Remove it from the `HCR_HOST_NVHE_FLAGS` default, and conditionally set it in `cpu_prepare_hyp_mode()` only when `system_supports_mte()` returns true. This causes MTE instructions to trap to EL2 when `HCR_EL2.ATA` is cleared. Additionally, set `HCR_EL2.TID5` when MTE is disabled. This traps reads of `GMID_EL1` (Multiple tag transfer ID register) to EL2, preventing the discovery of MTE parameters (such as tag block size) when the feature is suppressed. Early boot code in `head.S` temporarily keeps `HCR_ATA` set to avoid special-casing initialization paths. This is safe because this code executes before untrusted code runs and will clear `HCR_ATA` if MTE is disabled. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260122112218.531948-3-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-23KVM: arm64: Remove dead code resetting HCR_EL2 for pKVMFuad Tabba1-5/+0
The pKVM lifecycle does not support tearing down the hypervisor and returning to the hyp stub once initialized. The transition to protected mode is one-way. Consequently, the code path in hyp-init.S responsible for resetting EL2 registers (triggered by kexec or hibernation) is unreachable in protected mode. Remove the dead code handling HCR_EL2 reset for ARM64_KVM_PROTECTED_MODE. No functional change intended. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://patch.msgid.link/20260122112218.531948-2-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-23Merge branch kvm-arm64/pkvm-features-6.20 into kvmarm-master/nextMarc Zyngier9-24/+108
* kvm-arm64/pkvm-features-6.20: : . : pKVM guest feature trapping fixes, courtesy of Fuad Tabba. : . KVM: arm64: Prevent host from managing timer offsets for protected VMs KVM: arm64: Check whether a VM IOCTL is allowed in pKVM KVM: arm64: Track KVM IOCTLs and their associated KVM caps KVM: arm64: Do not allow KVM_CAP_ARM_MTE for any guest in pKVM KVM: arm64: Include VM type when checking VM capabilities in pKVM KVM: arm64: Introduce helper to calculate fault IPA offset KVM: arm64: Fix MTE flag initialization for protected VMs KVM: arm64: Fix Trace Buffer trap polarity for protected VMs KVM: arm64: Fix Trace Buffer trapping for protected VMs Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-23Merge branch kvm-arm64/feat_idst into kvmarm-master/nextMarc Zyngier7-16/+76
* kvm-arm64/feat_idst: : . : Add support for FEAT_IDST, allowing ID registers that are not implemented : to be reported as a normal trap rather than as an UNDEF exception. : . KVM: arm64: selftests: Add a test for FEAT_IDST KVM: arm64: pkvm: Report optional ID register traps with a 0x18 syndrome KVM: arm64: pkvm: Add a generic synchronous exception injection primitive KVM: arm64: Force trap of GMID_EL1 when the guest doesn't have MTE KVM: arm64: Handle CSSIDR2_EL1 and SMIDR_EL1 in a generic way KVM: arm64: Handle FEAT_IDST for sysregs without specific handlers KVM: arm64: Add a generic synchronous exception injection primitive KVM: arm64: Add trap routing for GMID_EL1 arm64: Repaint ID_AA64MMFR2_EL1.IDS description Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-23Merge branch kvm-arm64/vtcr into kvmarm-master/nextMarc Zyngier19-140/+257
* kvm-arm64/vtcr: : . : VTCR_EL2 conversion to the configuration-driven RESx framework, : fixing a couple of UXN/PXN/XN bugs in the process. : . KVM: arm64: nv: Return correct RES0 bits for FGT registers KVM: arm64: Always populate FGT masks at boot time KVM: arm64: Honor UX/PX attributes for EL2 S1 mappings KVM: arm64: Convert VTCR_EL2 to config-driven sanitisation KVM: arm64: Account for RES1 bits in DECLARE_FEAT_MAP() and co arm64: Convert VTCR_EL2 to sysreg infratructure arm64: Convert ID_AA64MMFR0_EL1.TGRAN{4,16,64}_2 to UnsignedEnum KVM: arm64: Invert KVM_PGTABLE_WALK_HANDLE_FAULT to fix pKVM walkers KVM: arm64: Don't blindly set set PSTATE.PAN on guest exit KVM: arm64: nv: Respect stage-2 write permssion when setting stage-1 AF KVM: arm64: Remove unused vcpu_{clear,set}_wfx_traps() KVM: arm64: Remove unused parameter in synchronize_vcpu_pstate() KVM: arm64: Remove extra argument for __pvkm_host_{share,unshare}_hyp() KVM: arm64: Inject UNDEF for a register trap without accessor KVM: arm64: Copy FGT traps to unprotected pKVM VCPU on VCPU load KVM: arm64: Fix EL2 S1 XN handling for hVHE setups KVM: arm64: gic: Check for vGICv3 when clearing TWI Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-23Merge branch arm64/for-next/cpufeature into kvmarm-master/nextMarc Zyngier21-97/+136
Merge arm64/for-next/cpufeature in to resolve conflicts resulting from the removal of CONFIG_PAN. * arm64/for-next/cpufeature: arm64: Add support for FEAT_{LS64, LS64_V} KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V} usage at EL0/1 KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory KVM: arm64: Add documentation for KVM_EXIT_ARM_LDST64B KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots arm64: Unconditionally enable PAN support arm64: Unconditionally enable LSE support arm64: Add support for TSV110 Spectre-BHB mitigation Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-23arm64: dts: a7k: add COM Express boardsElad Nachman6-2/+220
Add support for Armada 7020 Express Type 7 CPU module board by Marvell. Define this COM Express CPU module as dtsi and provide a dtsi file for the carrier board (Marvell DB-98CX85x0 COM Express type 7 carrier board). Since memory is soldered on CPU module, memory node is on CPU module dtsi file. This Carrier board only utilizes the PCIe link, hence no special device or driver support is provided by this dtsi file. Devise a dts file for the combined com express carrier and CPU module. The Aramda 7020 CPU COM Express board offers the following features: 1. Armada 7020 CPU, with dual ARM A72 cores 2. DDR4 memory, 8GB, on board soldered 3. 1Gbit Out of Band Ethernet via RGMII to PHY and RJ45 connector, all are present on A7K CPU module (none on the carrier) 4. Optional 10G KR Ethernet going via the COM Express type 7 connector 5. On-board 8 Gbit, 8-bit bus width NAND flash 6. On-board 512 Mbit SPI flash 7. PCIe Root Complex, 4 lanes PCIe gen3 connectivity, going via the COM Express type 7 connector 8. m.2 SATA connector 9. Micro-SD card connector 10. USB 2.0 via COM Express type 7 connector 11. Two i2c interfaces - one to the CPU module, and one to the carrier board via the COM Express type 7 connector 12. UART (mini USB connector by virtue of FT2232D UART to USB converter, connected to the Armada 7020 UART0) gc: 10gbase-kr is legacy, use "10gbase-r" instead in cp0_eth0 node Signed-off-by: Elad Nachman <enachman@marvell.com> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2026-01-23ARM: dts: microchip: Drop usb_a9g20-dab-mmx.dtsiRob Herring (Arm)1-93/+0
This .dtsi file is not included anywhere in the tree and can't be tested. Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20260122202345.3387936-2-robh@kernel.org Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
2026-01-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski35-128/+173
Cross-merge networking fixes after downstream PR (net-6.19-rc7). Conflicts: drivers/net/ethernet/huawei/hinic3/hinic3_irq.c b35a6fd37a00 ("hinic3: Add adaptive IRQ coalescing with DIM") fb2bb2a1ebf7 ("hinic3: Fix netif_queue_set_napi queue_index input parameter error") https://lore.kernel.org/fc0a7fdf08789a52653e8ad05281a0a849e79206.1768915707.git.zhuyikai1@h-partners.com drivers/net/wireless/ath/ath12k/mac.c drivers/net/wireless/ath/ath12k/wifi7/hw.c 31707572108d ("wifi: ath12k: Fix wrong P2P device link id issue") c26f294fef2a ("wifi: ath12k: Move ieee80211_ops callback to the arch specific module") https://lore.kernel.org/20260114123751.6a208818@canb.auug.org.au Adjacent changes: drivers/net/wireless/ath/ath12k/mac.c 8b8d6ee53dfd ("wifi: ath12k: Fix scan state stuck in ABORTING after cancel_remain_on_channel") 914c890d3b90 ("wifi: ath12k: Add framework for hardware specific ieee80211_ops registration") Signed-off-by: Jakub Kicinski <kuba@kernel.org>