kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message	Author	Files	Lines
2025-11-25	KVM: arm64: Use MI to detect groups being enabled/disabled	Marc Zyngier	2	-0/+10
	Add the maintenance interrupt to force an exit when the guest enables/disables individual groups, so that we can re-sort the ap_list accordingly. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-27-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: Move undeliverable interrupts to the end of ap_list	Marc Zyngier	1	-1/+22
	Interrupts in the ap_list that cannot be acted upon, because they are not enabled or because their group is not enabled, shouldn't make it into the LRs if we are space-constrained. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-26-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: Invert ap_list sorting to push active interrupts out	Marc Zyngier	1	-15/+12
	Having established that pending interrupts should have priority to be moved into the LRs over the active interrupts, implement this in the ap_list sorting. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-25-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
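The ordering described by this commit and by the "Move undeliverable interrupts to the end of ap_list" commit above can be sketched as a comparator. This is a minimal userspace sketch, not the kernel's code: `struct sketch_irq`, its fields, the three-way ranking, and the tie-break on priority (lower value first) are all hypothetical simplifications.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct vgic_irq. */
struct sketch_irq {
	bool pending;
	bool active;
	bool enabled;
	bool group_enabled;
	unsigned char priority;	/* lower value = higher priority (assumption) */
};

/* An interrupt can be acted upon only if it and its group are enabled. */
static bool sketch_deliverable(const struct sketch_irq *irq)
{
	return irq->pending && irq->enabled && irq->group_enabled;
}

/*
 * Rank used for sorting (assumed ordering):
 *   0 = pending and deliverable, should reach the LRs first
 *   1 = active, may be pushed out of the LRs
 *   2 = everything else, i.e. undeliverable, goes to the end
 */
static int sketch_rank(const struct sketch_irq *irq)
{
	if (sketch_deliverable(irq))
		return 0;
	if (irq->active)
		return 1;
	return 2;
}

/* qsort()-style comparator: rank first, then priority. */
static int sketch_irq_cmp(const void *a, const void *b)
{
	const struct sketch_irq *ia = a, *ib = b;
	int ra = sketch_rank(ia), rb = sketch_rank(ib);

	if (ra != rb)
		return ra - rb;
	return (int)ia->priority - (int)ib->priority;
}
```

Sorting an array with `qsort(list, n, sizeof(list[0]), sketch_irq_cmp)` places deliverable pending interrupts at the front, so they are the ones that fit when the number of LRs is limited.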
2025-11-25	KVM: arm64: Make vgic_target_oracle() globally available	Marc Zyngier	2	-1/+2
	Make the internal crystal ball global, so that implementation-specific code can use it. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-24-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: Turn kvm_vgic_vcpu_enable() into kvm_vgic_vcpu_reset()	Marc Zyngier	4	-14/+8
	Now that we always reconfigure the vgic HCR register on entry, the "enable" part of kvm_vgic_vcpu_enable() is pretty useless. Removing the enable bits from these functions makes it plain that they are just about computing the reset state. Just rename the functions accordingly. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-23-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: Revamp vgic maintenance interrupt configuration	Marc Zyngier	4	-65/+73
	We currently don't use the maintenance interrupt very much, apart from EOI on level interrupts, and for LR underflow in limited cases. However, as we are moving toward a setup where active interrupts can live outside of the LRs, we need to use the MIs in a more diverse set of cases. Add a new helper that produces a digest of the ap_list, and use that summary to set the various control bits as required. This slightly changes the way v2 SGIs are handled, as they used to count for more than one interrupt, but not anymore. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-22-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: Eagerly save VMCR on exit	Marc Zyngier	8	-22/+13
	We currently save/restore the VMCR register in a pretty lazy way (on load/put, consistently with what we do with the APRs). However, we are going to need the group-enable bits that are backed by VMCR on each entry (so that we can avoid injecting interrupts for disabled groups). Move the synchronisation from put to sync, which results in some minor churn in the nVHE hypercalls to simplify things. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-21-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: Compute vgic state irrespective of the number of interrupts	Marc Zyngier	1	-33/+2
	As we are going to rely on the [G]ICH_HCR{,_EL2} register to be programmed with MI information at all times, slightly de-optimise the flush/sync code to always be called. This is rather lightweight when no interrupts are in flight. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-20-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: GICv2: Extract LR computing primitive	Marc Zyngier	1	-22/+39
	Split vgic_v2_populate_lr() into two helpers, so that we have another primitive that computes the LR from a vgic_irq, but doesn't update anything in the shadow structure. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-19-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: GICv2: Extract LR folding primitive	Marc Zyngier	1	-35/+32
	As we are going to need to handle deactivation for interrupts that are not in the LRs, split vgic_v2_fold_lr_state() into a helper that deals with a single interrupt, and the function that loops over the used LRs. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-18-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: GICv2: Decouple GICH_HCR programming from LRs being loaded	Marc Zyngier	1	-14/+14
	Not programming GICH_HCR while no LRs are populated is a bit of an issue, as we otherwise don't see any maintenance interrupt when the guest interacts with the LRs. Decouple the two and always program the control register, even when we don't have to touch the LRs. This is very similar to what we are already doing for GICv3. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-17-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: GICv2: Preserve EOIcount on exit	Marc Zyngier	1	-0/+6
	EOIcount is how the virtual CPU interface signals that the guest is deactivating interrupts outside of the LRs when EOImode==0. We therefore need to preserve that information so that we can find out what actually needs deactivating, just like we already do on GICv3. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-16-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: GICv3: Extract LR computing primitive	Marc Zyngier	1	-17/+32
	Split vgic_v3_populate_lr() into two, so that we have another primitive that computes the LR from a vgic_irq, but doesn't update anything in the shadow structure. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-15-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: GICv3: Extract LR folding primitive	Marc Zyngier	1	-45/+43
	As we are going to need to handle deactivation for interrupts that are not in the LRs, split vgic_v3_fold_lr_state() into a helper that deals with a single interrupt, and the function that loops over the used LRs. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-14-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: GICv3: Decouple ICH_HCR_EL2 programming from LRs	Marc Zyngier	1	-14/+12
	Not programming ICH_HCR_EL2 while no LRs are populated is a bit of an issue, as we otherwise don't see any maintenance interrupt when the guest interacts with the LRs. Decouple the two and always program the control register, even when we don't have to touch the LRs. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-13-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: GICv3: Preserve EOIcount on exit	Marc Zyngier	1	-0/+6
	EOIcount is how the virtual CPU interface signals that the guest is deactivating interrupts outside of the LRs when EOImode==0. We therefore need to preserve that information so that we can find out what actually needs deactivating. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-12-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: GICv3: Drop LPI active state when folding LRs	Marc Zyngier	1	-1/+3
	Despite LPIs not having an active state, virtual LPIs do have one, which gets cleared on EOI. So far, so good. However, this leads to a small problem: when an active LPI is not in the LRs, EOImode==0, and the guest EOIs it, EOIcount doesn't get bumped up. This means that in these conditions, the LPI would stay active forever. Clearly, we can't have that. So if we spot an active LPI, we drop that state. It's pretty pointless anyway, and only serves as a way to trip SW over. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-11-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: Add LR overflow handling documentation	Marc Zyngier	1	-1/+80
	Add a bit of documentation describing how we are dealing with LR overflow. This is mostly a braindump of how things are expected to work. For now anyway. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-10-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: Add tracking of vgic_irq being present in a LR	Marc Zyngier	2	-0/+12
	We currently cannot identify whether an interrupt is queued into a LR. It wasn't needed until now, but that's about to change. Add yet another flag to track that state. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-9-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: Repack struct vgic_irq fields	Marc Zyngier	1	-1/+4
	struct vgic_irq has grown over the years, in a rather bad way. Repack it using bitfields for the individual flags, and move things around a bit so that it is a bit smaller. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-8-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
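To illustrate the kind of repacking described here, the sketch below contrasts one byte-sized `bool` per flag with 1-bit bitfields. The struct names and the choice of fields are hypothetical and do not reproduce the real `struct vgic_irq`; exact sizes depend on the ABI.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "before" layout: every flag occupies a full bool. */
struct sketch_irq_loose {
	uint32_t intid;
	bool enabled;
	bool active;
	bool hw;
	bool line_level;
	bool pending_latch;
	bool in_lr;
	uint8_t priority;
};

/* Hypothetical "after" layout: the same flags share a single byte. */
struct sketch_irq_packed {
	uint32_t intid;
	uint8_t priority;
	bool enabled:1;
	bool active:1;
	bool hw:1;
	bool line_level:1;
	bool pending_latch:1;
	bool in_lr:1;
};
```

On common ABIs the packed variant is smaller. The usual trade-off applies: a bitfield cannot have its address taken, and updating one flag is a read-modify-write of the shared storage, so all flags in the same unit must be protected by the same lock.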
2025-11-25	KVM: arm64: GICv3: Detect and work around the lack of ICV_DIR_EL1 trapping	Marc Zyngier	5	-1/+67
	A long time ago, an unsuspecting architect forgot to add a trap bit for ICV_DIR_EL1 in ICH_HCR_EL2. Which was unfortunate, but what's a bit of spec between friends? Thankfully, this was fixed in a later revision, and ARM "deprecates" the lack of trapping ability. Unfortunately, a few (billion) CPUs went out with that defect, anything ARMv8.0 from ARM, give or take. And on these CPUs, you can't trap DIR on its own, full stop. As the next best thing, we can trap everything in the common group, which is a tad expensive, but hey ho, that's what you get. You can otherwise recycle the HW in the nearby bin. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-7-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: vgic-v3: Fix GICv3 trapping in protected mode	Marc Zyngier	2	-0/+8
	As we are about to start trapping a bunch of extra things, augment the pKVM trap description with all the registers trapped by ICH_HCR_EL2.TC, making them legal instead of resulting in a UNDEF injection in the guest. While we're at it, ensure that pKVM captures the vgic model so that it can be checked by the emulation code. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-6-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: Turn vgic-v3 errata traps into a patched-in constant	Marc Zyngier	5	-49/+84
	The trap bits are currently only set to manage CPU errata. However, we are about to make use of them for purposes beyond beating broken CPUs into submission. For this purpose, turn these errata-driven bits into a patched-in constant that is merged with the KVM-driven value at the point of programming the ICH_HCR_EL2 register, rather than being directly stored with the shadow value. This allows the KVM code to distinguish between a trap being handled for the purpose of an erratum workaround, or for KVM's own need. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-5-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	irqchip/gic: Expose CPU interface VA to KVM	Marc Zyngier	1	-0/+1
	Future changes will require KVM to be able to perform deactivations by writing to the physical CPU interface. Add the corresponding VA to the kvm_info structure, and let KVM stash it. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-3-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: nv: Forward FEAT_XNX permissions to the shadow stage-2	Oliver Upton	3	-8/+57
	Add support for FEAT_XNX to shadow stage-2 MMUs, being careful to only evaluate XN[0] when the feature is actually exposed to the VM. Restructure the layering of permissions in the fault handler to assume pX and uX then restricting based on the guest's stage-2 afterwards. Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://msgid.link/20251124190158.177318-4-oupton@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	KVM: arm64: Add support for FEAT_XNX stage-2 permissions	Oliver Upton	2	-15/+60
	FEAT_XNX adds support for encoding separate execute permissions for EL0 and EL1 at stage-2. Add support for this to the page table library, hiding the unintuitive encoding scheme behind generic pX and uX permission flags. Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://msgid.link/20251124190158.177318-3-oupton@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
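The "unintuitive encoding scheme" mentioned here can be sketched as a pair of helpers that map generic pX/uX flags to and from a 2-bit XN field. The enum and function names are hypothetical, and the values follow the stage-2 XN[1:0] encoding as it is commonly documented for FEAT_XNX; treat them as an assumption to verify against the Arm Architecture Reference Manual.

```c
#include <stdbool.h>

/* Assumed stage-2 XN[1:0] values with FEAT_XNX (hypothetical names). */
enum sketch_s2_xn {
	SKETCH_S2_XN_NONE = 0x0,	/* executable at EL1 and EL0 */
	SKETCH_S2_XN_PXN  = 0x1,	/* executable at EL0 only */
	SKETCH_S2_XN_ALL  = 0x2,	/* not executable at all */
	SKETCH_S2_XN_UXN  = 0x3,	/* executable at EL1 only */
};

/* Map generic privileged/unprivileged execute flags to the XN field. */
static enum sketch_s2_xn sketch_s2_xn_encode(bool px, bool ux)
{
	if (px && ux)
		return SKETCH_S2_XN_NONE;
	if (ux)
		return SKETCH_S2_XN_PXN;
	if (px)
		return SKETCH_S2_XN_UXN;
	return SKETCH_S2_XN_ALL;
}

/* Decode: is execution permitted at EL1 (pX)? */
static bool sketch_s2_xn_allows_px(enum sketch_s2_xn xn)
{
	return xn == SKETCH_S2_XN_NONE || xn == SKETCH_S2_XN_UXN;
}

/* Decode: is execution permitted at EL0 (uX)? */
static bool sketch_s2_xn_allows_ux(enum sketch_s2_xn xn)
{
	return xn == SKETCH_S2_XN_NONE || xn == SKETCH_S2_XN_PXN;
}
```

Note that the field is not a simple pair of independent "deny" bits: bit 1 alone means "no execution anywhere", which is why callers are better served by pX/uX flags than by the raw value.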
2025-11-25	arm64: Detect FEAT_XNX	Oliver Upton	2	-0/+8
	Detect the feature in anticipation of using it in KVM. Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://msgid.link/20251124190158.177318-2-oupton@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-25	cpumask: Don't use "proxy" headers	Andy Shevchenko	1	-0/+2
	Update header inclusions to follow the IWYU (Include What You Use) principle. Note that including kernel.h is discouraged, as stated at the top of that file. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
2025-11-24	x86_64/bug: Inline the UD1	Peter Zijlstra	3	-3/+19
	(Ab)use the static_call infrastructure to convert all: call __WARN_trap instances into the desired: ud1 (%edx), %rdi eliminating the CALL/RET, but more importantly, fixing the fact that all WARNs will have: RIP: 0010:__WARN_trap+0 Basically, by making it a static_call trampoline call, objtool will collect the callsites, and then the inline rewrite will hit the special case and replace the code with the magic instruction. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251110115758.456717741@infradead.org
2025-11-24	x86/bug: Implement WARN_ONCE()	Peter Zijlstra	1	-0/+9
	Implement WARN_ONCE like WARN using BUGFLAG_ONCE. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251110115758.339309119@infradead.org
2025-11-24	x86_64/bug: Implement __WARN_printf()	Peter Zijlstra	3	-15/+170
	The basic idea is to have __WARN_printf() be a vararg function such that the compiler can do the optimal calling convention for us. This function body will be a #UD and then set up a va_list in the exception from pt_regs. But because the trap will be in a called function, the bug_entry must be passed in. Have that be the first argument, with the format tucked away inside the bug_entry. The comments should clarify the real fun details.
	The big downside is that all WARNs will now show:
	    RIP: 0010:__WARN_trap:+0
	One possible solution is to simply discard the top frame when unwinding. A follow up patch takes care of this slightly differently by abusing the x86 static_call implementation.
	This changes (with the next patches):
	    WARN_ONCE(preempt_count() != 2*PREEMPT_DISABLE_OFFSET,
	              "corrupted preempt_count: %s/%d/0x%x\n",
	from:
	            cmpl $2, %ecx #, _7
	            jne .L1472
	            ...
	    .L1472:
	            cmpb $0, __already_done.11(%rip)
	            je .L1513
	            ...
	    .L1513
	            movb $1, __already_done.11(%rip)
	            movl 1424(%r14), %edx # _15->pid, _15->pid
	            leaq 1912(%r14), %rsi #, _17
	            movq $.LC43, %rdi #,
	            call __warn_printk #
	            ud2
	            .pushsection __bug_table,"aw"
	    2:
	            .long 1b - . # bug_entry::bug_addr
	            .long .LC1 - . # bug_entry::file
	            .word 5093 # bug_entry::line
	            .word 2313 # bug_entry::flags
	            .org 2b + 12
	            .popsection
	            .pushsection .discard.annotate_insn,"M", @progbits, 8
	            .long 1b - .
	            .long 8 # ANNOTYPE_REACHABLE
	            .popsection
	into:
	            cmpl $2, %ecx #, _7
	            jne .L1442 #,
	            ...
	    .L1442:
	            lea (2f)(%rip), %rdi
	    1:
	            .pushsection __bug_table,"aw"
	    2:
	            .long 1b - . # bug_entry::bug_addr
	            .long .LC43 - . # bug_entry::format
	            .long .LC1 - . # bug_entry::file
	            .word 5093 # bug_entry::line
	            .word 2323 # bug_entry::flags
	            .org 2b + 16
	            .popsection
	            movl 1424(%r14), %edx # _19->pid, _19->pid
	            leaq 1912(%r14), %rsi #, _13
	            ud1 (%edx), %rdi
	Notably, by pushing everything into the exception handler it can take care of the ONCE thing. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251110115758.213813530@infradead.org
2025-11-24	x86/bug: Use BUG_FORMAT for DEBUG_BUGVERBOSE_DETAILED	Peter Zijlstra	1	-2/+8
	Since we have an explicit format string, use it for the condition string instead of frobbing it in the file string. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251110115758.097401406@infradead.org
2025-11-24	x86/bug: Add BUG_FORMAT basics	Peter Zijlstra	1	-10/+21
	Opt-in to BUG_FORMAT for x86_64, adjust the BUGTABLE helper and for now, just store NULL pointers. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251110115757.980264454@infradead.org
2025-11-24	bpf: specify the old and new poke_type for bpf_arch_text_poke	Menglong Dong	6	-36/+50
	In the original logic, bpf_arch_text_poke() assumes that the old and new instructions have the same opcode. However, they can have different opcodes if we want to replace a "call" insn with a "jmp" insn. Therefore, add the new function parameter "old_t" along with "new_t", which indicate the old and new poke types. Meanwhile, adjust the implementation of bpf_arch_text_poke() for all the archs. "BPF_MOD_NOP" is added to make the code more readable. In bpf_arch_text_poke(), we still check whether the new and old addresses are NULL to determine if a nop insn should be used, which I think is safer. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20251118123639.688444-6-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-24	bpf,x86: adjust the "jmp" mode for bpf trampoline	Menglong Dong	1	-5/+11
	In the origin call case, if BPF_TRAMP_F_SKIP_FRAME is not set, it means that the trampoline is not entered with a "call", but with a "jmp". Introduce the function bpf_trampoline_use_jmp() to check if the trampoline is in "jmp" mode. Make some adjustments for the "jmp" mode on x86_64. The main adjustment is for the stack parameter passing case, as the stack alignment logic changes in the "jmp" mode without the "rip". What's more, the location of the parameters on the stack also changes. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Link: https://lore.kernel.org/r/20251118123639.688444-5-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-24	bpf: fix the usage of BPF_TRAMP_F_SKIP_FRAME	Menglong Dong	2	-2/+2
	Some places calculate the origin_call by checking if BPF_TRAMP_F_SKIP_FRAME is set. However, they should use BPF_TRAMP_F_ORIG_STACK for this purpose. Just fix them. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20251118123639.688444-4-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-24	x86/ftrace: Implement DYNAMIC_FTRACE_WITH_JMP	Menglong Dong	3	-2/+18
	Implement the DYNAMIC_FTRACE_WITH_JMP for x86_64. In ftrace_call_replace, we will use JMP32_INSN_OPCODE instead of CALL_INSN_OPCODE if the address should use "jmp". Meanwhile, adjust the direct call in the ftrace_regs_caller. The RSB is balanced in the "jmp" mode. Take the function "foo" for example:
	    original_caller:
	    call foo -> foo:
	            call fentry -> fentry:
	                    [do ftrace callbacks ]
	                    move tramp_addr to stack
	                    RET -> tramp_addr
	                    tramp_addr:
	                    [..]
	                    call foo_body -> foo_body:
	                            [..]
	                            RET -> back to tramp_addr
	                    [..]
	                    RET -> back to original_caller
	Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20251118123639.688444-3-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-24	Merge tag 'icc-6.19-rc1' of ↵	Greg Kroah-Hartman	1	-0/+3
	ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-next
	Georgi writes: interconnect changes for 6.19
	This pull request contains the interconnect changes for the 6.19-rc1 merge window. The core and driver changes are listed below.
	Core changes:
	- kbps_to_icc() macro optimization
	Driver changes:
	- Switch all Qualcomm RPMh interconnect drivers to use the dynamic node IDs and drop support for non-dynamic ID allocation
	- Add new driver and BWMON support for the Kaanapali SoC
	- Add QoS support for the SM6350 SoC
	- Add QoS support for the SA8775p SoC
	- Fix missing link from SNOC_PNOC to the USB 2 on MSM8996 SoC that includes also a dts change that has been acked by the maintainer
	- Drop the QPIC interconnect and BCM nodes for the SDX75 SoC, as these should be handled by the rpmh-clk driver
	- Other misc fixes
	Signed-off-by: Georgi Djakov <djakov@kernel.org>
	* tag 'icc-6.19-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/djakov/icc: (40 commits)
	    interconnect: qcom: sm6350: enable QoS configuration
	    interconnect: qcom: sm6350: Remove empty BCM arrays
	    interconnect: qcom: icc-rpmh: Get parent's regmap for nested NoCs
	    dt-bindings: interconnect: qcom,sm6350-rpmh: Add clocks for QoS
	    dt-bindings: interconnect: qcom-bwmon: Document Kaanapali BWMONs
	    interconnect: qcom: icc-rpmh: drop support for non-dynamic IDS
	    interconnect: qcom: sm8750: convert to dynamic IDs
	    interconnect: qcom: sm8650: convert to dynamic IDs
	    interconnect: qcom: sm8550: convert to dynamic IDs
	    interconnect: qcom: sm8450: convert to dynamic IDs
	    interconnect: qcom: sm8350: convert to dynamic IDs
	    interconnect: qcom: sm8150: convert to dynamic IDs
	    interconnect: qcom: sm7150: convert to dynamic IDs
	    interconnect: qcom: sm6350: convert to dynamic IDs
	    interconnect: qcom: sdx75: convert to dynamic IDs
	    interconnect: qcom: sdx65: convert to dynamic IDs
	    interconnect: qcom: sdx55: convert to dynamic IDs
	    interconnect: qcom: sdm670: convert to dynamic IDs
	    interconnect: qcom: sc7180: convert to dynamic IDs
	    interconnect: qcom: sar2130p: convert to dynamic IDs
	    ...
2025-11-24	arm64: proton-pack: Fix hard lockup when !MITIGATE_SPECTRE_BRANCH_HISTORY	Jonathan Marek	1	-0/+2
	The "drop print" commit removed the whole branch and not just the print. For some ARM64 cpus, this leads to hard lockup when CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY is not enabled. Fixes: 62e72463ca71 ("arm64: proton-pack: Drop print when !CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Will Deacon <will@kernel.org>
2025-11-24	Revert "arm64: acpi: Enable ACPI CCEL support"	Will Deacon	1	-10/+0
	This reverts commit d02c2e45b1e7767b177f6854026e4ad0d70b4a4d. Mauro reports that this breaks APEI notifications on his QEMU setup because the "reserved for firmware" region still needs to be writable by Linux in order to signal _back_ to the firmware after processing the reported error:
	  | {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
	  | ...
	  | [Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
	  | Unable to handle kernel write to read-only memory at virtual address ffff800080035018
	  | Mem abort info:
	  | ESR = 0x000000009600004f
	  | EC = 0x25: DABT (current EL), IL = 32 bits
	  | SET = 0, FnV = 0
	  | EA = 0, S1PTW = 0
	  | FSC = 0x0f: level 3 permission fault
	  | Data abort info:
	  | ISV = 0, ISS = 0x0000004f, ISS2 = 0x00000000
	  | CM = 0, WnR = 1, TnD = 0, TagAccess = 0
	  | GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
	  | swapper pgtable: 4k pages, 52-bit VAs, pgdp=00000000505d7000
	  | pgd=10000000510bc003, p4d=1000000100229403, pud=100000010022a403, pmd=100000010022b403, pte=0060000139b90483
	  | Internal error: Oops: 000000009600004f [#1] SMP
	For now, revert the offending commit. We can presumably switch back to PAGE_KERNEL when bringing this back in the future. Link: https://lore.kernel.org/r/20251121224611.07efa95a@foz.lan Reported-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2025-11-24	um: drivers: virtio: use string choices helper	Kuninori Morimoto	1	-2/+2
	Remove hard-coded strings by using the string helper functions Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://patch.msgid.link/87h5uywtwp.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-24	s390: Add stackprotector support	Heiko Carstens	18	-4/+269
	Stackprotector support was previously unavailable on s390 because by default compilers generate code which is not suitable for the kernel: the canary value is accessed via thread local storage, where the address of thread local storage is within access registers 0 and 1. Using those registers also for the kernel would come with a significant performance impact and more complicated kernel entry/exit code, since access registers contents would have to be exchanged on every kernel entry and exit. With the upcoming gcc 16 release new compiler options will become available which allow to generate code suitable for the kernel. [1] Compiler option -mstack-protector-guard=global instructs gcc to generate stackprotector code that refers to a global stackprotector canary value via symbol __stack_chk_guard. Access to this value is guaranteed to occur via larl and lgrl instructions. Furthermore, compiler option -mstack-protector-guard-record generates a section containing all code addresses that reference the canary value. To allow for per task canary values the instructions which load the address of __stack_chk_guard are patched so they access a lowcore field instead: a per task canary value is available within the task_struct of each task, and is written to the per-cpu lowcore location on each context switch. Also add sanity checks and debugging option to be consistent with other kernel code patching mechanisms. Full debugging output can be enabled with the following kernel command line options: debug_stackprotector bootdebug ignore_loglevel earlyprintk dyndbg="file stackprotector.c +p" Example debug output: stackprot: 0000021e402d4eda: c010005a9ae3 -> c01f00070240 where "<insn address>: <old insn> -> <new insn>". [1] gcc commit 0cd1f03939d5 ("s390: Support global stack protector") Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-11-24	s390/modules: Simplify module_finalize() slightly	Heiko Carstens	1	-7/+5
	Preinitialize the return value, and break out the for loop in module_finalize() in case of an error to get rid of an ifdef. This makes it easier to add additional code, which may also depend on config options. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-11-24	s390: Remove KMSG_COMPONENT macro	Heiko Carstens	40	-84/+44
	The KMSG_COMPONENT macro is a leftover of the s390 specific "kernel message catalog" which never made it upstream. Remove the macro in order to get rid of a pointless indirection. Replace all users with the string it defines. In almost all cases this leads to a simple replacement like this:
	    - #define KMSG_COMPONENT "appldata"
	    - #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
	    + #define pr_fmt(fmt) "appldata: " fmt
	Except for some special cases this is just mechanical/scripted work. Acked-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-11-24	s390/percpu: Get rid of ARCH_MODULE_NEEDS_WEAK_PER_CPU	Heiko Carstens	2	-9/+0
	Since the rework of the kernel virtual address space [1] the module area and the kernel image are within the same 4GB area. Therefore there is no need for the weak per cpu workaround for modules anymore. Remove it. [1] commit c98d2ecae08f ("s390/mm: Uncouple physical vs virtual address spaces") Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-11-24	m68k: defconfig: Update defconfigs for v6.18-rc1	Geert Uytterhoeven	12	-24/+12
	- Drop CONFIG_SCTP_COOKIE_HMAC_SHA1=y (removed in commit 2f3dd6ec901f29ae ("sctp: Convert cookie authentication to use HMAC-SHA256")),
	- Drop CONFIG_BATMAN_ADV_NC=y (removed in commit 87b95082db32ae1c ("batman-adv: remove network coding support")),
	- Enable modular build of the SHA-1 secure hash algorithm (no longer auto-enabled since commit 2f3dd6ec901f29ae ("sctp: Convert cookie authentication to use HMAC-SHA256")).
	Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://patch.msgid.link/65e00bcb7b2980278bb087986ee405627aa32d8b.1760360254.git.geert@linux-m68k.org
2025-11-24	crypto: aesni - ctr_crypt() use min() instead of min_t()	David Laight	1	-2/+1
	min_t(unsigned int, a, b) casts an 'unsigned long' to 'unsigned int'. Use min(a, b) instead as it promotes any 'unsigned int' to 'unsigned long' and so cannot discard significant bits. In this case the 'unsigned long' value is small enough that the result is ok. Detected by an extra check added to min_t(). Signed-off-by: David Laight <david.laight.linux@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
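The difference between the two approaches can be shown with plain functions. This is a userspace sketch: `sketch_min_truncating()` and `sketch_min_promoting()` are hypothetical stand-ins that mimic the conversions described above; they are not the kernel's `min_t()` and `min()` macros.

```c
#include <limits.h>

/*
 * Mimics min_t(unsigned int, a, b): 'a' is cast down to unsigned int
 * before the comparison, so any high bits are silently discarded.
 */
static unsigned long sketch_min_truncating(unsigned long a, unsigned int b)
{
	unsigned int ta = (unsigned int)a;

	return ta < b ? ta : b;
}

/*
 * Mimics min(a, b) with promotion: 'b' is widened to unsigned long,
 * so no significant bits of 'a' can be lost.
 */
static unsigned long sketch_min_promoting(unsigned long a, unsigned int b)
{
	unsigned long pb = b;

	return a < pb ? a : pb;
}
```

Where `unsigned long` is wider than `unsigned int`, a value such as `(unsigned long)UINT_MAX + 2` truncates to 1, so the truncating variant returns 1 even though the true minimum against 16 is 16. For small values, as in the commit above, both variants agree.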
2025-11-24	RISC-V: KVM: Flush VS-stage TLB after VCPU migration for Andes cores	Hui Min Mina Chou	7	-25/+49
	Most implementations cache the combined result of two-stage translation, but some, like Andes cores, use split TLBs that store VS-stage and G-stage entries separately. On such systems, when a VCPU migrates to another CPU, an additional HFENCE.VVMA is required to avoid using stale VS-stage entries, which could otherwise cause guest faults. Introduce a static key to identify CPUs with split two-stage TLBs. When enabled, KVM issues an extra HFENCE.VVMA on VCPU migration to prevent stale VS-stage mappings. Signed-off-by: Hui Min Mina Chou <minachou@andestech.com> Signed-off-by: Ben Zong-You Xie <ben717@andestech.com> Reviewed-by: Radim Krčmář <rkrcmar@ventanamicro.com> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Link: https://lore.kernel.org/r/20251117084555.157642-1-minachou@andestech.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-11-24	RISC-V: KVM: Fix guest page fault within HLV* instructions	Fangyu Yu	1	-0/+22
	When executing HLV* instructions in HS mode, a guest page fault may occur if a g-stage page table migration happens between triggering the virtual instruction exception and executing the HLV* instruction. This may be a corner case, and one simpler way to handle this is to re-execute the instruction where the virtual instruction exception occurred, and the guest page fault will be automatically handled. Fixes: b91f0e4cb8a3 ("RISC-V: KVM: Factor-out instruction emulation into separate sources") Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20251121133543.46822-1-fangyu.yu@linux.alibaba.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-11-24	KVM: riscv: Support enabling dirty log gradually in small chunks	Dong Yang	2	-1/+7
	Support for enabling dirty log gradually in small chunks already exists for x86 in commit 3c9bd4006bfc ("KVM: x86: enable dirty log gradually in small chunks") and for arm64 in c862626 ("KVM: arm64: Support enabling dirty log gradually in small chunks"). This adds support for riscv. x86 and arm64 now write-protect both huge pages and normal pages, so riscv also write-protects both huge pages and normal pages. On a nested virtualization setup (RISC-V KVM running inside a QEMU VM on an [Intel® Core™ i5-12500H] host), I did some tests with a 2G Linux VM using different backing page sizes. The time taken for memory_global_dirty_log_start in the L2 QEMU is listed below:
	    Page Size    Before       After Optimization
	    4K           4490.23ms    31.94ms
	    2M           48.97ms      45.46ms
	    1G           28.40ms      30.93ms
	Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn> Signed-off-by: Dong Yang <dayss1224@gmail.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20251103062825.9084-1-dayss1224@gmail.com Signed-off-by: Anup Patel <anup@brainfault.org>