starfive-tech/linux.git - StarFive Tech Linux Kernel for VisionFive (JH7110) boards (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2023-12-19	x86/alternatives: Move apply_relocation() out of init section	Arnd Bergmann	1	-4/+4
	This function is now called from a few places that are no __init_or_module, resulting a link time warning: WARNING: modpost: vmlinux: section mismatch in reference: patch_dest+0x8a (section: .text) -> apply_relocation (section: .init.text) Remove the annotation here. [ mingo: Also sync up add_nop() with these changes. ] Fixes: 17bce3b2ae2d ("x86/callthunks: Handle %rip-relative relocations in call thunk template") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20231204072856.1033621-1-arnd@kernel.org
2023-12-11	x86/percpu: Avoid sparse warning with cast to named address space	Uros Bizjak	1	-0/+5
	Teach sparse about __seg_fs and __seg_gs named address space qualifiers to to avoid warnings about unexpected keyword at the end of cast operator. Reported-by: kernel test robot <lkp@intel.com> Acked-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20231204210320.114429-3-ubizjak@gmail.com Closes: https://lore.kernel.org/oe-kbuild-all/202310080853.UhMe5iWa-lkp@intel.com/
2023-12-11	x86/traps: Use current_top_of_stack() helper in traps.c	Uros Bizjak	1	-2/+2
	Use current_top_of_stack() helper in sync_regs() and vc_switch_off_ist() instead of open-coding the reading of the top_of_stack percpu variable explicitly. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20231204210320.114429-2-ubizjak@gmail.com
2023-12-11	x86/percpu: Fix "const_pcpu_hot" version generation failure	Uros Bizjak	3	-3/+11
	Version generation for "const_pcpu_hot" symbol failed because genksyms doesn't know the __seg_gs keyword. Effectively revert commit 4604c052b84d ("x86/percpu: Declare const_pcpu_hot as extern const variable") and use this_cpu_read_const() instead to avoid "sparse: dereference of noderef expression" warning when reading const_pcpu_hot. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20231204210320.114429-1-ubizjak@gmail.com
2023-12-02	x86/callthunks: Correct calculation of dest address in is_callthunk()	Uros Bizjak	1	-1/+1
	GCC didn't warn on the invalid use of relocation destination pointer, so the calculated destination value was applied to the uninitialized pointer location in error. Fixes: 17bce3b2ae2d ("x86/callthunks: Handle %rip-relative relocations in call thunk template") Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Closes: https://lore.kernel.org/lkml/20231201035457.GA321497@dev-arch.thelio-3990X/ Link: https://lore.kernel.org/r/20231201085727.3647051-1-ubizjak@gmail.com
2023-11-30	x86/smp: Use atomic_try_cmpxchg in native_stop_other_cpus()	Uros Bizjak	1	-2/+3
	Use atomic_try_cmpxchg() instead of atomic_cmpxchg(*ptr, old, new) == old. X86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after the CMPXCHG. Tested by building a native Fedora-38 kernel and rebooting a 12-way SMP system using "shutdown -r" command some 100 times. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20231123203605.3474745-2-ubizjak@gmail.com
2023-11-30	x86/smp: Move the call to smp_processor_id() after the early exit in ↵	Uros Bizjak	1	-3/+6
	native_stop_other_cpus() Improve code generation in native_stop_other_cpus() a tiny bit: smp_processor_id() accesses a per-CPU variable, so the compiler is not able to move the call after the early exit on its own. Also rename the "cpu" variable to a more descriptive "this_cpu", and use 'cpu' as a separate iterator variable later in the function. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20231123203605.3474745-1-ubizjak@gmail.com
2023-11-30	x86/percpu: Declare const_pcpu_hot as extern const variable	Uros Bizjak	1	-2/+1
	const_pcpu_hot is aliased by linker to pcpu_hot, so there is no need to use the DECLARE_PER_CPU_ALIGNED() macro. Also, declare const_pcpu_hot as extern to avoid allocating storage space for the aliased structure. Fixes: ed2f752e0e0a ("x86/percpu: Introduce const-qualified const_pcpu_hot to micro-optimize code generation") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20231130162949.83518-1-ubizjak@gmail.com Closes: https://lore.kernel.org/oe-kbuild-all/202311302257.tSFtZnly-lkp@intel.com/
2023-11-30	x86/callthunks: Mark apply_relocation() as __init_or_module	Ingo Molnar	2	-2/+2
	Do it like the rest of the methods using it. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20231105213731.1878100-3-ubizjak@gmail.com
2023-11-30	x86/acpi: Use %rip-relative addressing in wakeup_64.S	Uros Bizjak	1	-12/+12
	This is a "nice-to-have" change with minor code generation benefits: - Instruction with %rip-relative address operand is one byte shorter than its absolute address counterpart, - it is also compatible with position independent executable (-fpie) builds, - it is also consistent with what the compiler emits by default when a symbol is accessed. No functional changes intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20231103104900.409470-1-ubizjak@gmail.com
2023-11-30	x86/callthunks: Fix and unify call thunks assembly snippets	Uros Bizjak	1	-16/+7
	Currently thunk debug macros explicitly define %gs: segment register prefix for their percpu variables. This is not compatible with !CONFIG_SMP, which requires non-prefixed percpu variables. Fix call thunks debug macros to use PER_CPU_VAR macro from percpu.h to conditionally use %gs: segment register prefix, depending on CONFIG_SMP. Finally, unify ASM_ prefixed assembly macros with their non-prefixed variants. With support of %rip-relative relocations in place, call thunk templates allow %rip-relative addressing, so unified assembly snippet can be used everywhere. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20231105213731.1878100-4-ubizjak@gmail.com
2023-11-30	x86/callthunks: Handle %rip-relative relocations in call thunk template	Uros Bizjak	3	-9/+28
	Contrary to alternatives, relocations are currently not supported in call thunk templates. Re-use the existing infrastructure from alternative.c to allow %rip-relative relocations when copying call thunk template from its storage location. The patch allows unification of ASM_INCREMENT_CALL_DEPTH, which already uses PER_CPU_VAR macro, with INCREMENT_CALL_DEPTH, used in call thunk template, which is currently limited to use absolute address. Reuse existing relocation infrastructure from alternative.c., as suggested by Peter Zijlstra. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20231105213731.1878100-3-ubizjak@gmail.com
2023-11-30	x86/percpu: Define PER_CPU_VAR macro also for !__ASSEMBLY__	Uros Bizjak	1	-0/+5
	Some C source files define 'asm' statements that use PER_CPU_VAR, so make PER_CPU_VAR macro available also without __ASSEMBLY__. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20231105213731.1878100-2-ubizjak@gmail.com
2023-10-24	x86/percpu: Return correct variable from current_top_of_stack()	Uros Bizjak	1	-1/+1
	current_top_of_stack() should return variable from _seg_gs qualified named address space when CONFIG_USE_X86_SEG_SUPPORT=y is enbled. Fixes: ed2f752e0e0a ("x86/percpu: Introduce const-qualified const_pcpu_hot to micro-optimize code generation") Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20231024142830.3226-1-ubizjak@gmail.com Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Sean Christopherson <seanjc@google.com>
2023-10-23	x86/percpu: Introduce const-qualified const_pcpu_hot to micro-optimize code ↵	Uros Bizjak	6	-4/+16
	generation Some variables in pcpu_hot, currently current_task and top_of_stack are actually per-thread variables implemented as per-CPU variables and thus stable for the duration of the respective task. There is already an attempt to eliminate redundant reads from these variables using this_cpu_read_stable() asm macro, which hides the dependency on the read memory address. However, the compiler has limited ability to eliminate asm common subexpressions, so this approach results in a limited success. The solution is to allow more aggressive elimination by aliasing pcpu_hot into a const-qualified const_pcpu_hot, and to read stable per-CPU variables from this constant copy. The current per-CPU infrastructure does not support reads from const-qualified variables. However, when the compiler supports segment qualifiers, it is possible to declare the const-aliased variable in the relevant named address space. The compiler considers access to the variable, declared in this way, as a read from a constant location, and will optimize reads from the variable accordingly. By implementing constant-qualified const_pcpu_hot, the compiler can eliminate redundant reads from the constant variables, reducing the number of loads from current_task from 3766 to 3217 on a test build, a -14.6% reduction. The reduction of loads translates to the following code savings: text data bss dec hex filename 25,477,353 4389456 808452 30675261 1d4113d vmlinux-old.o 25,476,074 4389440 808452 30673966 1d40c2e vmlinux-new.o representing a code size reduction of -1279 bytes. [ mingo: Updated the changelog, EXPORT(const_pcpu_hot). ] Co-developed-by: Nadav Amit <namit@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20231020162004.135244-1-ubizjak@gmail.com
2023-10-20	x86/percpu: Introduce %rip-relative addressing to PER_CPU_VAR()	Uros Bizjak	3	-19/+35
	Introduce x86_64 %rip-relative addressing to the PER_CPU_VAR() macro. Instructions using %rip-relative address operand are one byte shorter than their absolute address counterparts and are also compatible with position independent executable (-fpie) builds. The patch reduces code size of a test kernel build by 150 bytes. The PER_CPU_VAR() macro is intended to be applied to a symbol and should not be used with register operands. Introduce the new __percpu macro and use it in cmpxchg{8,16}b_emu.S instead. Also add a missing function comment to this_cpu_cmpxchg8b_emu(). No functional changes intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-kernel@vger.kernel.org Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Sean Christopherson <seanjc@google.com>
2023-10-20	x86/percpu, xen: Correct PER_CPU_VAR() usage to include symbol and its addend	Uros Bizjak	1	-5/+5
	The PER_CPU_VAR() macro should be applied to a symbol and its addend. Inconsistent usage is currently harmless, but needs to be corrected before %rip-relative addressing is introduced to the PER_CPU_VAR() macro. No functional changes intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: linux-kernel@vger.kernel.org Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Sean Christopherson <seanjc@google.com>
2023-10-20	x86/percpu: Correct PER_CPU_VAR() usage to include symbol and its addend	Uros Bizjak	4	-4/+4
	The PER_CPU_VAR() macro should be applied to a symbol and its addend. Inconsistent usage is currently harmless, but needs to be corrected before %rip-relative addressing is introduced to the PER_CPU_VAR() macro. No functional changes intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-kernel@vger.kernel.org Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Sean Christopherson <seanjc@google.com>
2023-10-20	x86/fpu: Clean up FPU switching in the middle of task switching	Linus Torvalds	3	-12/+12
	It happens to work, but it's very very wrong, because our 'current' macro is magic that is supposedly loading a stable value. It just happens to be not quite stable enough and the compilers re-load the value enough for this code to work. But it's wrong. The whole struct fpu *prev_fpu = &prev->fpu; thing in __switch_to() is pretty ugly. There's no reason why we should look at that 'prev_fpu' pointer there, or pass it down. And it only generates worse code, in how it loads 'current' when __switch_to() has the right task pointers. The attached patch not only cleans this up, it actually generates better code too: (a) it removes one push/pop pair at entry/exit because there's one less register used (no 'current') (b) it removes that pointless load of 'current' because it just uses the right argument: - movq %gs:pcpu_hot(%rip), %r12 - testq $16384, (%r12) + testq $16384, (%rdi) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20231018184227.446318-1-ubizjak@gmail.com
2023-10-18	x86/percpu: Use the correct asm operand modifier in percpu_stable_op()	Uros Bizjak	1	-2/+2
	The "P" asm operand modifier is a x86 target-specific modifier. When used for a constant, it drops all syntax-specific prefixes and issues the bare constant. This modifier is not correct for address handling, in this case a generic "a" operand modifier should be used. The "a" asm operand modifier substitutes a memory reference, with the actual operand treated as address. For x86_64, when a symbol is provided, the "a" modifier emits "sym(%rip)" instead of "sym", enabling shorter %rip-relative addressing. Clang allows only "i" and "r" operand constraints with an "a" modifier, so the patch normalizes the modifier/constraint pair to "a"/"i" which is consistent between both compilers. The patch reduces code size of a test build by 4072 bytes: text data bss dec hex filename 25523268 4388300 808452 30720020 1d4c014 vmlinux-old.o 25519196 4388300 808452 30715948 1d4b02c vmlinux-new.o [ mingo: Changelog clarity. ] Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20231016200755.287403-1-ubizjak@gmail.com
2023-10-16	x86/percpu: Use C for arch_raw_cpu_ptr(), to improve code generation	Uros Bizjak	1	-0/+17
	Implement arch_raw_cpu_ptr() in C to allow the compiler to perform better optimizations, such as setting an appropriate base to compute the address. The compiler is free to choose either MOV or ADD from this_cpu_off address to construct the optimal final address. There are some other issues when memory access to the percpu area is implemented with an asm. Compilers can not eliminate asm common subexpressions over basic block boundaries, but are extremely good at optimizing memory access. By implementing arch_raw_cpu_ptr() in C, the compiler can eliminate additional redundant loads from this_cpu_off, further reducing the number of percpu offset reads from 1646 to 1631 on a test build, a -0.9% reduction. Co-developed-by: Nadav Amit <namit@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20231015202523.189168-2-ubizjak@gmail.com
2023-10-16	x86/percpu: Rewrite arch_raw_cpu_ptr() to be easier for compilers to optimize	Uros Bizjak	1	-2/+4
	Implement arch_raw_cpu_ptr() as a load from this_cpu_off and then add the ptr value to the base. This way, the compiler can propagate addend to the following instruction and simplify address calculation. E.g.: address calcuation in amd_pmu_enable_virt() improves from: 48 c7 c0 00 00 00 00 mov $0x0,%rax 87b7: R_X86_64_32S cpu_hw_events 65 48 03 05 00 00 00 add %gs:0x0(%rip),%rax 00 87bf: R_X86_64_PC32 this_cpu_off-0x4 48 c7 80 28 13 00 00 movq $0x0,0x1328(%rax) 00 00 00 00 to: 65 48 8b 05 00 00 00 mov %gs:0x0(%rip),%rax 00 8798: R_X86_64_PC32 this_cpu_off-0x4 48 c7 80 00 00 00 00 movq $0x0,0x0(%rax) 00 00 00 00 87a6: R_X86_64_32S cpu_hw_events+0x1328 The compiler also eliminates additional redundant loads from this_cpu_off, reducing the number of percpu offset reads from 1668 to 1646 on a test build, a -1.3% reduction. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20231015202523.189168-1-ubizjak@gmail.com
2023-10-11	x86/percpu: Disable named address spaces for KASAN	Uros Bizjak	1	-1/+6
	-fsanitize=kernel-address (KASAN) is at the moment incompatible with named address spaces - see GCC PR sanitizer/111736: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111736 GCC is doing a KASAN check on a percpu address which it shouldn't do, and didn't used to do because we did the access using inline asm. But now that GCC does the accesses as normal (albeit special address space) memory accesses, the KASAN code triggers on them too, and it all goes to hell in a handbasket very quickly. Those percpu accessor functions need to disable any KASAN checking or other sanitizer checking. Not on the percpu address, because that's not a "real" address, it's obviously just the offset from the segment register. And GCC should probably not have generated such code in the first place, so arguably this is a bug with -fsanitize=kernel-address. The patch also removes a stale dependency on CONFIG_SMP. Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20231009151409.53656-1-ubizjak@gmail.com Closes: https://lore.kernel.org/oe-lkp/202310071301.a5113890-oliver.sang@intel.com
2023-10-05	x86/percpu: Use C for percpu read/write accessors	Uros Bizjak	1	-11/+54
	The percpu code mostly uses inline assembly. Using segment qualifiers allows to use C code instead, which enables the compiler to perform various optimizations (e.g. propagation of memory arguments). Convert percpu read and write accessors to C code, so the memory argument can be propagated to the instruction that uses this argument. Some examples of propagations: a) into sign/zero extensions: the code improves from: 65 8a 05 00 00 00 00 mov %gs:0x0(%rip),%al 0f b6 c0 movzbl %al,%eax to: 65 0f b6 05 00 00 00 movzbl %gs:0x0(%rip),%eax 00 and in a similar way for: movzbl %gs:0x0(%rip),%edx movzwl %gs:0x0(%rip),%esi movzbl %gs:0x78(%rbx),%eax movslq %gs:0x0(%rip),%rdx movslq %gs:(%rdi),%rbx b) into compares: the code improves from: 65 8b 05 00 00 00 00 mov %gs:0x0(%rip),%eax a9 00 00 0f 00 test $0xf0000,%eax to: 65 f7 05 00 00 00 00 testl $0xf0000,%gs:0x0(%rip) 00 00 0f 00 and in a similar way for: testl $0xf0000,%gs:0x0(%rip) testb $0x1,%gs:0x0(%rip) testl $0xff00,%gs:0x0(%rip) cmpb $0x0,%gs:0x0(%rip) cmp %gs:0x0(%rip),%r14d cmpw $0x8,%gs:0x0(%rip) cmpb $0x0,%gs:(%rax) c) into other insns: the code improves from: 1a355: 83 fa ff cmp $0xffffffff,%edx 1a358: 75 07 jne 1a361 <...> 1a35a: 65 8b 15 00 00 00 00 mov %gs:0x0(%rip),%edx 1a361: to: 1a35a: 83 fa ff cmp $0xffffffff,%edx 1a35d: 65 0f 44 15 00 00 00 cmove %gs:0x0(%rip),%edx 1a364: 00 The above propagations result in the following code size improvements for current mainline kernel (with the default config), compiled with: # gcc (GCC) 12.3.1 20230508 (Red Hat 12.3.1-1) text data bss dec filename 25508862 4386540 808388 30703790 vmlinux-vanilla.o 25500922 4386532 808388 30695842 vmlinux-new.o Co-developed-by: Nadav Amit <namit@vmware.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20231004192404.31733-1-ubizjak@gmail.com
2023-10-05	x86/percpu: Use compiler segment prefix qualifier	Nadav Amit	2	-23/+47
	Using a segment prefix qualifier is cleaner than using a segment prefix in the inline assembly, and provides the compiler with more information, telling it that __seg_gs:[addr] is different than [addr] when it analyzes data dependencies. It also enables various optimizations that will be implemented in the next patches. Use segment prefix qualifiers when they are supported. Unfortunately, gcc does not provide a way to remove segment qualifiers, which is needed to use typeof() to create local instances of the per-CPU variable. For this reason, do not use the segment qualifier for per-CPU variables, and do casting using the segment qualifier instead. Uros: Improve compiler support detection and update the patch to the current mainline. Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20231004145137.86537-4-ubizjak@gmail.com
2023-10-05	x86/percpu: Enable named address spaces with known compiler version	Uros Bizjak	1	-0/+7
	Enable named address spaces with known compiler versions (GCC 12.1 and later) in order to avoid possible issues with named address spaces with older compilers. Set CC_HAS_NAMED_AS when the compiler satisfies version requirements and set USE_X86_SEG_SUPPORT to signal when segment qualifiers could be used. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20231004145137.86537-3-ubizjak@gmail.com
2023-10-03	x86/lib: Address kernel-doc warnings	Zhu Wang	1	-5/+0
	Fix all kernel-doc warnings in csum-wrappers_64.c: arch/x86/lib/csum-wrappers_64.c:25: warning: Excess function parameter 'isum' description in 'csum_and_copy_from_user' arch/x86/lib/csum-wrappers_64.c:25: warning: Excess function parameter 'errp' description in 'csum_and_copy_from_user' arch/x86/lib/csum-wrappers_64.c:49: warning: Excess function parameter 'isum' description in 'csum_and_copy_to_user' arch/x86/lib/csum-wrappers_64.c:49: warning: Excess function parameter 'errp' description in 'csum_and_copy_to_user' arch/x86/lib/csum-wrappers_64.c:71: warning: Excess function parameter 'sum' description in 'csum_partial_copy_nocheck' Signed-off-by: Zhu Wang <wangzhu9@huawei.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-kernel@vger.kernel.org
2023-09-27	x86/entry: Fix typos in comments	Xin Li (Intel)	1	-4/+4
	Fix 2 typos in the comments. Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: H. Peter Anvin (Intel) <hpa@zytor.com> Link: https://lore.kernel.org/r/20230926061319.1929127-1-xin@zytor.com
2023-09-27	x86/entry: Remove unused argument %rsi passed to exc_nmi()	Xin Li (Intel)	1	-2/+0
	exc_nmi() only takes one argument of type struct pt_regs *, but asm_exc_nmi() calls it with 2 arguments. The second one passed in %rsi seems to be a leftover, so simply remove it. Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: H. Peter Anvin (Intel) <hpa@zytor.com> Link: https://lore.kernel.org/r/20230926061319.1929127-1-xin@zytor.com
2023-09-22	x86/bitops: Remove unused __sw_hweight64() assembly implementation on x86-32	Ingo Molnar	1	-14/+6
	Header cleanups in the fast-headers tree highlighted that we have an unused assembly implementation for __sw_hweight64(): WARNING: modpost: EXPORT symbol "__sw_hweight64" [vmlinux] version ... __arch_hweight64() on x86-32 is defined in the arch/x86/include/asm/arch_hweight.h header as an inline, using __arch_hweight32(): #ifdef CONFIG_X86_32 static inline unsigned long __arch_hweight64(__u64 w) { return __arch_hweight32((u32)w) + __arch_hweight32((u32)(w >> 32)); } But there's also a __sw_hweight64() assembly implementation: arch/x86/lib/hweight.S SYM_FUNC_START(__sw_hweight64) #ifdef CONFIG_X86_64 ... #else /* CONFIG_X86_32 / / We're getting an u64 arg in (%eax,%edx): unsigned long hweight64(__u64 w) */ pushl %ecx call __sw_hweight32 movl %eax, %ecx # stash away result movl %edx, %eax # second part of input call __sw_hweight32 addl %ecx, %eax # result popl %ecx ret #endif But this __sw_hweight64 assembly implementation is unused - and it's essentially doing the same thing that the inline wrapper does. Remove the assembly version and add a comment about it. Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-kernel@vger.kernel.org
2023-09-21	x86/percpu: Do not clobber %rsi in percpu_{try_,}cmpxchg{64,128}_op	Uros Bizjak	1	-12/+16
	The fallback alternative uses %rsi register to manually load pointer to the percpu variable before the call to the emulation function. This is unoptimal, because the load is hidden from the compiler. Move the load of %rsi outside inline asm, so the compiler can reuse the value. The code in slub.o improves from: 55ac: 49 8b 3c 24 mov (%r12),%rdi 55b0: 48 8d 4a 40 lea 0x40(%rdx),%rcx 55b4: 49 8b 1c 07 mov (%r15,%rax,1),%rbx 55b8: 4c 89 f8 mov %r15,%rax 55bb: 48 8d 37 lea (%rdi),%rsi 55be: e8 00 00 00 00 callq 55c3 <...> 55bf: R_X86_64_PLT32 this_cpu_cmpxchg16b_emu-0x4 55c3: 75 a3 jne 5568 <...> 55c5: ... 0000000000000000 <.altinstr_replacement>: 5: 65 48 0f c7 0f cmpxchg16b %gs:(%rdi) to: 55ac: 49 8b 34 24 mov (%r12),%rsi 55b0: 48 8d 4a 40 lea 0x40(%rdx),%rcx 55b4: 49 8b 1c 07 mov (%r15,%rax,1),%rbx 55b8: 4c 89 f8 mov %r15,%rax 55bb: e8 00 00 00 00 callq 55c0 <...> 55bc: R_X86_64_PLT32 this_cpu_cmpxchg16b_emu-0x4 55c0: 75 a6 jne 5568 <...> 55c2: ... Where the alternative replacement instruction now uses %rsi: 0000000000000000 <.altinstr_replacement>: 5: 65 48 0f c7 0e cmpxchg16b %gs:(%rsi) The instruction (effectively a reg-reg move) at 55bb: in the original assembly is removed. Also, both the CALL and replacement CMPXCHG16B are 5 bytes long, removing the need for NOPs in the asm code. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230918151452.62344-1-ubizjak@gmail.com
2023-09-15	x86/percpu: Use raw_cpu_try_cmpxchg() in preempt_count_set()	Uros Bizjak	1	-2/+2
	Use raw_cpu_try_cmpxchg() instead of raw_cpu_cmpxchg(ptr, old, new) == old. x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after CMPXCHG (and related MOV instruction in front of CMPXCHG). Also, raw_cpu_try_cmpxchg() implicitly assigns old ptr value to "old" when cmpxchg fails. There is no need to re-read the value in the loop. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230830151623.3900-2-ubizjak@gmail.com
2023-09-15	x86/percpu: Define raw_cpu_try_cmpxchg and this_cpu_try_cmpxchg()	Uros Bizjak	1	-0/+27
	Define target-specific raw_cpu_try_cmpxchg_N() and this_cpu_try_cmpxchg_N() macros. These definitions override the generic fallback definitions and enable target-specific optimized implementations. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230830151623.3900-1-ubizjak@gmail.com
2023-09-15	x86/percpu: Define {raw,this}_cpu_try_cmpxchg{64,128}	Uros Bizjak	1	-0/+67
	Define target-specific {raw,this}_cpu_try_cmpxchg64() and {raw,this}_cpu_try_cmpxchg128() macros. These definitions override the generic fallback definitions and enable target-specific optimized implementations. Several places in mm/slub.o improve from e.g.: 53bc: 48 8d 4f 40 lea 0x40(%rdi),%rcx 53c0: 48 89 fa mov %rdi,%rdx 53c3: 49 8b 5c 05 00 mov 0x0(%r13,%rax,1),%rbx 53c8: 4c 89 e8 mov %r13,%rax 53cb: 49 8d 30 lea (%r8),%rsi 53ce: e8 00 00 00 00 call 53d3 <...> 53cf: R_X86_64_PLT32 this_cpu_cmpxchg16b_emu-0x4 53d3: 48 31 d7 xor %rdx,%rdi 53d6: 4c 31 e8 xor %r13,%rax 53d9: 48 09 c7 or %rax,%rdi 53dc: 75 ae jne 538c <...> to: 53bc: 48 8d 4a 40 lea 0x40(%rdx),%rcx 53c0: 49 8b 1c 07 mov (%r15,%rax,1),%rbx 53c4: 4c 89 f8 mov %r15,%rax 53c7: 48 8d 37 lea (%rdi),%rsi 53ca: e8 00 00 00 00 call 53cf <...> 53cb: R_X86_64_PLT32 this_cpu_cmpxchg16b_emu-0x4 53cf: 75 bb jne 538c <...> reducing the size of mm/slub.o by 80 bytes: text data bss dec hex filename 39758 5337 4208 49303 c097 slub-new.o 39838 5337 4208 49383 c0e7 slub-old.o Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230906185941.53527-1-ubizjak@gmail.com
2023-09-07	x86/asm/bitops: Use __builtin_clz{l\|ll} to evaluate constant expressions	Nick Desaulniers	1	-0/+9
	Micro-optimize the bitops code some more, similar to commits: fdb6649ab7c1 ("x86/asm/bitops: Use __builtin_ctzl() to evaluate constant expressions") 2fcff790dcb4 ("powerpc: Use builtin functions for fls()/__fls()/fls64()") From a recent discussion, I noticed that x86 is lacking an optimization that appears in arch/powerpc/include/asm/bitops.h related to constant folding. If you add a BUILD_BUG_ON(__builtin_constant_p(param)) to these functions, you'll find that there were cases where the use of inline asm pessimized the compiler's ability to perform constant folding resulting in runtime calculation of a value that could have been computed at compile time. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230828-x86_fls-v1-1-e6a31b9f79c3@google.com
2023-09-04	Merge tag 'timers-core-2023-09-04-v2' of ↵	Linus Torvalds	7	-479/+131
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull clocksource/clockevent driver updates from Thomas Gleixner: - Remove the OXNAS driver instead of adding a new one! - A set of boring fixes, cleanups and improvements * tag 'timers-core-2023-09-04-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource: Explicitly include correct DT includes clocksource/drivers/sun5i: Convert to platform device driver clocksource/drivers/sun5i: Remove pointless struct clocksource/drivers/sun5i: Remove duplication of code and data clocksource/drivers/loongson1: Set variable ls1x_timer_lock storage-class-specifier to static clocksource/drivers/arm_arch_timer: Disable timer before programming CVAL dt-bindings: timer: oxsemi,rps-timer: remove obsolete bindings clocksource/drivers/timer-oxnas-rps: Remove obsolete timer driver
2023-09-04	Merge tag 'm68knommu-for-v6.6' of ↵	Linus Torvalds	2	-7/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68knommu updates from Greg Ungerer: "Two changes, one a trivial white space clean up, the other removes the unnecessary local pcibios_setup() code" * tag 'm68knommu-for-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68k: coldfire: dma_timer: ERROR: "foo __init bar" should be "foo __init bar" m68k/pci: Drop useless pcibios_setup()
2023-09-04	Merge tag 'uml-for-linus-6.6-rc1' of ↵	Linus Torvalds	18	-245/+23
	git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Richard Weinberger: - Drop 32-bit checksum implementation and re-use it from arch/x86 - String function cleanup - Fixes for -Wmissing-variable-declarations and -Wmissing-prototypes builds * tag 'uml-for-linus-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: um: virt-pci: fix missing declaration warning um: Refactor deprecated strncpy to memcpy um: fix 3 instances of -Wmissing-prototypes um: port_kern: fix -Wmissing-variable-declarations uml: audio: fix -Wmissing-variable-declarations um: vector: refactor deprecated strncpy um: use obj-y to descend into arch/um/*/ um: Hard-code the result of 'uname -s' um: Use the x86 checksum implementation on 32-bit asm-generic: current: Don't include thread-info.h if building asm um: Remove unsued extern declaration ldt_host_info() um: Fix hostaudio build errors um: Remove strlcpy usage
2023-09-04	Merge tag 'hyperv-next-signed-20230902' of ↵	Linus Torvalds	16	-113/+759
	git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Support for SEV-SNP guests on Hyper-V (Tianyu Lan) - Support for TDX guests on Hyper-V (Dexuan Cui) - Use SBRM API in Hyper-V balloon driver (Mitchell Levy) - Avoid dereferencing ACPI root object handle in VMBus driver (Maciej Szmigiero) - A few misecllaneous fixes (Jiapeng Chong, Nathan Chancellor, Saurabh Sengar) * tag 'hyperv-next-signed-20230902' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (24 commits) x86/hyperv: Remove duplicate include x86/hyperv: Move the code in ivm.c around to avoid unnecessary ifdef's x86/hyperv: Remove hv_isolation_type_en_snp x86/hyperv: Use TDX GHCI to access some MSRs in a TDX VM with the paravisor Drivers: hv: vmbus: Bring the post_msg_page back for TDX VMs with the paravisor x86/hyperv: Introduce a global variable hyperv_paravisor_present Drivers: hv: vmbus: Support >64 VPs for a fully enlightened TDX/SNP VM x86/hyperv: Fix serial console interrupts for fully enlightened TDX guests Drivers: hv: vmbus: Support fully enlightened TDX guests x86/hyperv: Support hypercalls for fully enlightened TDX guests x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests x86/hyperv: Fix undefined reference to isolation_type_en_snp without CONFIG_HYPERV x86/hyperv: Add missing 'inline' to hv_snp_boot_ap() stub hv: hyperv.h: Replace one-element array with flexible-array member Drivers: hv: vmbus: Don't dereference ACPI root object handle x86/hyperv: Add hyperv-specific handling for VMMCALL under SEV-ES x86/hyperv: Add smp support for SEV-SNP guest clocksource: hyper-v: Mark hyperv tsc page unencrypted in sev-snp enlightened guest x86/hyperv: Use vmmcall to implement Hyper-V hypercall in sev-snp enlightened guest drivers: hv: Mark percpu hvcall input arg page unencrypted in SEV-SNP enlightened guest ...
2023-09-04	Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost	Linus Torvalds	9	-89/+624
	Pull virtio updates from Michael Tsirkin: "A small pull request this time around, mostly because the vduse network got postponed to next relase so we can be sure we got the security store right" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_ring: fix avail_wrap_counter in virtqueue_add_packed virtio_vdpa: build affinity masks conditionally virtio_net: merge dma operations when filling mergeable buffers virtio_ring: introduce dma sync api for virtqueue virtio_ring: introduce dma map api for virtqueue virtio_ring: introduce virtqueue_reset() virtio_ring: separate the logic of reset/enable from virtqueue_resize virtio_ring: correct the expression of the description of virtqueue_resize() virtio_ring: skip unmap for premapped virtio_ring: introduce virtqueue_dma_dev() virtio_ring: support add premapped buf virtio_ring: introduce virtqueue_set_dma_premapped() virtio_ring: put mapping error check in vring_map_one_sg virtio_ring: check use_dma_api before unmap desc for indirect vdpa_sim: offer VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK vdpa: add get_backend_features vdpa operation vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature vdpa: add VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag vdpa/mlx5: Remove unused function declarations
2023-09-04	Merge tag 'tomoyo-pr-20230903' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1	Linus Torvalds	3	-7/+5
	Pull tomoyo updates from Tetsuo Handa: "Three cleanup patches, no behavior changes" * tag 'tomoyo-pr-20230903' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1: tomoyo: remove unused function declaration tomoyo: refactor deprecated strncpy tomoyo: add format attributes to functions
2023-09-04	virtio_ring: fix avail_wrap_counter in virtqueue_add_packed	Yuan Yao	1	-1/+1
	In current packed virtqueue implementation, the avail_wrap_counter won't flip, in the case when the driver supplies a descriptor chain with a length equals to the queue size; total_sg == vq->packed.vring.num. Let’s assume the following situation: vq->packed.vring.num=4 vq->packed.next_avail_idx: 1 vq->packed.avail_wrap_counter: 0 Then the driver adds a descriptor chain containing 4 descriptors. We expect the following result with avail_wrap_counter flipped: vq->packed.next_avail_idx: 1 vq->packed.avail_wrap_counter: 1 But, the current implementation gives the following result: vq->packed.next_avail_idx: 1 vq->packed.avail_wrap_counter: 0 To reproduce the bug, you can set a packed queue size as small as possible, so that the driver is more likely to provide a descriptor chain with a length equal to the packed queue size. For example, in qemu run following commands: sudo qemu-system-x86_64 \ -enable-kvm \ -nographic \ -kernel "path/to/kernel_image" \ -m 1G \ -drive file="path/to/rootfs",if=none,id=disk \ -device virtio-blk,drive=disk \ -drive file="path/to/disk_image",if=none,id=rwdisk \ -device virtio-blk,drive=rwdisk,packed=on,queue-size=4,\ indirect_desc=off \ -append "console=ttyS0 root=/dev/vda rw init=/bin/bash" Inside the VM, create a directory and mount the rwdisk device on it. The rwdisk will hang and mount operation will not complete. This commit fixes the wrap counter error by flipping the packed.avail_wrap_counter, when start of descriptor chain equals to the end of descriptor chain (head == i). Fixes: 1ce9e6055fa0 ("virtio_ring: introduce packed ring support") Signed-off-by: Yuan Yao <yuanyaogoog@chromium.org> Message-Id: <20230808051110.3492693-1-yuanyaogoog@chromium.org> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-04	virtio_vdpa: build affinity masks conditionally	Jason Wang	1	-6/+11
	We try to build affinity mask via create_affinity_masks() unconditionally which may lead several issues: - the affinity mask is not used for parent without affinity support (only VDUSE support the affinity now) - the logic of create_affinity_masks() might not work for devices other than block. For example it's not rare in the networking device where the number of queues could exceed the number of CPUs. Such case breaks the current affinity logic which is based on group_cpus_evenly() who assumes the number of CPUs are not less than the number of groups. This can trigger a warning[1]: if (ret >= 0) WARN_ON(nr_present + nr_others < numgrps); Fixing this by only build the affinity masks only when - Driver passes affinity descriptor, driver like virtio-blk can make sure to limit the number of queues when it exceeds the number of CPUs - Parent support affinity setting config ops This help to avoid the warning. More optimizations could be done on top. [1] [ 682.146655] WARNING: CPU: 6 PID: 1550 at lib/group_cpus.c:400 group_cpus_evenly+0x1aa/0x1c0 [ 682.146668] CPU: 6 PID: 1550 Comm: vdpa Not tainted 6.5.0-rc5jason+ #79 [ 682.146671] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014 [ 682.146673] RIP: 0010:group_cpus_evenly+0x1aa/0x1c0 [ 682.146676] Code: 4c 89 e0 5b 5d 41 5c 41 5d 41 5e c3 cc cc cc cc e8 1b c4 74 ff 48 89 ef e8 13 ac 98 ff 4c 89 e7 45 31 e4 e8 08 ac 98 ff eb c2 <0f> 0b eb b6 e8 fd 05 c3 00 45 31 e4 eb e5 cc cc cc cc cc cc cc cc [ 682.146679] RSP: 0018:ffffc9000215f498 EFLAGS: 00010293 [ 682.146682] RAX: 000000000001f1e0 RBX: 0000000000000041 RCX: 0000000000000000 [ 682.146684] RDX: ffff888109922058 RSI: 0000000000000041 RDI: 0000000000000030 [ 682.146686] RBP: ffff888109922058 R08: ffffc9000215f498 R09: ffffc9000215f4a0 [ 682.146687] R10: 00000000000198d0 R11: 0000000000000030 R12: ffff888107e02800 [ 682.146689] R13: 0000000000000030 R14: 0000000000000030 R15: 0000000000000041 [ 682.146692] FS: 00007fef52315740(0000) GS:ffff888237380000(0000) knlGS:0000000000000000 [ 682.146695] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 682.146696] CR2: 00007fef52509000 CR3: 0000000110dbc004 CR4: 0000000000370ee0 [ 682.146698] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 682.146700] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 682.146701] Call Trace: [ 682.146703] <TASK> [ 682.146705] ? __warn+0x7b/0x130 [ 682.146709] ? group_cpus_evenly+0x1aa/0x1c0 [ 682.146712] ? report_bug+0x1c8/0x1e0 [ 682.146717] ? handle_bug+0x3c/0x70 [ 682.146721] ? exc_invalid_op+0x14/0x70 [ 682.146723] ? asm_exc_invalid_op+0x16/0x20 [ 682.146727] ? group_cpus_evenly+0x1aa/0x1c0 [ 682.146729] ? group_cpus_evenly+0x15c/0x1c0 [ 682.146731] create_affinity_masks+0xaf/0x1a0 [ 682.146735] virtio_vdpa_find_vqs+0x83/0x1d0 [ 682.146738] ? __pfx_default_calc_sets+0x10/0x10 [ 682.146742] virtnet_find_vqs+0x1f0/0x370 [ 682.146747] virtnet_probe+0x501/0xcd0 [ 682.146749] ? vp_modern_get_status+0x12/0x20 [ 682.146751] ? get_cap_addr.isra.0+0x10/0xc0 [ 682.146754] virtio_dev_probe+0x1af/0x260 [ 682.146759] really_probe+0x1a5/0x410 Fixes: 3dad56823b53 ("virtio-vdpa: Support interrupt affinity spreading mechanism") Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230811091539.1359865-1-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-04	virtio_net: merge dma operations when filling mergeable buffers	Xuan Zhuo	1	-26/+202
	Currently, the virtio core will perform a dma operation for each buffer. Although, the same page may be operated multiple times. This patch, the driver does the dma operation and manages the dma address based the feature premapped of virtio core. This way, we can perform only one dma operation for the pages of the alloc frag. This is beneficial for the iommu device. kernel command line: intel_iommu=on iommu.passthrough=0 \| strict=0 \| strict=1 Before \| 775496pps \| 428614pps After \| 1109316pps \| 742853pps Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Message-Id: <20230810123057.43407-13-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-04	virtio_ring: introduce dma sync api for virtqueue	Xuan Zhuo	2	-0/+84
	These API has been introduced: * virtqueue_dma_need_sync * virtqueue_dma_sync_single_range_for_cpu * virtqueue_dma_sync_single_range_for_device These APIs can be used together with the premapped mechanism to sync the DMA address. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Message-Id: <20230810123057.43407-12-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-04	virtio_ring: introduce dma map api for virtqueue	Xuan Zhuo	2	-0/+77
	Added virtqueue_dma_map_api* to map DMA addresses for virtual memory in advance. The purpose is to keep memory mapped across multiple add/get buf operations. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Message-Id: <20230810123057.43407-11-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-04	virtio_ring: introduce virtqueue_reset()	Xuan Zhuo	2	-0/+35
	Introduce virtqueue_reset() to release all buffer inside vq. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230810123057.43407-10-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-04	virtio_ring: separate the logic of reset/enable from virtqueue_resize	Xuan Zhuo	1	-19/+39
	The subsequent reset function will reuse these logic. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230810123057.43407-9-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-04	virtio_ring: correct the expression of the description of virtqueue_resize()	Xuan Zhuo	1	-1/+1
	Modify the "useless" to a more accurate "unused". Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230810123057.43407-8-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-04	virtio_ring: skip unmap for premapped	Xuan Zhuo	1	-14/+28
	Now we add a case where we skip dma unmap, the vq->premapped is true. We can't just rely on use_dma_api to determine whether to skip the dma operation. For convenience, I introduced the "do_unmap". By default, it is the same as use_dma_api. If the driver is configured with premapped, then do_unmap is false. So as long as do_unmap is false, for addr of desc, we should skip dma unmap operation. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Message-Id: <20230810123057.43407-7-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>