summaryrefslogtreecommitdiff
path: root/arch/loongarch/kernel
AgeCommit message (Collapse)AuthorFilesLines
2022-12-13Merge tag 'random-6.2-rc1-for-linus' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: - Replace prandom_u32_max() and various open-coded variants of it, there is now a new family of functions that uses fast rejection sampling to choose properly uniformly random numbers within an interval: get_random_u32_below(ceil) - [0, ceil) get_random_u32_above(floor) - (floor, U32_MAX] get_random_u32_inclusive(floor, ceil) - [floor, ceil] Coccinelle was used to convert all current users of prandom_u32_max(), as well as many open-coded patterns, resulting in improvements throughout the tree. I'll have a "late" 6.1-rc1 pull for you that removes the now unused prandom_u32_max() function, just in case any other trees add a new use case of it that needs to converted. According to linux-next, there may be two trivial cases of prandom_u32_max() reintroductions that are fixable with a 's/.../.../'. So I'll have for you a final conversion patch doing that alongside the removal patch during the second week. This is a treewide change that touches many files throughout. - More consistent use of get_random_canary(). - Updates to comments, documentation, tests, headers, and simplification in configuration. - The arch_get_random*_early() abstraction was only used by arm64 and wasn't entirely useful, so this has been replaced by code that works in all relevant contexts. - The kernel will use and manage random seeds in non-volatile EFI variables, refreshing a variable with a fresh seed when the RNG is initialized. The RNG GUID namespace is then hidden from efivarfs to prevent accidental leakage. These changes are split into random.c infrastructure code used in the EFI subsystem, in this pull request, and related support inside of EFISTUB, in Ard's EFI tree. These are co-dependent for full functionality, but the order of merging doesn't matter. - Part of the infrastructure added for the EFI support is also used for an improvement to the way vsprintf initializes its siphash key, replacing an sleep loop wart. - The hardware RNG framework now always calls its correct random.c input function, add_hwgenerator_randomness(), rather than sometimes going through helpers better suited for other cases. - The add_latent_entropy() function has long been called from the fork handler, but is a no-op when the latent entropy gcc plugin isn't used, which is fine for the purposes of latent entropy. But it was missing out on the cycle counter that was also being mixed in beside the latent entropy variable. So now, if the latent entropy gcc plugin isn't enabled, add_latent_entropy() will expand to a call to add_device_randomness(NULL, 0), which adds a cycle counter, without the absent latent entropy variable. - The RNG is now reseeded from a delayed worker, rather than on demand when used. Always running from a worker allows it to make use of the CPU RNG on platforms like S390x, whose instructions are too slow to do so from interrupts. It also has the effect of adding in new inputs more frequently with more regularity, amounting to a long term transcript of random values. Plus, it helps a bit with the upcoming vDSO implementation (which isn't yet ready for 6.2). - The jitter entropy algorithm now tries to execute on many different CPUs, round-robining, in hopes of hitting even more memory latencies and other unpredictable effects. It also will mix in a cycle counter when the entropy timer fires, in addition to being mixed in from the main loop, to account more explicitly for fluctuations in that timer firing. And the state it touches is now kept within the same cache line, so that it's assured that the different execution contexts will cause latencies. * tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits) random: include <linux/once.h> in the right header random: align entropy_timer_state to cache line random: mix in cycle counter when jitter timer fires random: spread out jitter callback to different CPUs random: remove extraneous period and add a missing one in comments efi: random: refresh non-volatile random seed when RNG is initialized vsprintf: initialize siphash key using notifier random: add back async readiness notifier random: reseed in delayed work rather than on-demand random: always mix cycle counter in add_latent_entropy() hw_random: use add_hwgenerator_randomness() for early entropy random: modernize documentation comment on get_random_bytes() random: adjust comment to account for removed function random: remove early archrandom abstraction random: use random.trust_{bootloader,cpu} command line option only stackprotector: actually use get_random_canary() stackprotector: move get_random_canary() into stackprotector.h treewide: use get_random_u32_inclusive() when possible treewide: use get_random_u32_{above,below}() instead of manual loop treewide: use get_random_u32_below() instead of deprecated function ...
2022-12-08LoongArch: Export symbol for function smp_send_reschedule()Bibo Mao1-0/+11
Function smp_send_reschedule() is standard kernel API, which is defined in header file include/linux/smp.h. However, on LoongArch it is defined as an inline function, this is confusing and kernel modules can not use this function. Now we define smp_send_reschedule() as a general function, and add a EXPORT_SYMBOL_GPL on this function, so that kernel modules can use it. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: Fix unsigned comparison with less than zeroKaiLong Wang1-1/+2
Eliminate the following coccicheck warning: ./arch/loongarch/kernel/unwind_prologue.c:84:5-13: WARNING: Unsigned expression compared with zero: frame_ra < 0 Signed-off-by: KaiLong Wang <wangkailong@jari.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: Clear FPU/SIMD thread info flags for kernel threadHuacai Chen1-4/+5
If a kernel thread is created by a user thread, it may carry FPU/SIMD thread info flags (TIF_USEDFPU, TIF_USEDSIMD, etc.). Then it will be considered as a fpu owner and kernel try to save its FPU/SIMD context and cause such errors: [ 41.518931] do_fpu invoked from kernel context![#1]: [ 41.523933] CPU: 1 PID: 395 Comm: iou-wrk-394 Not tainted 6.1.0-rc5+ #217 [ 41.530757] Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.pre-beta8 08/18/2022 [ 41.544064] $ 0 : 0000000000000000 90000000011e9468 9000000106c7c000 9000000106c7fcf0 [ 41.552101] $ 4 : 9000000106305d40 9000000106689800 9000000106c7fd08 0000000003995818 [ 41.560138] $ 8 : 0000000000000001 90000000009a72e4 0000000000000020 fffffffffffffffc [ 41.568174] $12 : 0000000000000000 0000000000000000 0000000000000020 00000009aab7e130 [ 41.576211] $16 : 00000000000001ff 0000000000000407 0000000000000001 0000000000000000 [ 41.584247] $20 : 0000000000000000 0000000000000001 9000000106c7fd70 90000001002f0400 [ 41.592284] $24 : 0000000000000000 900000000178f740 90000000011e9834 90000001063057c0 [ 41.600320] $28 : 0000000000000000 0000000000000001 9000000006826b40 9000000106305140 [ 41.608356] era : 9000000000228848 _save_fp+0x0/0xd8 [ 41.613542] ra : 90000000011e9468 __schedule+0x568/0x8d0 [ 41.619160] CSR crmd: 000000b0 [ 41.619163] CSR prmd: 00000000 [ 41.622359] CSR euen: 00000000 [ 41.625558] CSR ecfg: 00071c1c [ 41.628756] CSR estat: 000f0000 [ 41.635239] ExcCode : f (SubCode 0) [ 41.638783] PrId : 0014c010 (Loongson-64bit) [ 41.643191] Modules linked in: acpi_ipmi vfat fat ipmi_si ipmi_devintf cfg80211 ipmi_msghandler rfkill fuse efivarfs [ 41.653734] Process iou-wrk-394 (pid: 395, threadinfo=0000000004ebe913, task=00000000636fa1be) [ 41.662375] Stack : 00000000ffff0875 9000000006800ec0 9000000006800ec0 90000000002d57e0 [ 41.670412] 0000000000000001 0000000000000000 9000000106535880 0000000000000001 [ 41.678450] 9000000105291800 0000000000000000 9000000105291838 900000000178e000 [ 41.686487] 9000000106c7fd90 9000000106305140 0000000000000001 90000000011e9834 [ 41.694523] 00000000ffff0875 90000000011f034c 9000000105291838 9000000105291830 [ 41.702561] 0000000000000000 9000000006801440 00000000ffff0875 90000000002d48c0 [ 41.710597] 9000000128800001 9000000106305140 9000000105291838 9000000105291838 [ 41.718634] 9000000105291830 9000000107811740 9000000105291848 90000000009bf1e0 [ 41.726672] 9000000105291830 9000000107811748 2d6b72772d756f69 0000000000343933 [ 41.734708] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 41.742745] ... [ 41.745252] Call Trace: [ 42.197868] [<9000000000228848>] _save_fp+0x0/0xd8 [ 42.205214] [<90000000011ed468>] __schedule+0x568/0x8d0 [ 42.210485] [<90000000011ed834>] schedule+0x64/0xd4 [ 42.215411] [<90000000011f434c>] schedule_timeout+0x88/0x188 [ 42.221115] [<90000000009c36d0>] io_wqe_worker+0x184/0x350 [ 42.226645] [<9000000000221cf0>] ret_from_kernel_thread+0xc/0x9c This can be easily triggered by ltp testcase syscalls/io_uring02 and it can also be easily fixed by clearing the FPU/SIMD thread info flags for kernel threads in copy_thread(). Cc: stable@vger.kernel.org Reported-by: Qi Hu <huqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: SMP: Change prefix from loongson3 to loongsonHuacai Chen2-23/+23
SMP operations can be shared by Loongson-2 series and Loongson-3 series, so we change the prefix from loongson3 to loongson for all functions and data structures. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: Combine acpi_boot_table_init() and acpi_boot_init()Huacai Chen2-22/+10
Combine acpi_boot_table_init() and acpi_boot_init() since they are very simple, and we don't need to check the return value of acpi_boot_init(). Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-18treewide: use get_random_u32_below() instead of deprecated functionJason A. Donenfeld2-2/+2
This is a simple mechanical transformation done by: @@ expression E; @@ - prandom_u32_max + get_random_u32_below (E) Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Reviewed-by: SeongJae Park <sj@kernel.org> # for damon Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-29LoongArch: Remove unused kernel stack paddingJinyang He3-5/+4
The current LoongArch kernel stack is padded as if obeying the MIPS o32 calling convention (32 bytes), signifying the port's MIPS lineage but no longer making sense. Remove the padding for clarity. Reviewed-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-17Merge tag 'random-6.1-rc1-for-linus' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull more random number generator updates from Jason Donenfeld: "This time with some large scale treewide cleanups. The intent of this pull is to clean up the way callers fetch random integers. The current rules for doing this right are: - If you want a secure or an insecure random u64, use get_random_u64() - If you want a secure or an insecure random u32, use get_random_u32() The old function prandom_u32() has been deprecated for a while now and is just a wrapper around get_random_u32(). Same for get_random_int(). - If you want a secure or an insecure random u16, use get_random_u16() - If you want a secure or an insecure random u8, use get_random_u8() - If you want secure or insecure random bytes, use get_random_bytes(). The old function prandom_bytes() has been deprecated for a while now and has long been a wrapper around get_random_bytes() - If you want a non-uniform random u32, u16, or u8 bounded by a certain open interval maximum, use prandom_u32_max() I say "non-uniform", because it doesn't do any rejection sampling or divisions. Hence, it stays within the prandom_*() namespace, not the get_random_*() namespace. I'm currently investigating a "uniform" function for 6.2. We'll see what comes of that. By applying these rules uniformly, we get several benefits: - By using prandom_u32_max() with an upper-bound that the compiler can prove at compile-time is ≤65536 or ≤256, internally get_random_u16() or get_random_u8() is used, which wastes fewer batched random bytes, and hence has higher throughput. - By using prandom_u32_max() instead of %, when the upper-bound is not a constant, division is still avoided, because prandom_u32_max() uses a faster multiplication-based trick instead. - By using get_random_u16() or get_random_u8() in cases where the return value is intended to indeed be a u16 or a u8, we waste fewer batched random bytes, and hence have higher throughput. This series was originally done by hand while I was on an airplane without Internet. Later, Kees and I worked on retroactively figuring out what could be done with Coccinelle and what had to be done manually, and then we split things up based on that. So while this touches a lot of files, the actual amount of code that's hand fiddled is comfortably small" * tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: prandom: remove unused functions treewide: use get_random_bytes() when possible treewide: use get_random_u32() when possible treewide: use get_random_{u8,u16}() when possible, part 2 treewide: use get_random_{u8,u16}() when possible, part 1 treewide: use prandom_u32_max() when possible, part 2 treewide: use prandom_u32_max() when possible, part 1
2022-10-12Merge tag 'loongarch-6.1' of ↵Linus Torvalds18-111/+1754
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Use EXPLICIT_RELOCS (ABIv2.0) - Use generic BUG() handler - Refactor TLB/Cache operations - Add qspinlock support - Add perf events support - Add kexec/kdump support - Add BPF JIT support - Add ACPI-based laptop driver - Update the default config file * tag 'loongarch-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (25 commits) LoongArch: Update Loongson-3 default config file LoongArch: Add ACPI-based generic laptop driver LoongArch: Add BPF JIT support LoongArch: Add some instruction opcodes and formats LoongArch: Move {signed,unsigned}_imm_check() to inst.h LoongArch: Add kdump support LoongArch: Add kexec support LoongArch: Use generic BUG() handler LoongArch: Add SysRq-x (TLB Dump) support LoongArch: Add perf events support LoongArch: Add qspinlock support LoongArch: Use TLB for ioremap() LoongArch: Support access filter to /dev/mem interface LoongArch: Refactor cache probe and flush methods LoongArch: mm: Refactor TLB exception handlers LoongArch: Support R_LARCH_GOT_PC_{LO12,HI20} in modules LoongArch: Support PC-relative relocations in modules LoongArch: Define ELF relocation types added in ABIv2.0 LoongArch: Adjust symbol addressing for AS_HAS_EXPLICIT_RELOCS LoongArch: Add Kconfig option AS_HAS_EXPLICIT_RELOCS ...
2022-10-12LoongArch: Move {signed,unsigned}_imm_check() to inst.hTiezhu Yang1-10/+0
{signed,unsigned}_imm_check() will also be used in the bpf jit, so move them from module.c to inst.h, this is preparation for later patches. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Add kdump supportYouling Tang7-8/+201
This patch adds support for kdump. In kdump case the normal kernel will reserve a region for the crash kernel and jump there on panic. Arch-specific functions are added to allow for implementing a crash dump file interface, /proc/vmcore, which can be viewed as a ELF file. A user-space tool, such as kexec-tools, is responsible for allocating a separate region for the core's ELF header within the crash kdump kernel memory and filling it in when executing kexec_load(). Then, its location will be advertised to the crash dump kernel via a command line argument "elfcorehdr=", and the crash dump kernel will preserve this region for later use with arch_reserve_vmcore() at boot time. At the same time, the crash kdump kernel is also limited within the "crashkernel" area via a command line argument "mem=", so as not to destroy the original kernel dump data. In the crash dump kernel environment, /proc/vmcore is used to access the primary kernel's memory with copy_oldmem_page(). I tested kdump on LoongArch machines (Loongson-3A5000) and it works as expected (suggested crashkernel parameter is "crashkernel=512M@2560M"), you may test it by triggering a crash through /proc/sysrq-trigger: $ sudo kexec -p /boot/vmlinux-kdump --reuse-cmdline --append="nr_cpus=1" # echo c > /proc/sysrq-trigger Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Add kexec supportYouling Tang4-1/+329
Add three new files, kexec.h, machine_kexec.c and relocate_kernel.S to the LoongArch architecture, so as to add support for the kexec re-boot mechanism (CONFIG_KEXEC) on LoongArch platforms. Kexec supports loading vmlinux.elf in ELF format and vmlinux.efi in PE format. I tested kexec on LoongArch machines (Loongson-3A5000) and it works as expected: $ sudo kexec -l /boot/vmlinux.efi --reuse-cmdline $ sudo kexec -e Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Use generic BUG() handlerYouling Tang2-2/+28
Inspired by commit 9fb7410f955("arm64/BUG: Use BRK instruction for generic BUG traps"), do similar for LoongArch to use generic BUG() handler. This patch uses the BREAK software breakpoint instruction to generate a trap instead, similarly to most other arches, with the generic BUG code generating the dmesg boilerplate. This allows bug metadata to be moved to a separate table and reduces the amount of inline code at BUG() and WARN() sites. This also avoids clobbering any registers before they can be dumped. To mitigate the size of the bug table further, this patch makes use of the existing infrastructure for encoding addresses within the bug table as 32-bit relative pointers instead of absolute pointers. (Note: this limits the max kernel size to 2GB.) Before patch: [ 3018.338013] lkdtm: Performing direct entry BUG [ 3018.342445] Kernel bug detected[#5]: [ 3018.345992] CPU: 2 PID: 865 Comm: cat Tainted: G D 6.0.0-rc6+ #35 After patch: [ 125.585985] lkdtm: Performing direct entry BUG [ 125.590433] ------------[ cut here ]------------ [ 125.595020] kernel BUG at drivers/misc/lkdtm/bugs.c:78! [ 125.600211] Oops - BUG[#1]: [ 125.602980] CPU: 3 PID: 410 Comm: cat Not tainted 6.0.0-rc6+ #36 Out-of-line file/line data information obtained compared to before. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Add SysRq-x (TLB Dump) supportHuacai Chen2-0/+67
Add SysRq-x (TLB Dump) support for LoongArch, which is useful for debugging. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Add perf events supportHuacai Chen3-0/+942
The perf events infrastructure of LoongArch is very similar to old MIPS- based Loongson, so most of the codes are derived from MIPS. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Use TLB for ioremap()Huacai Chen1-1/+1
We can support more cache attributes (e.g., CC, SUC and WUC) and page protection when we use TLB for ioremap(). The implementation is based on GENERIC_IOREMAP. The existing simple ioremap() implementation has better performance so we keep it and introduce ARCH_IOREMAP to control the selection. We move pagetable_init() earlier to make early ioremap() works, and we modify the PCI ecam mapping because the TLB-based version of ioremap() will actually take the size into account. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Refactor cache probe and flush methodsHuacai Chen2-74/+27
Current cache probe and flush methods have some drawbacks: 1, Assume there are 3 cache levels and only 3 levels; 2, Assume L1 = I + D, L2 = V, L3 = S, V is exclusive, S is inclusive. However, the fact is I + D, I + D + V, I + D + S and I + D + V + S are all valid. So, refactor the cache probe and flush methods to adapt more types of cache hierarchy. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Support R_LARCH_GOT_PC_{LO12,HI20} in modulesXi Ruoyao2-5/+73
GCC >= 13 and GNU assembler >= 2.40 use these relocations to address external symbols, so we need to add them. Let the module loader emit GOT entries for data symbols so we would be able to handle GOT relocations. The GOT entry is just the data's symbol address. In module.lds, emit a stub .got section for a section header entry. The actual content of the section entry will be filled at runtime by module_ frob_arch_sections(). Tested-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Support PC-relative relocations in modulesXi Ruoyao2-1/+75
Binutils >= 2.40 uses R_LARCH_B26 instead of R_LARCH_SOP_PUSH_PLT_PCREL, and R_LARCH_PCALA* instead of R_LARCH_SOP_PUSH_PCREL. Handle R_LARCH_B26 and R_LARCH_PCALA* in the module loader. For R_LARCH_ B26, also create a PLT entry as needed. Tested-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Define ELF relocation types added in ABIv2.0Xi Ruoyao1-1/+1
These relocation types are used by GNU binutils >= 2.40 and GCC >= 13. Add their definitions so we will be able to use them in later patches. Link: https://github.com/loongson/LoongArch-Documentation/pull/57 Tested-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Adjust symbol addressing for AS_HAS_EXPLICIT_RELOCSXi Ruoyao2-6/+10
If explicit relocation hints are used by the toolchain, -Wa,-mla-* options will be useless for the C code. So only use them for the !CONFIG_AS_HAS_EXPLICIT_RELOCS case. Replace "la" with "la.pcrel" in head.S to keep the semantic consistent with new and old toolchains for the low level startup code. For per-CPU variables, the "address" of the symbol is actually an offset from $r21. The value is near the loading address of main kernel image, but far from the loading address of modules. So we use model("extreme") attibute to tell the compiler that a PC-relative addressing with 32-bit offset is not sufficient for local per-CPU variables. The behavior with different assemblers and compilers are summarized in the following table: AS has CC has explicit relocs explicit relocs * Behavior ============================================================== No No Use la.* macros. No change from Linux 6.0. -------------------------------------------------------------- No Yes Disable explicit relocs. No change from Linux 6.0. -------------------------------------------------------------- Yes No Not supported. -------------------------------------------------------------- Yes Yes Enable explicit relocs. No -Wa,-mla* options used. ============================================================== *: We assume CC must have model attribute if it has explicit relocs. Both features are added in GCC 13 development cycle, so any GCC release >= 13 should be OK. Using early GCC 13 development snapshots may produce modules with unsupported relocations. Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=f09482a Link: https://gcc.gnu.org/r13-1834 Link: https://gcc.gnu.org/r13-2199 Tested-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Do not create sysfs control file for io master CPUsTiezhu Yang2-6/+2
Now io master CPUs are not hotpluggable on LoongArch, but in the current code only /sys/devices/system/cpu/cpu0/online is not created. Let us set the hotpluggable field of all the io master CPUs as 0, then prevent to create sysfs control file for all the io master CPUs which confuses some user space tools. This is similar with commit 9cce844abf07 ("MIPS: CPU#0 is not hotpluggable"). Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Fix cpu name after CPU-hotplugJianmin Lv1-1/+3
Don't overwrite the SMBIOS-provided CPU name on coming back from CPU- hotplug (including S3/S4) if it is already initialized. Reviewed-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12treewide: use prandom_u32_max() when possible, part 1Jason A. Donenfeld2-2/+2
Rather than incurring a division or requesting too many random bytes for the given range, use the prandom_u32_max() function, which only takes the minimum required bytes from the RNG and avoids divisions. This was done mechanically with this coccinelle script: @basic@ expression E; type T; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u64; @@ ( - ((T)get_random_u32() % (E)) + prandom_u32_max(E) | - ((T)get_random_u32() & ((E) - 1)) + prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2) | - ((u64)(E) * get_random_u32() >> 32) + prandom_u32_max(E) | - ((T)get_random_u32() & ~PAGE_MASK) + prandom_u32_max(PAGE_SIZE) ) @multi_line@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; identifier RAND; expression E; @@ - RAND = get_random_u32(); ... when != RAND - RAND %= (E); + RAND = prandom_u32_max(E); // Find a potential literal @literal_mask@ expression LITERAL; type T; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; position p; @@ ((T)get_random_u32()@p & (LITERAL)) // Add one to the literal. @script:python add_one@ literal << literal_mask.LITERAL; RESULT; @@ value = None if literal.startswith('0x'): value = int(literal, 16) elif literal[0] in '123456789': value = int(literal, 10) if value is None: print("I don't know how to handle %s" % (literal)) cocci.include_match(False) elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1: print("Skipping 0x%x for cleanup elsewhere" % (value)) cocci.include_match(False) elif value & (value + 1) != 0: print("Skipping 0x%x because it's not a power of two minus one" % (value)) cocci.include_match(False) elif literal.startswith('0x'): coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1)) else: coccinelle.RESULT = cocci.make_expr("%d" % (value + 1)) // Replace the literal mask with the calculated result. @plus_one@ expression literal_mask.LITERAL; position literal_mask.p; expression add_one.RESULT; identifier FUNC; @@ - (FUNC()@p & (LITERAL)) + prandom_u32_max(RESULT) @collapse_ret@ type T; identifier VAR; expression E; @@ { - T VAR; - VAR = (E); - return VAR; + return E; } @drop_var@ type T; identifier VAR; @@ { - T VAR; ... when != VAR } Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: KP Singh <kpsingh@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390 Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-10Merge tag 'bitmap-6.1-rc1' of https://github.com/norov/linuxLinus Torvalds1-1/+1
Pull bitmap updates from Yury Norov: - Fix unsigned comparison to -1 in CPUMAP_FILE_MAX_BYTES (Phil Auld) - cleanup nr_cpu_ids vs nr_cpumask_bits mess (me) This series cleans that mess and adds new config FORCE_NR_CPUS that allows to optimize cpumask subsystem if the number of CPUs is known at compile-time. - optimize find_bit() functions (me) Reworks find_bit() functions based on new FIND_{FIRST,NEXT}_BIT() macros. - add find_nth_bit() (me) Adds find_nth_bit(), which is ~70 times faster than bitcounting with for_each() loop: for_each_set_bit(bit, mask, size) if (n-- == 0) return bit; Also adds bitmap_weight_and() to let people replace this pattern: tmp = bitmap_alloc(nbits); bitmap_and(tmp, map1, map2, nbits); weight = bitmap_weight(tmp, nbits); bitmap_free(tmp); with a single bitmap_weight_and() call. - repair cpumask_check() (me) After switching cpumask to use nr_cpu_ids, cpumask_check() started generating many false-positive warnings. This series fixes it. - Add for_each_cpu_andnot() and for_each_cpu_andnot() (Valentin Schneider) Extends the API with one more function and applies it in sched/core. * tag 'bitmap-6.1-rc1' of https://github.com/norov/linux: (28 commits) sched/core: Merge cpumask_andnot()+for_each_cpu() into for_each_cpu_andnot() lib/test_cpumask: Add for_each_cpu_and(not) tests cpumask: Introduce for_each_cpu_andnot() lib/find_bit: Introduce find_next_andnot_bit() cpumask: fix checking valid cpu range lib/bitmap: add tests for for_each() loops lib/find: optimize for_each() macros lib/bitmap: introduce for_each_set_bit_wrap() macro lib/find_bit: add find_next{,_and}_bit_wrap cpumask: switch for_each_cpu{,_not} to use for_each_bit() net: fix cpu_max_bits_warn() usage in netif_attrmask_next{,_and} cpumask: add cpumask_nth_{,and,andnot} lib/bitmap: remove bitmap_ord_to_pos lib/bitmap: add tests for find_nth_bit() lib: add find_nth{,_and,_andnot}_bit() lib/bitmap: add bitmap_weight_and() lib/bitmap: don't call __bitmap_weight() in kernel code tools: sync find_bit() implementation lib/find_bit: optimize find_next_bit() functions lib/find_bit: create find_first_zero_bit_le() ...
2022-10-10Merge tag 'kbuild-v6.1' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Remove potentially incomplete targets when Kbuid is interrupted by SIGINT etc in case GNU Make may miss to do that when stderr is piped to another program. - Rewrite the single target build so it works more correctly. - Fix rpm-pkg builds with V=1. - List top-level subdirectories in ./Kbuild. - Ignore auto-generated __kstrtab_* and __kstrtabns_* symbols in kallsyms. - Avoid two different modules in lib/zstd/ having shared code, which potentially causes building the common code as build-in and modular back-and-forth. - Unify two modpost invocations to optimize the build process. - Remove head-y syntax in favor of linker scripts for placing particular sections in the head of vmlinux. - Bump the minimal GNU Make version to 3.82. - Clean up misc Makefiles and scripts. * tag 'kbuild-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (41 commits) docs: bump minimal GNU Make version to 3.82 ia64: simplify esi object addition in Makefile Revert "kbuild: Check if linker supports the -X option" kbuild: rebuild .vmlinux.export.o when its prerequisite is updated kbuild: move modules.builtin(.modinfo) rules to Makefile.vmlinux_o zstd: Fixing mixed module-builtin objects kallsyms: ignore __kstrtab_* and __kstrtabns_* symbols kallsyms: take the input file instead of reading stdin kallsyms: drop duplicated ignore patterns from kallsyms.c kbuild: reuse mksysmap output for kallsyms mksysmap: update comment about __crc_* kbuild: remove head-y syntax kbuild: use obj-y instead extra-y for objects placed at the head kbuild: hide error checker logs for V=1 builds kbuild: re-run modpost when it is updated kbuild: unify two modpost invocations kbuild: move vmlinux.o rule to the top Makefile kbuild: move .vmlinux.objs rule to Makefile.modpost kbuild: list sub-directories in ./Kbuild Makefile.compiler: replace cc-ifversion with compiler-specific macros ...
2022-10-09Merge tag 'efi-next-for-v6.1' of ↵Linus Torvalds7-22/+188
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: "A bit more going on than usual in the EFI subsystem. The main driver for this has been the introduction of the LoonArch architecture last cycle, which inspired some cleanup and refactoring of the EFI code. Another driver for EFI changes this cycle and in the future is confidential compute. The LoongArch architecture does not use either struct bootparams or DT natively [yet], and so passing information between the EFI stub and the core kernel using either of those is undesirable. And in general, overloading DT has been a source of issues on arm64, so using DT for this on new architectures is a to avoid for the time being (even if we might converge on something DT based for non-x86 architectures in the future). For this reason, in addition to the patch that enables EFI boot for LoongArch, there are a number of refactoring patches applied on top of which separate the DT bits from the generic EFI stub bits. These changes are on a separate topich branch that has been shared with the LoongArch maintainers, who will include it in their pull request as well. This is not ideal, but the best way to manage the conflicts without stalling LoongArch for another cycle. Another development inspired by LoongArch is the newly added support for EFI based decompressors. Instead of adding yet another arch-specific incarnation of this pattern for LoongArch, we are introducing an EFI app based on the existing EFI libstub infrastructure that encapulates the decompression code we use on other architectures, but in a way that is fully generic. This has been developed and tested in collaboration with distro and systemd folks, who are eager to start using this for systemd-boot and also for arm64 secure boot on Fedora. Note that the EFI zimage files this introduces can also be decompressed by non-EFI bootloaders if needed, as the image header describes the location of the payload inside the image, and the type of compression that was used. (Note that Fedora's arm64 GRUB is buggy [0] so you'll need a recent version or switch to systemd-boot in order to use this.) Finally, we are adding TPM measurement of the kernel command line provided by EFI. There is an oversight in the TCG spec which results in a blind spot for command line arguments passed to loaded images, which means that either the loader or the stub needs to take the measurement. Given the combinatorial explosion I am anticipating when it comes to firmware/bootloader stacks and firmware based attestation protocols (SEV-SNP, TDX, DICE, DRTM), it is good to set a baseline now when it comes to EFI measured boot, which is that the kernel measures the initrd and command line. Intermediate loaders can measure additional assets if needed, but with the baseline in place, we can deploy measured boot in a meaningful way even if you boot into Linux straight from the EFI firmware. Summary: - implement EFI boot support for LoongArch - implement generic EFI compressed boot support for arm64, RISC-V and LoongArch, none of which implement a decompressor today - measure the kernel command line into the TPM if measured boot is in effect - refactor the EFI stub code in order to isolate DT dependencies for architectures other than x86 - avoid calling SetVirtualAddressMap() on arm64 if the configured size of the VA space guarantees that doing so is unnecessary - move some ARM specific code out of the generic EFI source files - unmap kernel code from the x86 mixed mode 1:1 page tables" * tag 'efi-next-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (24 commits) efi/arm64: libstub: avoid SetVirtualAddressMap() when possible efi: zboot: create MemoryMapped() device path for the parent if needed efi: libstub: fix up the last remaining open coded boot service call efi/arm: libstub: move ARM specific code out of generic routines efi/libstub: measure EFI LoadOptions efi/libstub: refactor the initrd measuring functions efi/loongarch: libstub: remove dependency on flattened DT efi: libstub: install boot-time memory map as config table efi: libstub: remove DT dependency from generic stub efi: libstub: unify initrd loading between architectures efi: libstub: remove pointless goto kludge efi: libstub: simplify efi_get_memory_map() and struct efi_boot_memmap efi: libstub: avoid efi_get_memory_map() for allocating the virt map efi: libstub: drop pointless get_memory_map() call efi: libstub: fix type confusion for load_options_size arm64: efi: enable generic EFI compressed boot loongarch: efi: enable generic EFI compressed boot riscv: efi: enable generic EFI compressed boot efi/libstub: implement generic EFI zboot efi/libstub: move efi_system_table global var into separate object ...
2022-10-03Merge tag 'acpi-6.1-rc1' of ↵Linus Torvalds2-32/+22
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "ACPI and PNP updates for 6.1-rc1. These rearrange the ACPI device object initialization code (to get rid of a redundant parent pointer from struct acpi_device among other things), unify the _UID handling, drop support for some _OSI strings that should not be necessary any more, add new IDs to support more hardware and some more quirks, fix a few issues and clean up code all over. Specifics: - Reimplement acpi_get_pci_dev() using the list of physical devices associated with the given ACPI device object (Rafael Wysocki) - Rename ACPI device object reference counting functions (Rafael Wysocki) - Rearrange ACPI device object initialization code (Rafael Wysocki) - Drop parent field from struct acpi_device (Rafael Wysocki) - Extend the the int3472-tps68470 driver to support multiple consumers of a single TPS68470 along with the requisite framework-level support (Daniel Scally) - Filter out non-memory resources in is_memory(), add a helper function to find all memory type resources of an ACPI device object and use that function in 3 places (Heikki Krogerus) - Add IRQ override quirks for Asus Vivobook K3402ZA/K3502ZA and ASUS model S5402ZA (Tamim Khan, Kellen Renshaw) - Fix acpi_dev_state_d0() kerneldoc (Sakari Ailus) - Fix up suspend-to-idle support on ASUS Rembrandt laptops (Mario Limonciello) - Clean up ACPI platform devices support code (Andy Shevchenko, John Garry) - Clean up ACPI bus management code (Andy Shevchenko, ye xingchen) - Add support for multiple DMA windows with different offsets to the ACPI device enumeration code and use it on LoongArch (Jianmin Lv) - Clean up the ACPI LPSS (Intel SoC) driver (Andy Shevchenko) - Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable (Mario Limonciello) - Drop unused dev_fmt() and redundant 'HMAT' prefix from the HMAT parsing code (Liu Shixin) - Make ACPI FPDT parsing code avoid calling acpi_os_map_memory() on invalid physical addresses (Hans de Goede) - Silence missing-declarations warning related to Apple device properties management (Lukas Wunner) - Disable frequency invariance in the CPPC library if registers used by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton) - Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan) - Fix Tx acknowledge in the PCC address space handler (Huisong Li) - Use wait_for_completion_timeout() for PCC mailbox operations (Huisong Li) - Release resources on PCC address space setup failure path (Rafael Mendonca) - Remove unneeded result variables from APEI code (ye xingchen) - Print total number of records found during BERT log parsing (Dmitry Monakhov) - Drop support for 3 _OSI strings that should not be necessary any more and update documentation on custom _OSI strings so that adding new ones is not encouraged any more (Mario Limonciello) - Drop unneeded result variable from ec_write() (ye xingchen) - Remove the leftover struct acpi_ac_bl from the ACPI AC driver (Hanjun Guo) - Reorder symbols to get rid of a few forward declarations in the ACPI fan driver (Uwe Kleine-König) - Add Toshiba Satellite/Portege Z830 ACPI backlight quirk (Arvid Norlander) - Add ARM DMA-330 controller to the supported list in the ACPI AMBA driver (Vijayenthiran Subramaniam) - Drop references to non-functional 01.org/linux-acpi web site from MAINTAINERS and Kconfig help texts (Rafael Wysocki) - Replace strlcpy() with unused retval with strscpy() in the ACPI support code (Wolfram Sang) - Do not initialize ret in main() in the pfrut utility (Shi junming) - Drop useless ACPI DSDT override documentation (Rafael Wysocki) - Fix a few typos and wording mistakes in the ACPI device enumeration documentation (Jean Delvare) - Introduce acpi_dev_uid_to_integer() to convert a _UID string into an integer value (Andy Shevchenko) - Use acpi_dev_uid_to_integer() in several places to unify _UID handling (Andy Shevchenko) - Drop unused pnpid32_to_pnpid() declaration from PNP code (Gaosheng Cui)" * tag 'acpi-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (79 commits) ACPI: LPSS: Deduplicate skipping device in acpi_lpss_create_device() ACPI: LPSS: Replace loop with first entry retrieval ACPI: x86: s2idle: Add another ID to s2idle_dmi_table ACPI: x86: s2idle: Fix a NULL pointer dereference MAINTAINERS: Drop records pointing to 01.org/linux-acpi ACPI: Kconfig: Drop link to https://01.org/linux-acpi ACPI: docs: Drop useless DSDT override documentation ACPI: DPTF: Drop stale link from Kconfig help ACPI: x86: s2idle: Add a quirk for ASUSTeK COMPUTER INC. ROG Flow X13 ACPI: x86: s2idle: Add a quirk for Lenovo Slim 7 Pro 14ARH7 ACPI: x86: s2idle: Add a quirk for ASUS ROG Zephyrus G14 ACPI: x86: s2idle: Add a quirk for ASUS TUF Gaming A17 FA707RE ACPI: x86: s2idle: Add module parameter to prefer Microsoft GUID ACPI: x86: s2idle: If a new AMD _HID is missing assume Rembrandt ACPI: x86: s2idle: Move _HID handling for AMD systems into structures platform/x86: int3472: Add board data for Surface Go2 IR camera platform/x86: int3472: Support multiple gpio lookups in board data platform/x86: int3472: Support multiple clock consumers ACPI: bus: Add iterator for dependent devices ACPI: scan: Add acpi_dev_get_next_consumer_dev() ...
2022-10-03Merge tag 'efi-next-for-v6.1' into loongarch-nextHuacai Chen7-22/+188
LoongArch architecture changes for 6.1 depend on the efi changes to work, so merge them to create a base.
2022-10-02kbuild: use obj-y instead extra-y for objects placed at the headMasahiro Yamada1-2/+2
The objects placed at the head of vmlinux need special treatments: - arch/$(SRCARCH)/Makefile adds them to head-y in order to place them before other archives in the linker command line. - arch/$(SRCARCH)/kernel/Makefile adds them to extra-y instead of obj-y to avoid them going into built-in.a. This commit gets rid of the latter. Create vmlinux.a to collect all the objects that are unconditionally linked to vmlinux. The objects listed in head-y are moved to the head of vmlinux.a by using 'ar m'. With this, arch/$(SRCARCH)/kernel/Makefile can consistently use obj-y for builtin objects. There is no *.o that is directly linked to vmlinux. Drop unneeded code in scripts/clang-tools/gen_compile_commands.py. $(AR) mPi needs 'T' to workaround the llvm-ar bug. The fix was suggested by Nathan Chancellor [1]. [1]: https://lore.kernel.org/llvm/YyjjT5gQ2hGMH0ni@dev-arch.thelio-3990X/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-09-29LoongArch: Fix and cleanup csr_era handling in do_ri()Huacai Chen1-13/+2
We don't emulate reserved instructions and just send a signal to the current process now. So we don't need to call compute_return_era() to add 4 (point to the next instruction) to csr_era in pt_regs. RA/ERA's backup/restore is cleaned up as well. Signed-off-by: Jun Yi <yijun@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-09-29LoongArch: Align the address of kernel_entry to 4KBHuacai Chen1-0/+2
Align the address of kernel_entry to 4KB, to avoid early tlb miss exception in case the entry code crosses page boundary. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-09-27Merge tag 'efi-loongarch-for-v6.1-2' into HEADArd Biesheuvel4-12/+37
Second shared stable tag between EFI and LoongArch trees This is necessary because the EFI libstub refactoring patches are mostly directed at enabling LoongArch to wire up generic EFI boot support without being forced to consume DT properties that conflict with information that EFI also provides, e.g., memory map and reservations, etc.
2022-09-27efi/loongarch: libstub: remove dependency on flattened DTArd Biesheuvel4-12/+37
LoongArch does not use FDT or DT natively [yet], and the only reason it currently uses it is so that it can reuse the existing EFI stub code. Overloading the DT with data passed between the EFI stub and the core kernel has been a source of problems: there is the overlap between information provided by EFI which DT can also provide (initrd base/size, command line, memory descriptions), requiring us to reason about which is which and what to prioritize. It has also resulted in ABI leaks, i.e., internal ABI being promoted to external ABI inadvertently because the bootloader can set the EFI stub's DT properties as well (e.g., "kaslr-seed"). This has become especially problematic with boot environments that want to pretend that EFI boot is being done (to access ACPI and SMBIOS tables, for instance) but have no ability to execute the EFI stub, and so the environment that the EFI stub creates is emulated [poorly, in some cases]. Another downside of treating DT like this is that the DT binary that the kernel receives is different from the one created by the firmware, which is undesirable in the context of secure and measured boot. Given that LoongArch support in Linux is brand new, we can avoid these pitfalls, and treat the DT strictly as a hardware description, and use a separate handover method between the EFI stub and the kernel. Now that initrd loading and passing the EFI memory map have been refactored into pure EFI routines that use EFI configuration tables, the only thing we need to pass directly is the kernel command line (even if we could pass this via a config table as well, it is used extremely early, so passing it directly is preferred in this case.) Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Huacai Chen <chenhuacai@loongson.cn>
2022-09-24LoongArch: Use acpi_arch_dma_setup() and remove ARCH_HAS_PHYS_TO_DMAJianmin Lv2-32/+22
Use _DMA defined in ACPI spec for translation between DMA address and CPU address, and implement acpi_arch_dma_setup for initializing dev->dma_range_map, where acpi_dma_get_range is called for parsing _DMA. e.g. If we have two dma ranges: cpu address dma address size offset 0x200080000000 0x2080000000 0x400000000 0x1fe000000000 0x400080000000 0x4080000000 0x400000000 0x3fc000000000 _DMA for pci devices should be declared in host bridge as flowing: Name (_DMA, ResourceTemplate() { QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite, 0x0, 0x4080000000, 0x447fffffff, 0x3fc000000000, 0x400000000, , , ) QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite, 0x0, 0x2080000000, 0x247fffffff, 0x1fe000000000, 0x400000000, , , ) }) Acked-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-09-20smp: add set_nr_cpu_ids()Yury Norov1-1/+1
In preparation to support compile-time nr_cpu_ids, add a setter for the variable. This is a no-op for all arches. Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-09-17efi/libstub: use EFI provided memcpy/memset routinesArd Biesheuvel1-3/+0
The stub is used in different execution environments, but on arm64, RISC-V and LoongArch, we still use the core kernel's implementation of memcpy and memset, as they are just a branch instruction away, and can generally be reused even from code such as the EFI stub that runs in a completely different address space. KAsan complicates this slightly, resulting in the need for some hacks to expose the uninstrumented, __ prefixed versions as the normal ones, as the latter are instrumented to include the KAsan checks, which only work in the core kernel. Unfortunately, #define'ing memcpy to __memcpy when building C code does not guarantee that no explicit memcpy() calls will be emitted. And with the upcoming zboot support, which consists of a separate binary which therefore needs its own implementation of memcpy/memset anyway, it's better to provide one explicitly instead of linking to the existing one. Given that EFI exposes implementations of memmove() and memset() via the boot services table, let's wire those up in the appropriate way, and drop the references to the core kernel ones. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-09-06efi/loongarch: Add efistub booting supportHuacai Chen6-10/+154
This patch adds efistub booting support, which is the standard UEFI boot protocol for LoongArch to use. We use generic efistub, which means we can pass boot information (i.e., system table, memory map, kernel command line, initrd) via a light FDT and drop a lot of non-standard code. We use a flat mapping to map the efi runtime in the kernel's address space. In efi, VA = PA; in kernel, VA = PA + PAGE_OFFSET. As a result, flat mapping is not identity mapping, SetVirtualAddressMap() is still needed for the efi runtime. Tested-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> [ardb: change fpic to fpie as suggested by Xi Ruoyao] Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-09-03LoongArch: Fix section mismatch due to acpi_os_ioremap()Huacai Chen1-1/+1
Now acpi_os_ioremap() is marked with __init because it calls memblock_ is_memory() which is also marked with __init in the !ARCH_KEEP_MEMBLOCK case. However, acpi_os_ioremap() is called by ordinary functions such as acpi_os_{read, write}_memory() and causes section mismatch warnings: WARNING: modpost: vmlinux.o: section mismatch in reference: acpi_os_read_memory (section: .text) -> acpi_os_ioremap (section: .init.text) WARNING: modpost: vmlinux.o: section mismatch in reference: acpi_os_write_memory (section: .text) -> acpi_os_ioremap (section: .init.text) Fix these warnings by selecting ARCH_KEEP_MEMBLOCK unconditionally and removing the __init modifier of acpi_os_ioremap(). This can also give a chance to track "memory" and "reserved" memblocks after early boot. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-09-03LoongArch: Adjust arch_do_signal_or_restart() to adapt generic entryHuacai Chen1-2/+2
Commit 8ba62d37949e248c69 ("task_work: Call tracehook_notify_signal from get_signal on all architectures") adjust arch_do_signal_or_restart() for all architectures. LoongArch hasn't been upstream yet at that time and can be still built successfully without adjustment because this function has a weak version with the correct prototype. It is obviously that we should convert LoongArch to use new API, otherwise some signal handlings will be lost. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-09-03LoongArch: Avoid orphan input sectionsArd Biesheuvel1-0/+2
Ensure that all input sections are listed explicitly in the linker script, and issue a warning otherwise. This ensures that the binary image matches the PE/COFF and other image metadata exactly, which is important for things like code signing. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-08-25LoongArch: Cleanup reset routines with new APIHuacai Chen1-48/+21
Cleanup reset routines by using new do_kernel_power_off() instead of old pm_power_off(), and then simplify the whole file (reset.c) organization by inlining some functions. This cleanup also fix a poweroff error if EFI runtime is disabled. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-08-12Merge tag 'loongarch-5.20' of ↵Linus Torvalds14-36/+498
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Optimise getcpu() with vDSO - PCI enablement on top of pci & irqchip changes - Stack unwinder and stack trace support - Some bug fixes and build error fixes - Update the default config file * tag 'loongarch-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: docs/zh_CN/LoongArch: Add I14 description docs/LoongArch: Add I14 description LoongArch: Update Loongson-3 default config file LoongArch: Add USER_STACKTRACE support LoongArch: Add STACKTRACE support LoongArch: Add prologue unwinder support LoongArch: Add guess unwinder support LoongArch: Add vDSO syscall __vdso_getcpu() LoongArch: Add PCI controller support LoongArch: Parse MADT to get multi-processor information LoongArch: Jump to the link address before enable PG LoongArch: Requires __force attributes for any casts LoongArch: Fix unsigned comparison with less than zero LoongArch: Adjust arch/loongarch/Kconfig LoongArch: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK
2022-08-12LoongArch: Add USER_STACKTRACE supportQing Zhang1-0/+41
To get the best stacktrace output, you can compile your userspace programs with frame pointers (at least glibc + the app you are tracing). 1, export "CC = gcc -fno-omit-frame-pointer"; 2, compile your programs with "CC"; 3, use uprobe to get stacktrace output. ... echo 'p:malloc /usr/lib64/libc.so.6:0x0a4704 size=%r4:u64' > uprobe_events echo 'p:free /usr/lib64/libc.so.6:0x0a4d50 ptr=%r4:x64' >> uprobe_events echo 'comm == "demo"' > ./events/uprobes/malloc/filter echo 'comm == "demo"' > ./events/uprobes/free/filter echo 1 > ./options/userstacktrace echo 1 > ./options/sym-userobj ... Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-08-12LoongArch: Add STACKTRACE supportQing Zhang5-1/+70
1. Use common arch_stack_walk() infrastructure to avoid duplicated code and avoid taking care of the stack storage and filtering. 2. Add sched_ra (means sched return address) and sched_cfa (means sched call frame address) to thread_info, and store them in switch_to(). 3. Add __get_wchan() implementation. Now we can print the process stack and wait channel by cat /proc/*/stack and /proc/*/wchan. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-08-12LoongArch: Add prologue unwinder supportQing Zhang3-0/+180
It unwind the stack frame based on prologue code analyze. CONFIG_KALLSYMS is needed, at least the address and length of each function. Three stages when we do unwind, 1) unwind_start(), the prapare of unwinding, fill unwind_state. 2) unwind_done(), judge whether the unwind process is finished or not. 3) unwind_next_frame(), unwind the next frame. Dividing unwinder helps to add new unwinders in the future, e.g.: unwinder_frame, unwinder_orc, .etc. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-08-12LoongArch: Add guess unwinder supportQing Zhang4-11/+140
Name "guess unwinder" comes from x86, it scans the stack and reports every kernel text address it finds. Unwinders can be used by dump_stack() and other stacktrace functions. Three stages when we do unwind, 1) unwind_start(), the prapare of unwinding, fill unwind_state. 2) unwind_done(), judge whether the unwind process is finished or not. 3) unwind_next_frame(), unwind the next frame. Add get_stack_info() to get stack info. At present we have irq stack and task stack. The next_sp is the key info between two types of stacks. Dividing unwinder helps to add new unwinders in the future. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-08-12LoongArch: Add vDSO syscall __vdso_getcpu()Huacai Chen1-10/+15
We test 20 million times of getcpu(), the real syscall version take 25 seconds, while the vsyscall version take only 2.4 seconds. Signed-off-by: Rui Wang <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-08-12LoongArch: Parse MADT to get multi-processor informationHuacai Chen2-4/+39
Parse MADT to get multi-processor information, in order to fix the boot problem and cpu-hotplug problem for SMP platform. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>