Age | Commit message (Collapse) | Author | Files | Lines |
|
commit dce5ea1d0f45fa612f5760b88614a3f32bc75e3f upstream.
vm_map_base, empty_zero_page and invalid_pmd_table could be accessed
widely by some out-of-tree non-GPL but important file systems or drivers
(e.g. OpenZFS). Let's use EXPORT_SYMBOL() instead of EXPORT_SYMBOL_GPL()
to export them, so as to avoid build errors.
1, Details about vm_map_base:
This is a LoongArch-specific symbol and may be referenced through macros
PCI_IOBASE, VMALLOC_START and VMALLOC_END.
2, Details about empty_zero_page:
As it stands today, only 3 architectures export empty_zero_page as a GPL
symbol: IA64, LoongArch and MIPS. LoongArch gets the GPL export by
inheriting from MIPS, and the MIPS export was first introduced in commit
497d2adcbf50b ("[MIPS] Export empty_zero_page for sake of the ext4
module."). The IA64 export was similar: commit a7d57ecf4216e ("[IA64]
Export three symbols for module use") did so for kvm.
In both IA64 and MIPS, the export of empty_zero_page was done for
satisfying some in-kernel component built as module (kvm and ext4
respectively), and given its reasonably low-level nature, GPL is a
reasonable choice. But looking at the bigger picture it is evident most
other architectures do not regard it as GPL, so in effect the symbol
probably should not be treated as such, in favor of consistency.
3, Details about invalid_pmd_table:
Keep consistency with invalid_pte_table and make it be possible by some
modules.
Cc: stable@vger.kernel.org
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit df830336045db1246d3245d3737fee9939c5f731 upstream.
Not all LoongArch processors support CRC32 instructions. This feature
is indicated by CPUCFG1.CRC32 (Bit25) but it is wrongly defined in the
previous versions of the ISA manual (and so does in loongarch.h). The
CRC32 feature is set unconditionally now, so fix it.
BTW, expose the CRC32 feature in /proc/cpuinfo.
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit a6f6a95f25803500079513780d11a911ce551d76 ]
Just skip the opcode(BPF_ST | BPF_NOSPEC) in the BPF JIT instead of
failing to JIT the entire program, given LoongArch currently has no
couterpart of a speculation barrier instruction. To verify the issue,
use the ltp testcase as shown below.
Also, Wang says:
I can confirm there's currently no speculation barrier equivalent
on LonogArch. (Loongson says there are builtin mitigations for
Spectre-V1 and V2 on their chips, and AFAIK efforts to port the
exploits to mips/LoongArch have all failed a few years ago.)
Without this patch:
$ ./bpf_prog02
[...]
bpf_common.c:123: TBROK: Failed verification: ??? (524)
[...]
Summary:
passed 0
failed 0
broken 1
skipped 0
warnings 0
With this patch:
$ ./bpf_prog02
[...]
Summary:
passed 0
failed 0
broken 0
skipped 0
warnings 0
Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support")
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: WANG Xuerui <git@xen0n.name>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/bpf/20230328071335.2664966-1-guodongtai@kylinos.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bb7a78e343468873bf00b2b181fcfd3c02d8cb56 ]
Under CONFIG_DEBUG_ATOMIC_SLEEP=y and CONFIG_DEBUG_PREEMPT=y, we can see
the following messages on LoongArch, this is because using might_sleep()
in preemption disable context.
[ 0.001127] smp: Bringing up secondary CPUs ...
[ 0.001222] Booting CPU#1...
[ 0.001244] 64-bit Loongson Processor probed (LA464 Core)
[ 0.001247] CPU1 revision is: 0014c012 (Loongson-64bit)
[ 0.001250] FPU1 revision is: 00000000
[ 0.001252] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283
[ 0.001255] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
[ 0.001257] preempt_count: 1, expected: 0
[ 0.001258] RCU nest depth: 0, expected: 0
[ 0.001259] Preemption disabled at:
[ 0.001261] [<9000000000223800>] arch_dup_task_struct+0x20/0x110
[ 0.001272] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.2.0-rc7+ #43
[ 0.001275] Hardware name: Loongson Loongson-3A5000-7A1000-1w-A2101/Loongson-LS3A5000-7A1000-1w-A2101, BIOS vUDK2018-LoongArch-V4.0.05132-beta10 12/13/202
[ 0.001277] Stack : 0072617764726148 0000000000000000 9000000000222f1c 90000001001e0000
[ 0.001286] 90000001001e3be0 90000001001e3be8 0000000000000000 0000000000000000
[ 0.001292] 90000001001e3be8 0000000000000040 90000001001e3cb8 90000001001e3a50
[ 0.001297] 9000000001642000 90000001001e3be8 be694d10ce4139dd 9000000100174500
[ 0.001303] 0000000000000001 0000000000000001 00000000ffffe0a2 0000000000000020
[ 0.001309] 000000000000002f 9000000001354116 00000000056b0000 ffffffffffffffff
[ 0.001314] 0000000000000000 0000000000000000 90000000014f6e90 9000000001642000
[ 0.001320] 900000000022b69c 0000000000000001 0000000000000000 9000000001736a90
[ 0.001325] 9000000100038000 0000000000000000 9000000000222f34 0000000000000000
[ 0.001331] 00000000000000b0 0000000000000004 0000000000000000 0000000000070000
[ 0.001337] ...
[ 0.001339] Call Trace:
[ 0.001342] [<9000000000222f34>] show_stack+0x5c/0x180
[ 0.001346] [<90000000010bdd80>] dump_stack_lvl+0x60/0x88
[ 0.001352] [<9000000000266418>] __might_resched+0x180/0x1cc
[ 0.001356] [<90000000010c742c>] mutex_lock+0x20/0x64
[ 0.001359] [<90000000002a8ccc>] irq_find_matching_fwspec+0x48/0x124
[ 0.001364] [<90000000002259c4>] constant_clockevent_init+0x68/0x204
[ 0.001368] [<900000000022acf4>] start_secondary+0x40/0xa8
[ 0.001371] [<90000000010c0124>] smpboot_entry+0x60/0x64
Here are the complete call chains:
smpboot_entry()
start_secondary()
constant_clockevent_init()
get_timer_irq()
irq_find_matching_fwnode()
irq_find_matching_fwspec()
mutex_lock()
might_sleep()
__might_sleep()
__might_resched()
In order to avoid the above issue, we should break the call chains,
using timer_irq_installed variable as check condition to only call
get_timer_irq() once in constant_clockevent_init() is a simple and
proper way.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 64f50f6575721ef03d001e907455cbe3baa2a5b1 ]
This patch fixes the following issue of function calls in JIT, like:
[ 29.346981] multi-func JIT bug 105 != 103
The issus can be reproduced by running the "inline simple bpf_loop call"
verifier test.
This is because we are emiting 2-4 instructions for 64-bit immediate moves.
During the first pass of JIT, the placeholder address is zero, emiting two
instructions for it. In the extra pass, the function address is in XKVRANGE,
emiting four instructions for it. This change the instruction index in
JIT context. Let's always use 4 instructions for function address in JIT.
So that the instruction sequences don't change between the first pass and
the extra pass for function calls.
Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support")
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/bpf/20230214152633.2265699-1-hengqi.chen@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 429a9671f235c94fc4b5d6687308714b74adc820 ]
At unwind_start(), it is better to get its frame info here rather than
get them outside, even we don't have 'regs'. In this way we can simply
use unwind_{start, next_frame, done} outside.
Signed-off-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit d52fec86a465355b379e839fa372ead0334d62e6 upstream.
HWCAP_LOONGARCH_CPUCFG is missing in elf_hwcap, so add it for glibc's
later use.
Cc: stable@vger.kernel.org
Reported-by: Yinyu Cai <caiyinyu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
In virtual machine (guest mode), the tlbwr instruction can not write the
last entry of MTLB, so we need to make it non-present by invtlb and then
write it by tlbfill. This also simplify the whole logic.
Signed-off-by: Rui Wang <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Function smp_send_reschedule() is standard kernel API, which is defined
in header file include/linux/smp.h. However, on LoongArch it is defined
as an inline function, this is confusing and kernel modules can not use
this function.
Now we define smp_send_reschedule() as a general function, and add a
EXPORT_SYMBOL_GPL on this function, so that kernel modules can use it.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc hotfixes from Andrew Morton:
"15 hotfixes, 11 marked cc:stable.
Only three or four of the latter address post-6.0 issues, which is
hopefully a sign that things are converging"
* tag 'mm-hotfixes-stable-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
revert "kbuild: fix -Wimplicit-function-declaration in license_is_gpl_compatible"
Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled
drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frame
mm/khugepaged: invoke MMU notifiers in shmem/file collapse paths
mm/khugepaged: fix GUP-fast interaction by sending IPI
mm/khugepaged: take the right locks for page table retraction
mm: migrate: fix THP's mapcount on isolation
mm: introduce arch_has_hw_nonleaf_pmd_young()
mm: add dummy pmd_young() for architectures not having it
mm/damon/sysfs: fix wrong empty schemes assumption under online tuning in damon_sysfs_set_schemes()
tools/vm/slabinfo-gnuplot: use "grep -E" instead of "egrep"
nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing
madvise: use zap_page_range_single for madvise dontneed
mm: replace VM_WARN_ON to pr_warn if the node is offline with __GFP_THISNODE
|
|
In order to avoid #ifdeffery add a dummy pmd_young() implementation as a
fallback. This is required for the later patch "mm: introduce
arch_has_hw_nonleaf_pmd_young()".
Link: https://lkml.kernel.org/r/fd3ac3cd-7349-6bbd-890a-71a9454ca0b3@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Yu Zhao <yuzhao@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Eliminate the following coccicheck warning:
./arch/loongarch/kernel/unwind_prologue.c:84:5-13: WARNING: Unsigned
expression compared with zero: frame_ra < 0
Signed-off-by: KaiLong Wang <wangkailong@jari.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Set _PAGE_DIRTY only if _PAGE_MODIFIED is set in {pmd,pte}_mkwrite().
Otherwise, _PAGE_DIRTY silences the TLB modify exception and make us
have no chance to mark a pmd/pte dirty (_PAGE_MODIFIED) for software.
Reviewed-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Now {pmd,pte}_mkdirty() set _PAGE_DIRTY bit unconditionally, this causes
random segmentation fault after commit 0ccf7f168e17bb7e ("mm/thp: carry
over dirty bit when thp splits on pmd").
The reason is: when fork(), parent process use pmd_wrprotect() to clear
huge page's _PAGE_WRITE and _PAGE_DIRTY (for COW); then pte_mkdirty() set
_PAGE_DIRTY as well as _PAGE_MODIFIED while splitting dirty huge pages;
once _PAGE_DIRTY is set, there will be no tlb modify exception so the COW
machanism fails; and at last memory corruption occurred between parent
and child processes.
So, we should set _PAGE_DIRTY only when _PAGE_WRITE is set in {pmd,pte}_
mkdirty().
Cc: stable@vger.kernel.org
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
If a kernel thread is created by a user thread, it may carry FPU/SIMD
thread info flags (TIF_USEDFPU, TIF_USEDSIMD, etc.). Then it will be
considered as a fpu owner and kernel try to save its FPU/SIMD context
and cause such errors:
[ 41.518931] do_fpu invoked from kernel context![#1]:
[ 41.523933] CPU: 1 PID: 395 Comm: iou-wrk-394 Not tainted 6.1.0-rc5+ #217
[ 41.530757] Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.pre-beta8 08/18/2022
[ 41.544064] $ 0 : 0000000000000000 90000000011e9468 9000000106c7c000 9000000106c7fcf0
[ 41.552101] $ 4 : 9000000106305d40 9000000106689800 9000000106c7fd08 0000000003995818
[ 41.560138] $ 8 : 0000000000000001 90000000009a72e4 0000000000000020 fffffffffffffffc
[ 41.568174] $12 : 0000000000000000 0000000000000000 0000000000000020 00000009aab7e130
[ 41.576211] $16 : 00000000000001ff 0000000000000407 0000000000000001 0000000000000000
[ 41.584247] $20 : 0000000000000000 0000000000000001 9000000106c7fd70 90000001002f0400
[ 41.592284] $24 : 0000000000000000 900000000178f740 90000000011e9834 90000001063057c0
[ 41.600320] $28 : 0000000000000000 0000000000000001 9000000006826b40 9000000106305140
[ 41.608356] era : 9000000000228848 _save_fp+0x0/0xd8
[ 41.613542] ra : 90000000011e9468 __schedule+0x568/0x8d0
[ 41.619160] CSR crmd: 000000b0
[ 41.619163] CSR prmd: 00000000
[ 41.622359] CSR euen: 00000000
[ 41.625558] CSR ecfg: 00071c1c
[ 41.628756] CSR estat: 000f0000
[ 41.635239] ExcCode : f (SubCode 0)
[ 41.638783] PrId : 0014c010 (Loongson-64bit)
[ 41.643191] Modules linked in: acpi_ipmi vfat fat ipmi_si ipmi_devintf cfg80211 ipmi_msghandler rfkill fuse efivarfs
[ 41.653734] Process iou-wrk-394 (pid: 395, threadinfo=0000000004ebe913, task=00000000636fa1be)
[ 41.662375] Stack : 00000000ffff0875 9000000006800ec0 9000000006800ec0 90000000002d57e0
[ 41.670412] 0000000000000001 0000000000000000 9000000106535880 0000000000000001
[ 41.678450] 9000000105291800 0000000000000000 9000000105291838 900000000178e000
[ 41.686487] 9000000106c7fd90 9000000106305140 0000000000000001 90000000011e9834
[ 41.694523] 00000000ffff0875 90000000011f034c 9000000105291838 9000000105291830
[ 41.702561] 0000000000000000 9000000006801440 00000000ffff0875 90000000002d48c0
[ 41.710597] 9000000128800001 9000000106305140 9000000105291838 9000000105291838
[ 41.718634] 9000000105291830 9000000107811740 9000000105291848 90000000009bf1e0
[ 41.726672] 9000000105291830 9000000107811748 2d6b72772d756f69 0000000000343933
[ 41.734708] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 41.742745] ...
[ 41.745252] Call Trace:
[ 42.197868] [<9000000000228848>] _save_fp+0x0/0xd8
[ 42.205214] [<90000000011ed468>] __schedule+0x568/0x8d0
[ 42.210485] [<90000000011ed834>] schedule+0x64/0xd4
[ 42.215411] [<90000000011f434c>] schedule_timeout+0x88/0x188
[ 42.221115] [<90000000009c36d0>] io_wqe_worker+0x184/0x350
[ 42.226645] [<9000000000221cf0>] ret_from_kernel_thread+0xc/0x9c
This can be easily triggered by ltp testcase syscalls/io_uring02 and it
can also be easily fixed by clearing the FPU/SIMD thread info flags for
kernel threads in copy_thread().
Cc: stable@vger.kernel.org
Reported-by: Qi Hu <huqi@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
SMP operations can be shared by Loongson-2 series and Loongson-3 series,
so we change the prefix from loongson3 to loongson for all functions and
data structures.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Combine acpi_boot_table_init() and acpi_boot_init() since they are very
simple, and we don't need to check the return value of acpi_boot_init().
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The latest version of grep claims the egrep is now obsolete so the build
now contains warnings that look like:
egrep: warning: egrep is obsolescent; using grep -E
Fix this up by changing the LoongArch Makefile to use "grep -E" instead.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Not all compilers support declare variables in switch-case, so move
declarations to the beginning of a function. Otherwise we may get such
build errors:
arch/loongarch/net/bpf_jit.c: In function ‘emit_atomic’:
arch/loongarch/net/bpf_jit.c:362:3: error: a label can only be part of a statement and a declaration is not a statement
u8 r0 = regmap[BPF_REG_0];
^~
arch/loongarch/net/bpf_jit.c: In function ‘build_insn’:
arch/loongarch/net/bpf_jit.c:727:3: error: a label can only be part of a statement and a declaration is not a statement
u8 t7 = -1;
^~
arch/loongarch/net/bpf_jit.c:778:3: error: a label can only be part of a statement and a declaration is not a statement
int ret;
^~~
arch/loongarch/net/bpf_jit.c:779:3: error: expected expression before ‘u64’
u64 func_addr;
^~~
arch/loongarch/net/bpf_jit.c:780:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
bool func_addr_fixed;
^~~~
arch/loongarch/net/bpf_jit.c:784:11: error: ‘func_addr’ undeclared (first use in this function); did you mean ‘in_addr’?
&func_addr, &func_addr_fixed);
^~~~~~~~~
in_addr
arch/loongarch/net/bpf_jit.c:784:11: note: each undeclared identifier is reported only once for each function it appears in
arch/loongarch/net/bpf_jit.c:814:3: error: a label can only be part of a statement and a declaration is not a statement
u64 imm64 = (u64)(insn + 1)->imm << 32 | (u32)insn->imm;
^~~
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Eliminate the following coccicheck warning:
./arch/loongarch/include/asm/ptrace.h:32:15-21: WARNING use flexible-array member instead
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Yushan Zhou <katrinzhou@tencent.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The current LoongArch kernel stack is padded as if obeying the MIPS o32
calling convention (32 bytes), signifying the port's MIPS lineage but no
longer making sense. Remove the padding for clarity.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull more random number generator updates from Jason Donenfeld:
"This time with some large scale treewide cleanups.
The intent of this pull is to clean up the way callers fetch random
integers. The current rules for doing this right are:
- If you want a secure or an insecure random u64, use get_random_u64()
- If you want a secure or an insecure random u32, use get_random_u32()
The old function prandom_u32() has been deprecated for a while
now and is just a wrapper around get_random_u32(). Same for
get_random_int().
- If you want a secure or an insecure random u16, use get_random_u16()
- If you want a secure or an insecure random u8, use get_random_u8()
- If you want secure or insecure random bytes, use get_random_bytes().
The old function prandom_bytes() has been deprecated for a while
now and has long been a wrapper around get_random_bytes()
- If you want a non-uniform random u32, u16, or u8 bounded by a
certain open interval maximum, use prandom_u32_max()
I say "non-uniform", because it doesn't do any rejection sampling
or divisions. Hence, it stays within the prandom_*() namespace, not
the get_random_*() namespace.
I'm currently investigating a "uniform" function for 6.2. We'll see
what comes of that.
By applying these rules uniformly, we get several benefits:
- By using prandom_u32_max() with an upper-bound that the compiler
can prove at compile-time is ≤65536 or ≤256, internally
get_random_u16() or get_random_u8() is used, which wastes fewer
batched random bytes, and hence has higher throughput.
- By using prandom_u32_max() instead of %, when the upper-bound is
not a constant, division is still avoided, because
prandom_u32_max() uses a faster multiplication-based trick instead.
- By using get_random_u16() or get_random_u8() in cases where the
return value is intended to indeed be a u16 or a u8, we waste fewer
batched random bytes, and hence have higher throughput.
This series was originally done by hand while I was on an airplane
without Internet. Later, Kees and I worked on retroactively figuring
out what could be done with Coccinelle and what had to be done
manually, and then we split things up based on that.
So while this touches a lot of files, the actual amount of code that's
hand fiddled is comfortably small"
* tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
prandom: remove unused functions
treewide: use get_random_bytes() when possible
treewide: use get_random_u32() when possible
treewide: use get_random_{u8,u16}() when possible, part 2
treewide: use get_random_{u8,u16}() when possible, part 1
treewide: use prandom_u32_max() when possible, part 2
treewide: use prandom_u32_max() when possible, part 1
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
- fix a race which causes page refcounting errors in ZONE_DEVICE pages
(Alistair Popple)
- fix userfaultfd test harness instability (Peter Xu)
- various other patches in MM, mainly fixes
* tag 'mm-stable-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (29 commits)
highmem: fix kmap_to_page() for kmap_local_page() addresses
mm/page_alloc: fix incorrect PGFREE and PGALLOC for high-order page
mm/selftest: uffd: explain the write missing fault check
mm/hugetlb: use hugetlb_pte_stable in migration race check
mm/hugetlb: fix race condition of uffd missing/minor handling
zram: always expose rw_page
LoongArch: update local TLB if PTE entry exists
mm: use update_mmu_tlb() on the second thread
kasan: fix array-bounds warnings in tests
hmm-tests: add test for migrate_device_range()
nouveau/dmem: evict device private memory during release
nouveau/dmem: refactor nouveau_dmem_fault_copy_one()
mm/migrate_device.c: add migrate_device_range()
mm/migrate_device.c: refactor migrate_vma and migrate_deivce_coherent_page()
mm/memremap.c: take a pgmap reference on page allocation
mm: free device private pages have zero refcount
mm/memory.c: fix race when faulting a device private page
mm/damon: use damon_sz_region() in appropriate place
mm/damon: move sz_damon_region to damon_sz_region
lib/test_meminit: add checks for the allocation functions
...
|
|
Currently, the implementation of update_mmu_tlb() is empty if
__HAVE_ARCH_UPDATE_MMU_TLB is not defined. Then if two threads
concurrently fault at the same page, the second thread that did not win
the race will give up and do nothing. In the LoongArch architecture, this
second thread will trigger another fault, and only updates its local TLB.
Instead of triggering another fault, it's better to implement
update_mmu_tlb() to directly update the local TLB of the second thread.
Just do it.
Link: https://lkml.kernel.org/r/20220929112318.32393-3-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Suggested-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Huacai Chen <chenhuacai@loongson.cn>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Hildenbrand <david@redhat.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- hfs and hfsplus kmap API modernization (Fabio Francesco)
- make crash-kexec work properly when invoked from an NMI-time panic
(Valentin Schneider)
- ntfs bugfixes (Hawkins Jiawei)
- improve IPC msg scalability by replacing atomic_t's with percpu
counters (Jiebin Sun)
- nilfs2 cleanups (Minghao Chi)
- lots of other single patches all over the tree!
* tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits)
include/linux/entry-common.h: remove has_signal comment of arch_do_signal_or_restart() prototype
proc: test how it holds up with mapping'less process
mailmap: update Frank Rowand email address
ia64: mca: use strscpy() is more robust and safer
init/Kconfig: fix unmet direct dependencies
ia64: update config files
nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure
fork: remove duplicate included header files
init/main.c: remove unnecessary (void*) conversions
proc: mark more files as permanent
nilfs2: remove the unneeded result variable
nilfs2: delete unnecessary checks before brelse()
checkpatch: warn for non-standard fixes tag style
usr/gen_init_cpio.c: remove unnecessary -1 values from int file
ipc/msg: mitigate the lock contention with percpu counter
percpu: add percpu_counter_add_local and percpu_counter_sub_local
fs/ocfs2: fix repeated words in comments
relay: use kvcalloc to alloc page array in relay_alloc_page_array
proc: make config PROC_CHILDREN depend on PROC_FS
fs: uninline inode_maybe_inc_iversion()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Use EXPLICIT_RELOCS (ABIv2.0)
- Use generic BUG() handler
- Refactor TLB/Cache operations
- Add qspinlock support
- Add perf events support
- Add kexec/kdump support
- Add BPF JIT support
- Add ACPI-based laptop driver
- Update the default config file
* tag 'loongarch-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (25 commits)
LoongArch: Update Loongson-3 default config file
LoongArch: Add ACPI-based generic laptop driver
LoongArch: Add BPF JIT support
LoongArch: Add some instruction opcodes and formats
LoongArch: Move {signed,unsigned}_imm_check() to inst.h
LoongArch: Add kdump support
LoongArch: Add kexec support
LoongArch: Use generic BUG() handler
LoongArch: Add SysRq-x (TLB Dump) support
LoongArch: Add perf events support
LoongArch: Add qspinlock support
LoongArch: Use TLB for ioremap()
LoongArch: Support access filter to /dev/mem interface
LoongArch: Refactor cache probe and flush methods
LoongArch: mm: Refactor TLB exception handlers
LoongArch: Support R_LARCH_GOT_PC_{LO12,HI20} in modules
LoongArch: Support PC-relative relocations in modules
LoongArch: Define ELF relocation types added in ABIv2.0
LoongArch: Adjust symbol addressing for AS_HAS_EXPLICIT_RELOCS
LoongArch: Add Kconfig option AS_HAS_EXPLICIT_RELOCS
...
|
|
1, Enable ZBOOT, KEXEC and BPF_JIT;
2, Add more patition types;
3, Add some USB Type-C options;
4, Add some common network options;
5, Add some Bluetooth device drivers;
6, Remove obsolete config options (for some detailed information, see
Link).
Link: https://lore.kernel.org/kernel-janitors/20220929090645.1389-1-lukas.bulwahn@gmail.com/
Co-developed-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Co-developed-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Co-developed-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
BPF programs are normally handled by a BPF interpreter, add BPF JIT
support for LoongArch to allow the kernel to generate native code when
a program is loaded into the kernel. This will significantly speed-up
processing of BPF programs.
Co-developed-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
According to the "Table of Instruction Encoding" in LoongArch Reference
Manual [1], add some instruction opcodes and formats which are used in
the BPF JIT for LoongArch.
[1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#table-of-instruction-encoding
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
{signed,unsigned}_imm_check() will also be used in the bpf jit, so move
them from module.c to inst.h, this is preparation for later patches.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
This patch adds support for kdump. In kdump case the normal kernel will
reserve a region for the crash kernel and jump there on panic.
Arch-specific functions are added to allow for implementing a crash dump
file interface, /proc/vmcore, which can be viewed as a ELF file.
A user-space tool, such as kexec-tools, is responsible for allocating a
separate region for the core's ELF header within the crash kdump kernel
memory and filling it in when executing kexec_load().
Then, its location will be advertised to the crash dump kernel via a
command line argument "elfcorehdr=", and the crash dump kernel will
preserve this region for later use with arch_reserve_vmcore() at boot
time.
At the same time, the crash kdump kernel is also limited within the
"crashkernel" area via a command line argument "mem=", so as not to
destroy the original kernel dump data.
In the crash dump kernel environment, /proc/vmcore is used to access the
primary kernel's memory with copy_oldmem_page().
I tested kdump on LoongArch machines (Loongson-3A5000) and it works as
expected (suggested crashkernel parameter is "crashkernel=512M@2560M"),
you may test it by triggering a crash through /proc/sysrq-trigger:
$ sudo kexec -p /boot/vmlinux-kdump --reuse-cmdline --append="nr_cpus=1"
# echo c > /proc/sysrq-trigger
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Add three new files, kexec.h, machine_kexec.c and relocate_kernel.S to
the LoongArch architecture, so as to add support for the kexec re-boot
mechanism (CONFIG_KEXEC) on LoongArch platforms.
Kexec supports loading vmlinux.elf in ELF format and vmlinux.efi in PE
format.
I tested kexec on LoongArch machines (Loongson-3A5000) and it works as
expected:
$ sudo kexec -l /boot/vmlinux.efi --reuse-cmdline
$ sudo kexec -e
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Inspired by commit 9fb7410f955("arm64/BUG: Use BRK instruction for
generic BUG traps"), do similar for LoongArch to use generic BUG()
handler.
This patch uses the BREAK software breakpoint instruction to generate
a trap instead, similarly to most other arches, with the generic BUG
code generating the dmesg boilerplate.
This allows bug metadata to be moved to a separate table and reduces
the amount of inline code at BUG() and WARN() sites. This also avoids
clobbering any registers before they can be dumped.
To mitigate the size of the bug table further, this patch makes use of
the existing infrastructure for encoding addresses within the bug table
as 32-bit relative pointers instead of absolute pointers.
(Note: this limits the max kernel size to 2GB.)
Before patch:
[ 3018.338013] lkdtm: Performing direct entry BUG
[ 3018.342445] Kernel bug detected[#5]:
[ 3018.345992] CPU: 2 PID: 865 Comm: cat Tainted: G D 6.0.0-rc6+ #35
After patch:
[ 125.585985] lkdtm: Performing direct entry BUG
[ 125.590433] ------------[ cut here ]------------
[ 125.595020] kernel BUG at drivers/misc/lkdtm/bugs.c:78!
[ 125.600211] Oops - BUG[#1]:
[ 125.602980] CPU: 3 PID: 410 Comm: cat Not tainted 6.0.0-rc6+ #36
Out-of-line file/line data information obtained compared to before.
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Add SysRq-x (TLB Dump) support for LoongArch, which is useful for
debugging.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The perf events infrastructure of LoongArch is very similar to old MIPS-
based Loongson, so most of the codes are derived from MIPS.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
On NUMA system, the performance of qspinlock is better than generic
spinlock. Below is the UnixBench test results on a 8 nodes (4 cores
per node, 32 cores in total) machine.
A. With generic spinlock:
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 449574022.5 38523.9
Double-Precision Whetstone 55.0 85190.4 15489.2
Execl Throughput 43.0 14696.2 3417.7
File Copy 1024 bufsize 2000 maxblocks 3960.0 143157.8 361.5
File Copy 256 bufsize 500 maxblocks 1655.0 37631.8 227.4
File Copy 4096 bufsize 8000 maxblocks 5800.0 444814.2 766.9
Pipe Throughput 12440.0 5047490.7 4057.5
Pipe-based Context Switching 4000.0 2021545.7 5053.9
Process Creation 126.0 23829.8 1891.3
Shell Scripts (1 concurrent) 42.4 33756.7 7961.5
Shell Scripts (8 concurrent) 6.0 4062.9 6771.5
System Call Overhead 15000.0 2479748.6 1653.2
========
System Benchmarks Index Score 2955.6
B. With qspinlock:
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 449467876.9 38514.8
Double-Precision Whetstone 55.0 85174.6 15486.3
Execl Throughput 43.0 14769.1 3434.7
File Copy 1024 bufsize 2000 maxblocks 3960.0 146150.5 369.1
File Copy 256 bufsize 500 maxblocks 1655.0 37496.8 226.6
File Copy 4096 bufsize 8000 maxblocks 5800.0 447527.0 771.6
Pipe Throughput 12440.0 5175989.2 4160.8
Pipe-based Context Switching 4000.0 2207747.8 5519.4
Process Creation 126.0 25125.5 1994.1
Shell Scripts (1 concurrent) 42.4 33461.2 7891.8
Shell Scripts (8 concurrent) 6.0 4024.7 6707.8
System Call Overhead 15000.0 2917278.6 1944.9
========
System Benchmarks Index Score 3040.1
Signed-off-by: Rui Wang <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
We can support more cache attributes (e.g., CC, SUC and WUC) and page
protection when we use TLB for ioremap(). The implementation is based
on GENERIC_IOREMAP.
The existing simple ioremap() implementation has better performance so
we keep it and introduce ARCH_IOREMAP to control the selection.
We move pagetable_init() earlier to make early ioremap() works, and we
modify the PCI ecam mapping because the TLB-based version of ioremap()
will actually take the size into account.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Accidental access to /dev/mem is obviously disastrous, but specific
access can be used by people debugging the kernel. So select GENERIC_
LIB_DEVMEM_IS_ALLOWED, as well as define ARCH_HAS_VALID_PHYS_ADDR_RANGE
and related helpers, to support access filter to /dev/mem interface.
Signed-off-by: Weihao Li <liweihao@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Current cache probe and flush methods have some drawbacks:
1, Assume there are 3 cache levels and only 3 levels;
2, Assume L1 = I + D, L2 = V, L3 = S, V is exclusive, S is inclusive.
However, the fact is I + D, I + D + V, I + D + S and I + D + V + S are
all valid. So, refactor the cache probe and flush methods to adapt more
types of cache hierarchy.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
This patch simplifies TLB load, store and modify exception handlers:
1. Reduce instructions, such as alu/csr and memory access;
2. Execute tlb search instruction only in the fast path;
3. Return directly from the fast path for both normal and huge pages;
4. Re-tab the assembly for better vertical alignment.
And fixes the concurrent modification issue of fast path for huge pages.
This issue will occur in the following steps:
CPU-1 (In TLB exception) CPU-2 (In THP splitting)
1: Load PMD entry (HUGE=1)
2: Goto huge path
3: Store PMD entry (HUGE=0)
4: Reload PMD entry (HUGE=0)
5: Fill TLB entry (PA is incorrect)
This patch also slightly improves the TLB processing performance:
* Normal pages: 2.15%, Huge pages: 1.70%.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
int main(int argc, char *argv[])
{
size_t page_size;
size_t mem_size;
size_t off;
void *base;
int flags;
int i;
if (argc < 2) {
fprintf(stderr, "%s MEM_SIZE [HUGE]\n", argv[0]);
return -1;
}
page_size = sysconf(_SC_PAGESIZE);
flags = MAP_PRIVATE | MAP_ANONYMOUS;
mem_size = strtoul(argv[1], NULL, 10);
if (argc > 2)
flags |= MAP_HUGETLB;
for (i = 0; i < 10; i++) {
base = mmap(NULL, mem_size, PROT_READ, flags, -1, 0);
if (base == MAP_FAILED) {
fprintf(stderr, "Map memory failed!\n");
return -1;
}
for (off = 0; off < mem_size; off += page_size)
*(volatile int *)(base + off);
munmap(base, mem_size);
}
return 0;
}
Signed-off-by: Rui Wang <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
GCC >= 13 and GNU assembler >= 2.40 use these relocations to address
external symbols, so we need to add them.
Let the module loader emit GOT entries for data symbols so we would be
able to handle GOT relocations. The GOT entry is just the data's symbol
address.
In module.lds, emit a stub .got section for a section header entry. The
actual content of the section entry will be filled at runtime by module_
frob_arch_sections().
Tested-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Binutils >= 2.40 uses R_LARCH_B26 instead of R_LARCH_SOP_PUSH_PLT_PCREL,
and R_LARCH_PCALA* instead of R_LARCH_SOP_PUSH_PCREL.
Handle R_LARCH_B26 and R_LARCH_PCALA* in the module loader. For R_LARCH_
B26, also create a PLT entry as needed.
Tested-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
These relocation types are used by GNU binutils >= 2.40 and GCC >= 13.
Add their definitions so we will be able to use them in later patches.
Link: https://github.com/loongson/LoongArch-Documentation/pull/57
Tested-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
If explicit relocation hints are used by the toolchain, -Wa,-mla-*
options will be useless for the C code. So only use them for the
!CONFIG_AS_HAS_EXPLICIT_RELOCS case.
Replace "la" with "la.pcrel" in head.S to keep the semantic consistent
with new and old toolchains for the low level startup code.
For per-CPU variables, the "address" of the symbol is actually an offset
from $r21. The value is near the loading address of main kernel image,
but far from the loading address of modules. So we use model("extreme")
attibute to tell the compiler that a PC-relative addressing with 32-bit
offset is not sufficient for local per-CPU variables.
The behavior with different assemblers and compilers are summarized in
the following table:
AS has CC has
explicit relocs explicit relocs * Behavior
==============================================================
No No Use la.* macros.
No change from Linux 6.0.
--------------------------------------------------------------
No Yes Disable explicit relocs.
No change from Linux 6.0.
--------------------------------------------------------------
Yes No Not supported.
--------------------------------------------------------------
Yes Yes Enable explicit relocs.
No -Wa,-mla* options used.
==============================================================
*: We assume CC must have model attribute if it has explicit relocs.
Both features are added in GCC 13 development cycle, so any GCC
release >= 13 should be OK. Using early GCC 13 development snapshots
may produce modules with unsupported relocations.
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=f09482a
Link: https://gcc.gnu.org/r13-1834
Link: https://gcc.gnu.org/r13-2199
Tested-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
GNU as >= 2.40 and GCC >= 13 will support using explicit relocation
hints in the assembly code, instead of la.* macros. The usage of
explicit relocation hints can improve code generation so it's enabled
by default by GCC >= 13.
Introduce a Kconfig option AS_HAS_EXPLICIT_RELOCS as the switch for
"use explicit relocation hints or not".
Tested-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
There is a spelling mistake in a commented section. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
forcibly") allows compiler to uninline functions marked as 'inline'.
In case of __xchg()/__cmpxchg() this would cause to reference
BUILD_BUG(), which is an error case for catching bugs and will not
happen for correct code, if __xchg()/__cmpxchg() is inlined.
This bug can be produced with CONFIG_DEBUG_SECTION_MISMATCH enabled,
and the solution is similar to below commits:
46f1619500d0225 ("MIPS: include: Mark __xchg as __always_inline"),
88356d09904bc60 ("MIPS: include: Mark __cmpxchg as __always_inline").
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Move local_flush_tlb_all() earlier (just after setup_ptwalker() and
before page allocation). This can avoid stale TLB entries misguiding
the later page allocation. Without this patch the second kernel of
kexec/kdump fails to boot SMP.
BTW, move output_pgtable_bits_defines() into tlb_init() since it has
nothing to do with tlb handler setup.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Now io master CPUs are not hotpluggable on LoongArch, but in the current
code only /sys/devices/system/cpu/cpu0/online is not created. Let us set
the hotpluggable field of all the io master CPUs as 0, then prevent to
create sysfs control file for all the io master CPUs which confuses some
user space tools. This is similar with commit 9cce844abf07 ("MIPS: CPU#0
is not hotpluggable").
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Don't overwrite the SMBIOS-provided CPU name on coming back from CPU-
hotplug (including S3/S4) if it is already initialized.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|