summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2022-11-30Merge patch series "riscv: kexec: Fxiup crash_save percpu and ↵Palmer Dabbelt3-13/+133
machine_kexec_mask_interrupts" guoren@kernel.org <guoren@kernel.org> says: From: Guo Ren <guoren@linux.alibaba.com> Current riscv kexec can't crash_save percpu states and disable interrupts properly. The patch series fix them, make kexec work correct. * b4-shazam-merge: riscv: kexec: Fixup crash_smp_send_stop without multi cores riscv: kexec: Fixup irq controller broken in kexec crash path Link: https://lore.kernel.org/r/20221020141603.2856206-1-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-30riscv: kexec: Fixup crash_smp_send_stop without multi coresGuo Ren3-18/+103
Current crash_smp_send_stop is the same as the generic one in kernel/panic and misses crash_save_cpu in percpu. This patch is inspired by 78fd584cdec0 ("arm64: kdump: implement machine_crash_shutdown()") and adds the same mechanism for riscv. Before this patch, test result: crash> help -r CPU 0: [OFFLINE] CPU 1: epc : ffffffff80009ff0 ra : ffffffff800b789a sp : ff2000001098bb40 gp : ffffffff815fca60 tp : ff60000004680000 t0 : 6666666666663c5b t1 : 0000000000000000 t2 : 666666666666663c s0 : ff2000001098bc90 s1 : ffffffff81600798 a0 : ff2000001098bb48 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000 a5 : ff60000004690800 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ff2000001098bb48 s3 : ffffffff81093ec8 s4 : ffffffff816004ac s5 : 0000000000000000 s6 : 0000000000000007 s7 : ffffffff80e7f720 s8 : 00fffffffffff3f0 s9 : 0000000000000007 s10: 00aaaaaaaab98700 s11: 0000000000000001 t3 : ffffffff819a8097 t4 : ffffffff819a8097 t5 : ffffffff819a8098 t6 : ff2000001098b9a8 CPU 2: [OFFLINE] CPU 3: [OFFLINE] After this patch, test result: crash> help -r CPU 0: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ffffffff81403eb0 gp : ffffffff815fcb48 tp : ffffffff81413400 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ffffffff81403ec0 s1 : 0000000000000000 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039eac s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000 CPU 1: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ff2000000068bf30 gp : ffffffff815fcb48 tp : ff6000000240d400 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ff2000000068bf40 s1 : 0000000000000001 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039ea8 s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000 CPU 2: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ff20000000693f30 gp : ffffffff815fcb48 tp : ff6000000240e900 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ff20000000693f40 s1 : 0000000000000002 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039eb0 s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000 CPU 3: epc : ffffffff8000a1e4 ra : ffffffff800b7bba sp : ff200000109bbb40 gp : ffffffff815fcb48 tp : ff6000000373aa00 t0 : 6666666666663c5b t1 : 0000000000000000 t2 : 666666666666663c s0 : ff200000109bbc90 s1 : ffffffff816007a0 a0 : ff200000109bbb48 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000 a5 : ff60000002c61c00 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ff200000109bbb48 s3 : ffffffff810941a8 s4 : ffffffff816004b4 s5 : 0000000000000000 s6 : 0000000000000007 s7 : ffffffff80e7f7a0 s8 : 00fffffffffff3f0 s9 : 0000000000000007 s10: 00aaaaaaaab98700 s11: 0000000000000001 t3 : ffffffff819a8097 t4 : ffffffff819a8097 t5 : ffffffff819a8098 t6 : ff200000109bb9a8 Fixes: ad943893d5f1 ("RISC-V: Fixup schedule out issue in machine_crash_shutdown()") Reviewed-by: Xianting Tian <xianting.tian@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Cc: Nick Kossifidis <mick@ics.forth.gr> Link: https://lore.kernel.org/r/20221020141603.2856206-3-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-30riscv: kexec: Fixup irq controller broken in kexec crash pathGuo Ren1-0/+35
If a crash happens on cpu3 and all interrupts are binding on cpu0, the bad irq routing will cause a crash kernel which can't receive any irq. Because crash kernel won't clean up all harts' PLIC enable bits in enable registers. This patch is similar to 9141a003a491 ("ARM: 7316/1: kexec: EOI active and mask all interrupts in kexec crash path") and 78fd584cdec0 ("arm64: kdump: implement machine_crash_shutdown()"), and PowerPC also has the same mechanism. Fixes: fba8a8674f68 ("RISC-V: Add kexec support") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Xianting Tian <xianting.tian@linux.alibaba.com> Cc: Nick Kossifidis <mick@ics.forth.gr> Cc: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20221020141603.2856206-2-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-30riscv: mm: Proper page permissions after initmem freeBjörn Töpel1-4/+5
64-bit RISC-V kernels have the kernel image mapped separately to alias the linear map. The linear map and the kernel image map are documented as "direct mapping" and "kernel" respectively in [1]. At image load time, the linear map corresponding to the kernel image is set to PAGE_READ permission, and the kernel image map is set to PAGE_READ|PAGE_EXEC. When the initmem is freed, the pages in the linear map should be restored to PAGE_READ|PAGE_WRITE, whereas the corresponding pages in the kernel image map should be restored to PAGE_READ, by removing the PAGE_EXEC permission. This is not the case. For 64-bit kernels, only the linear map is restored to its proper page permissions at initmem free, and not the kernel image map. In practise this results in that the kernel can potentially jump to dead __init code, and start executing invalid instructions, without getting an exception. Restore the freed initmem properly, by setting both the kernel image map to the correct permissions. [1] Documentation/riscv/vm-layout.rst Fixes: e5c35fa04019 ("riscv: Map the kernel with correct permissions the first time") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Alexandre Ghiti <alex@ghiti.fr> Tested-by: Alexandre Ghiti <alex@ghiti.fr> Link: https://lore.kernel.org/r/20221115090641.258476-1-bjorn@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-30riscv: vdso: fix section overlapping under some conditionsJisheng Zhang1-0/+1
lkp reported a build error, I tried the config and can reproduce build error as below: VDSOLD arch/riscv/kernel/vdso/vdso.so.dbg ld.lld: error: section .note file range overlaps with .text >>> .note range is [0x7C8, 0x803] >>> .text range is [0x800, 0x1993] ld.lld: error: section .text file range overlaps with .dynamic >>> .text range is [0x800, 0x1993] >>> .dynamic range is [0x808, 0x937] ld.lld: error: section .note virtual address range overlaps with .text >>> .note range is [0x7C8, 0x803] >>> .text range is [0x800, 0x1993] Fix it by setting DISABLE_BRANCH_PROFILING which will disable branch tracing for vdso, thus avoid useless _ftrace_annotated_branch section and _ftrace_branch section. Although we can also fix it by removing the hardcoded .text begin address, but I think that's another story and should be put into another patch. Link: https://lore.kernel.org/lkml/202210122123.Cc4FPShJ-lkp@intel.com/#r Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/r/20221102170254.1925-1-jszhang@kernel.org Fixes: ad5d1122b82f ("riscv: use vDSO common flow to reduce the latency of the time-related functions") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29riscv: Sync efi page table's kernel mappings before switchingAlexandre Ghiti2-4/+13
The EFI page table is initially created as a copy of the kernel page table. With VMAP_STACK enabled, kernel stacks are allocated in the vmalloc area: if the stack is allocated in a new PGD (one that was not present at the moment of the efi page table creation or not synced in a previous vmalloc fault), the kernel will take a trap when switching to the efi page table when the vmalloc kernel stack is accessed, resulting in a kernel panic. Fix that by updating the efi kernel mappings before switching to the efi page table. Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Fixes: b91540d52a08 ("RISC-V: Add EFI runtime services") Tested-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20221121133303.1782246-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29riscv: Fix NR_CPUS range conditionsSamuel Holland1-3/+3
The conditions reference the symbol SBI_V01, which does not exist. The correct symbol is RISCV_SBI_V01. Fixes: e623715f3d67 ("RISC-V: Increase range and default value of NR_CPUS") Signed-off-by: Samuel Holland <samuel@sholland.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20221126061557.3541-1-samuel@sholland.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-11RISC-V: vdso: Do not add missing symbols to version section in linker scriptNathan Chancellor2-0/+5
Recently, ld.lld moved from '--undefined-version' to '--no-undefined-version' as the default, which breaks the compat vDSO build: ld.lld: error: version script assignment of 'LINUX_4.15' to symbol '__vdso_gettimeofday' failed: symbol not defined ld.lld: error: version script assignment of 'LINUX_4.15' to symbol '__vdso_clock_gettime' failed: symbol not defined ld.lld: error: version script assignment of 'LINUX_4.15' to symbol '__vdso_clock_getres' failed: symbol not defined These symbols are not present in the compat vDSO or the regular vDSO for 32-bit but they are unconditionally included in the version section of the linker script, which is prohibited with '--no-undefined-version'. Fix this issue by only including the symbols that are actually exported in the version section of the linker script. Link: https://github.com/ClangBuiltLinux/linux/issues/1756 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20221108171324.3377226-1-nathan@kernel.org/ Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-11riscv: fix reserved memory setupConor Dooley2-1/+1
Currently, RISC-V sets up reserved memory using the "early" copy of the device tree. As a result, when trying to get a reserved memory region using of_reserved_mem_lookup(), the pointer to reserved memory regions is using the early, pre-virtual-memory address which causes a kernel panic when trying to use the buffer's name: Unable to handle kernel paging request at virtual address 00000000401c31ac Oops [#1] Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 6.0.0-rc1-00001-g0d9d6953d834 #1 Hardware name: Microchip PolarFire-SoC Icicle Kit (DT) epc : string+0x4a/0xea ra : vsnprintf+0x1e4/0x336 epc : ffffffff80335ea0 ra : ffffffff80338936 sp : ffffffff81203be0 gp : ffffffff812e0a98 tp : ffffffff8120de40 t0 : 0000000000000000 t1 : ffffffff81203e28 t2 : 7265736572203a46 s0 : ffffffff81203c20 s1 : ffffffff81203e28 a0 : ffffffff81203d22 a1 : 0000000000000000 a2 : ffffffff81203d08 a3 : 0000000081203d21 a4 : ffffffffffffffff a5 : 00000000401c31ac a6 : ffff0a00ffffff04 a7 : ffffffffffffffff s2 : ffffffff81203d08 s3 : ffffffff81203d00 s4 : 0000000000000008 s5 : ffffffff000000ff s6 : 0000000000ffffff s7 : 00000000ffffff00 s8 : ffffffff80d9821a s9 : ffffffff81203d22 s10: 0000000000000002 s11: ffffffff80d9821c t3 : ffffffff812f3617 t4 : ffffffff812f3617 t5 : ffffffff812f3618 t6 : ffffffff81203d08 status: 0000000200000100 badaddr: 00000000401c31ac cause: 000000000000000d [<ffffffff80338936>] vsnprintf+0x1e4/0x336 [<ffffffff80055ae2>] vprintk_store+0xf6/0x344 [<ffffffff80055d86>] vprintk_emit+0x56/0x192 [<ffffffff80055ed8>] vprintk_default+0x16/0x1e [<ffffffff800563d2>] vprintk+0x72/0x80 [<ffffffff806813b2>] _printk+0x36/0x50 [<ffffffff8068af48>] print_reserved_mem+0x1c/0x24 [<ffffffff808057ec>] paging_init+0x528/0x5bc [<ffffffff808031ae>] setup_arch+0xd0/0x592 [<ffffffff8080070e>] start_kernel+0x82/0x73c early_init_fdt_scan_reserved_mem() takes no arguments as it operates on initial_boot_params, which is populated by early_init_dt_verify(). On RISC-V, early_init_dt_verify() is called twice. Once, directly, in setup_arch() if CONFIG_BUILTIN_DTB is not enabled and once indirectly, very early in the boot process, by parse_dtb() when it calls early_init_dt_scan_nodes(). This first call uses dtb_early_va to set initial_boot_params, which is not usable later in the boot process when early_init_fdt_scan_reserved_mem() is called. On arm64 for example, the corresponding call to early_init_dt_scan_nodes() uses fixmap addresses and doesn't suffer the same fate. Move early_init_fdt_scan_reserved_mem() further along the boot sequence, after the direct call to early_init_dt_verify() in setup_arch() so that the names use the correct virtual memory addresses. The above supposed that CONFIG_BUILTIN_DTB was not set, but should work equally in the case where it is - unflatted_and_copy_device_tree() also updates initial_boot_params. Reported-by: Valentina Fernandez <valentina.fernandezalanis@microchip.com> Reported-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Link: https://lore.kernel.org/linux-riscv/f8e67f82-103d-156c-deb0-d6d6e2756f5e@microchip.com/ Fixes: 922b0375fc93 ("riscv: Fix memblock reservation for device tree blob") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Link: https://lore.kernel.org/r/20221107151524.3941467-1-conor.dooley@microchip.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-11riscv: vdso: fix build with llvmJisheng Zhang1-1/+1
Even after commit 89fd4a1df829 ("riscv: jump_label: mark arguments as const to satisfy asm constraints"), building with CC_OPTIMIZE_FOR_SIZE + LLVM=1 can reproduce below build error: CC arch/riscv/kernel/vdso/vgettimeofday.o In file included from <built-in>:4: In file included from lib/vdso/gettimeofday.c:5: In file included from include/vdso/datapage.h:17: In file included from include/vdso/processor.h:10: In file included from arch/riscv/include/asm/vdso/processor.h:7: In file included from include/linux/jump_label.h:112: arch/riscv/include/asm/jump_label.h:42:3: error: invalid operand for inline asm constraint 'i' " .option push \n\t" ^ 1 error generated. I think the problem is when "-Os" is passed as CFLAGS, it's removed by "CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os" which is introduced in commit e05d57dcb8c7 ("riscv: Fixup __vdso_gettimeofday broke dynamic ftrace"), thus no optimization at all for vgettimeofday.c arm64 does remove "-Os" as well, but it forces "-O2" after removing "-Os". I compared the generated vgettimeofday.o with "-O2" and "-Os", I think no big performance difference. So let's tell the kbuild not to remove "-Os" rather than follow arm64 style. vdso related performance can be improved a lot when building kernel with CC_OPTIMIZE_FOR_SIZE after this commit, ("-Os" VS no optimization) Fixes: e05d57dcb8c7 ("riscv: Fixup __vdso_gettimeofday broke dynamic ftrace") Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Tested-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20221031182943.2453-1-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-11riscv: process: fix kernel info leakageJisheng Zhang1-0/+2
thread_struct's s[12] may contain random kernel memory content, which may be finally leaked to userspace. This is a security hole. Fix it by clearing the s[12] array in thread_struct when fork. As for kthread case, it's better to clear the s[12] array as well. Fixes: 7db91e57a0ac ("RISC-V: Task implementation") Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Tested-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20221029113450.4027-1-jszhang@kernel.org Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/CAJF2gTSdVyAaM12T%2B7kXAdRPGS4VyuO08X1c7paE-n4Fr8OtRA@mail.gmail.com/ Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-29riscv: dts: sifive unleashed: Add PWM controlled LEDsEmil Renner Berthing1-0/+38
This adds the 4 PWM controlled green LEDs to the HiFive Unleashed device tree. The schematic doesn't specify any special function for the LEDs, so they're added here without any default triggers and named d1, d2, d3 and d4 just like in the schematic. Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20221012110928.352910-1-emil.renner.berthing@canonical.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-28RISC-V: Fix /proc/cpuinfo cpumask warningAndrew Jones1-0/+3
Commit 78e5a3399421 ("cpumask: fix checking valid cpu range") has started issuing warnings[*] when cpu indices equal to nr_cpu_ids - 1 are passed to cpumask_next* functions. seq_read_iter() and cpuinfo's start and next seq operations implement a pattern like n = cpumask_next(n - 1, mask); show(n); while (1) { ++n; n = cpumask_next(n - 1, mask); if (n >= nr_cpu_ids) break; show(n); } which will issue the warning when reading /proc/cpuinfo. Ensure no warning is generated by validating the cpu index before calling cpumask_next(). [*] Warnings will only appear with DEBUG_PER_CPU_MAPS enabled. Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20221014155845.1986223-2-ajones@ventanamicro.com/ Fixes: 78e5a3399421 ("cpumask: fix checking valid cpu range") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-28Merge patch series "Fix RISC-V toolchain extension support detection"Palmer Dabbelt3-9/+16
Conor Dooley <conor@kernel.org> says: From: Conor Dooley <conor.dooley@microchip.com> This came up due to a report from Kevin @ kernel-ci, who had been running a mixed configuration of GNU binutils and clang. Their compiler was relatively recent & supports Zicbom but binutils @ 2.35.2 did not. Our current checks for extension support only cover the compiler, but it appears to me that we need to check both the compiler & linker support in case of "pot-luck" configurations that mix different versions of LD,AS,CC etc. Linker support does not seem possible to actually check, since the ISA string is emitted into the object files - so I put in version checks for that. The checks have gotten a bit ugly since 32 & 64 bit support need to be checked independently but ahh well. As I was going, I fell into the trap of there being duplicated checks for CC support in both the Makefile and Kconfig, so as part of renaming the Kconfig symbol to TOOLCHAIN_HAS_FOO, I dropped the extra checks in the Makefile. This has the added advantage of the TOOLCHAIN_HAS_FOO symbol for Zihintpause appearing in .config. I pushed out a version of this that specificly checked for assember support for LKP to test & it looked /okay/ - but I did some more testing today and realised that this is redudant & have since dropped the as check. I tested locally with a fair few different combinations, to try and cover each of AS, LD, CC missing support for the extension. * b4-shazam-merge: riscv: fix detection of toolchain Zihintpause support riscv: fix detection of toolchain Zicbom support Link: https://lore.kernel.org/r/20221006173520.1785507-1-conor@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-28riscv: fix detection of toolchain Zihintpause supportConor Dooley3-3/+9
It is not sufficient to check if a toolchain supports a particular extension without checking if the linker supports that extension too. For example, Clang 15 supports Zihintpause but GNU bintutils 2.35.2 does not, leading build errors like so: riscv64-linux-gnu-ld: -march=rv64i2p0_m2p0_a2p0_c2p0_zihintpause2p0: Invalid or unknown z ISA extension: 'zihintpause' Add a TOOLCHAIN_HAS_ZIHINTPAUSE which checks if each of the compiler, assembler and linker support the extension. Replace the ifdef in the vdso with one depending on this new symbol. Fixes: 8eb060e10185 ("arch/riscv: add Zihintpause support") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20221006173520.1785507-3-conor@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-28riscv: fix detection of toolchain Zicbom supportConor Dooley2-6/+7
It is not sufficient to check if a toolchain supports a particular extension without checking if the linker supports that extension too. For example, Clang 15 supports Zicbom but GNU bintutils 2.35.2 does not, leading build errors like so: riscv64-linux-gnu-ld: -march=rv64i2p0_m2p0_a2p0_c2p0_zicbom1p0_zihintpause2p0: Invalid or unknown z ISA extension: 'zicbom' Convert CC_HAS_ZICBOM to TOOLCHAIN_HAS_ZICBOM & check if the linker also supports Zicbom. Reported-by: Kevin Hilman <khilman@baylibre.com> Link: https://github.com/ClangBuiltLinux/linux/issues/1714 Link: https://storage.kernelci.org/next/master/next-20220920/riscv/defconfig+CONFIG_EFI=n/clang-16/logs/kernel.log Fixes: 1631ba1259d6 ("riscv: Add support for non-coherent devices using zicbom extension") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20221006173520.1785507-2-conor@kernel.org [Palmer: Check for ld-2.38, not 2.39, as 2.38 no longer errors.] Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-28riscv: mm: add missing memcpy in kasan_initQinglin Pan1-1/+6
Hi Atish, It seems that the panic is due to the missing memcpy during kasan_init. Could you please check whether this patch is helpful? When doing kasan_populate, the new allocated base_pud/base_p4d should contain kasan_early_shadow_{pud, p4d}'s content. Add the missing memcpy to avoid page fault when read/write kasan shadow region. Tested on: - qemu with sv57 and CONFIG_KASAN on. - qemu with sv48 and CONFIG_KASAN on. Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn> Tested-by: Atish Patra <atishp@rivosinc.com> Fixes: 8fbdccd2b173 ("riscv: mm: Support kasan for sv57") Link: https://lore.kernel.org/r/20221009083050.3814850-1-panqinglin2020@iscas.ac.cn Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-26riscv: jump_label: mark arguments as const to satisfy asm constraintsJisheng Zhang1-4/+4
Samuel reported that the static branch usage in cpu_relax() breaks building with CONFIG_CC_OPTIMIZE_FOR_SIZE: In file included from <command-line>: ./arch/riscv/include/asm/jump_label.h: In function 'cpu_relax': ././include/linux/compiler_types.h:285:33: warning: 'asm' operand 0 probably does not match constraints 285 | #define asm_volatile_goto(x...) asm goto(x) | ^~~ ./arch/riscv/include/asm/jump_label.h:41:9: note: in expansion of macro 'asm_volatile_goto' 41 | asm_volatile_goto( | ^~~~~~~~~~~~~~~~~ ././include/linux/compiler_types.h:285:33: error: impossible constraint in 'asm' 285 | #define asm_volatile_goto(x...) asm goto(x) | ^~~ ./arch/riscv/include/asm/jump_label.h:41:9: note: in expansion of macro 'asm_volatile_goto' 41 | asm_volatile_goto( | ^~~~~~~~~~~~~~~~~ make[1]: *** [scripts/Makefile.build:249: arch/riscv/kernel/vdso/vgettimeofday.o] Error 1 make: *** [arch/riscv/Makefile:128: vdso_prepare] Error 2 Maybe "-Os" prevents GCC from detecting that the key/branch arguments can be treated as constants and used as immediate operands. Inspired by x86's commit 864b435514b2("x86/jump_label: Mark arguments as const to satisfy asm constraints"), and as pointed out by Steven: "The "i" constraint needs to be a constant.", let's do similar modifications to riscv. Tested by CC_OPTIMIZE_FOR_SIZE + gcc and CC_OPTIMIZE_FOR_SIZE + clang. Link: https://lore.kernel.org/linux-riscv/20220922060958.44203-1-samuel@sholland.org/ Link: https://lore.kernel.org/all/20210212094059.5f8d05e8@gandalf.local.home/ Fixes: 8eb060e10185 ("arch/riscv: add Zihintpause support") Reported-by: Samuel Holland <samuel@sholland.org> Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Tested-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20221008145437.491-1-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-17Merge tag 'random-6.1-rc1-for-linus' of ↵Linus Torvalds23-26/+26
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull more random number generator updates from Jason Donenfeld: "This time with some large scale treewide cleanups. The intent of this pull is to clean up the way callers fetch random integers. The current rules for doing this right are: - If you want a secure or an insecure random u64, use get_random_u64() - If you want a secure or an insecure random u32, use get_random_u32() The old function prandom_u32() has been deprecated for a while now and is just a wrapper around get_random_u32(). Same for get_random_int(). - If you want a secure or an insecure random u16, use get_random_u16() - If you want a secure or an insecure random u8, use get_random_u8() - If you want secure or insecure random bytes, use get_random_bytes(). The old function prandom_bytes() has been deprecated for a while now and has long been a wrapper around get_random_bytes() - If you want a non-uniform random u32, u16, or u8 bounded by a certain open interval maximum, use prandom_u32_max() I say "non-uniform", because it doesn't do any rejection sampling or divisions. Hence, it stays within the prandom_*() namespace, not the get_random_*() namespace. I'm currently investigating a "uniform" function for 6.2. We'll see what comes of that. By applying these rules uniformly, we get several benefits: - By using prandom_u32_max() with an upper-bound that the compiler can prove at compile-time is ≤65536 or ≤256, internally get_random_u16() or get_random_u8() is used, which wastes fewer batched random bytes, and hence has higher throughput. - By using prandom_u32_max() instead of %, when the upper-bound is not a constant, division is still avoided, because prandom_u32_max() uses a faster multiplication-based trick instead. - By using get_random_u16() or get_random_u8() in cases where the return value is intended to indeed be a u16 or a u8, we waste fewer batched random bytes, and hence have higher throughput. This series was originally done by hand while I was on an airplane without Internet. Later, Kees and I worked on retroactively figuring out what could be done with Coccinelle and what had to be done manually, and then we split things up based on that. So while this touches a lot of files, the actual amount of code that's hand fiddled is comfortably small" * tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: prandom: remove unused functions treewide: use get_random_bytes() when possible treewide: use get_random_u32() when possible treewide: use get_random_{u8,u16}() when possible, part 2 treewide: use get_random_{u8,u16}() when possible, part 1 treewide: use prandom_u32_max() when possible, part 2 treewide: use prandom_u32_max() when possible, part 1
2022-10-16Merge tag 'for-linus' of https://github.com/openrisc/linuxLinus Torvalds1-8/+8
Pull OpenRISC updates from Stafford Horne: "I have relocated to London so not much work from me while I get settled. Still, OpenRISC picked up two patches in this window: - Fix for kernel page table walking from Jann Horn - MAINTAINER entry cleanup from Palmer Dabbelt" * tag 'for-linus' of https://github.com/openrisc/linux: MAINTAINERS: git://github -> https://github.com for openrisc openrisc: Fix pagewalk usage in arch_dma_{clear, set}_uncached
2022-10-15Merge tag 'for-linus-6.1-rc1' of ↵Linus Torvalds13-57/+58
git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Richard Weinberger: - Move to strscpy() - Improve panic notifiers - Fix NR_CPUS usage - Fixes for various comments - Fixes for virtio driver * tag 'for-linus-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: uml: Remove the initialization of statics to 0 um: Do not initialise statics to 0. um: Fix comment typo um: Improve panic notifiers consistency and ordering um: remove unused reactivate_chan() declaration um: mmaper: add __exit annotations to module exit funcs um: virt-pci: add __init/__exit annotations to module init/exit funcs hostfs: move from strlcpy with unused retval to strscpy um: move from strlcpy with unused retval to strscpy um: increase default virtual physical memory to 64 MiB UM: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK um: read multiple msg from virtio slave request fd
2022-10-14Merge tag 'asm-generic-fixes-6.1-1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic fix from Arnd Bergmann: "A last-minute arch/alpha regression fix: the previous asm-generic branch contained a new regression from a typo" * tag 'asm-generic-fixes-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: alpha: fix marvel_ioread8 build regression
2022-10-14Merge tag 'arm-fixes-6.1-1' of ↵Linus Torvalds4-11/+10
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "These are three fixes for build warnings that came in during the merge window" * tag 'arm-fixes-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: ARM: mmp: Make some symbols static ARM: spear6xx: Staticize few definitions clk: spear: Move prototype to accessible header
2022-10-14Merge tag 'arm64-fixes' of ↵Linus Torvalds7-4/+46
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Cortex-A55 errata workaround (repeat TLBI) - AMPERE1 added to the Spectre-BHB affected list - MTE fix to avoid setting PG_mte_tagged if no tags have been touched on a page - Fixed typo in the SCTLR_EL1.SPINTMASK bit naming (the commit log has other typos) - perf: return value check in ali_drw_pmu_probe(), ALIBABA_UNCORE_DRW_PMU dependency on ACPI * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Add AMPERE1 to the Spectre-BHB affected list arm64: mte: Avoid setting PG_mte_tagged if no tags cleared or restored MAINTAINERS: rectify file entry in ALIBABA PMU DRIVER drivers/perf: ALIBABA_UNCORE_DRW_PMU should depend on ACPI drivers/perf: fix return value check in ali_drw_pmu_probe() arm64: errata: Add Cortex-A55 to the repeat tlbi list arm64/sysreg: Fix typo in SCTR_EL1.SPINTMASK
2022-10-14Merge tag 'mm-stable-2022-10-13' of ↵Linus Torvalds2-9/+15
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more MM updates from Andrew Morton: - fix a race which causes page refcounting errors in ZONE_DEVICE pages (Alistair Popple) - fix userfaultfd test harness instability (Peter Xu) - various other patches in MM, mainly fixes * tag 'mm-stable-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (29 commits) highmem: fix kmap_to_page() for kmap_local_page() addresses mm/page_alloc: fix incorrect PGFREE and PGALLOC for high-order page mm/selftest: uffd: explain the write missing fault check mm/hugetlb: use hugetlb_pte_stable in migration race check mm/hugetlb: fix race condition of uffd missing/minor handling zram: always expose rw_page LoongArch: update local TLB if PTE entry exists mm: use update_mmu_tlb() on the second thread kasan: fix array-bounds warnings in tests hmm-tests: add test for migrate_device_range() nouveau/dmem: evict device private memory during release nouveau/dmem: refactor nouveau_dmem_fault_copy_one() mm/migrate_device.c: add migrate_device_range() mm/migrate_device.c: refactor migrate_vma and migrate_deivce_coherent_page() mm/memremap.c: take a pgmap reference on page allocation mm: free device private pages have zero refcount mm/memory.c: fix race when faulting a device private page mm/damon: use damon_sz_region() in appropriate place mm/damon: move sz_damon_region to damon_sz_region lib/test_meminit: add checks for the allocation functions ...
2022-10-14Merge tag 'parisc-for-6.1-1' of ↵Linus Torvalds8-246/+61
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc updates from Helge Deller: "Fixes: - When we added basic vDSO support in kernel 5.18 we introduced a bug which prevented a mmap() of graphic card memory. This is because we used the DMB (data memory break trap bit) page flag as special-bit, but missed to clear that bit when loading the TLB. - Graphics card memory size was not correctly aligned - Spelling fixes (from Colin Ian King) Enhancements: - PDC console (which uses firmware calls) now rewritten as early console - Reduced size of alternative tables" * tag 'parisc-for-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Fix spelling mistake "mis-match" -> "mismatch" in eisa driver parisc: Fix userspace graphics card breakage due to pgtable special bit parisc: fbdev/stifb: Align graphics memory size to 4MB parisc: Convert PDC console to an early console parisc: Reduce kernel size by packing alternative tables
2022-10-14Merge tag 'riscv-for-linus-6.1-mw2' of ↵Linus Torvalds24-90/+613
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull more RISC-V updates from Palmer Dabbelt: - DT updates for the PolarFire SOC - a fix to correct the handling of write-only mappings - m{vetndor,arcd,imp}id is now in /proc/cpuinfo - the SiFive L2 cache controller support has been refactored to also support L3 caches - misc fixes, cleanups and improvements throughout the tree * tag 'riscv-for-linus-6.1-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits) MAINTAINERS: add RISC-V's patchwork RISC-V: Make port I/O string accessors actually work riscv: enable software resend of irqs RISC-V: Re-enable counter access from userspace riscv: vdso: fix NULL deference in vdso_join_timens() when vfork riscv: Add cache information in AUX vector soc: sifive: ccache: define the macro for the register shifts soc: sifive: ccache: use pr_fmt() to remove CCACHE: prefixes soc: sifive: ccache: reduce printing on init soc: sifive: ccache: determine the cache level from dts soc: sifive: ccache: Rename SiFive L2 cache to Composable cache. dt-bindings: sifive-ccache: change Sifive L2 cache to Composable cache riscv: check for kernel config option in t-head memory types errata riscv: use BIT() marco for cpufeature probing riscv: use BIT() macros in t-head errata init riscv: drop some idefs from CMO initialization riscv: cleanup svpbmt cpufeature probing riscv: Pass -mno-relax only on lld < 15.0.0 RISC-V: Avoid dereferening NULL regs in die() dt-bindings: riscv: add new riscv,isa strings for emulators ...
2022-10-14Merge tag 'powerpc-6.1-2' of ↵Linus Torvalds7-91/+149
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix 32-bit syscall wrappers with 64-bit arguments of unaligned register-pairs. Notably this broke ftruncate64 & pread/write64, which can lead to file corruption. - Fix lost interrupts when returning to soft-masked context on 64-bit. - Fix build failure when CONFIG_DTL=n. Thanks to Nicholas Piggin, Jason A. Donenfeld, Guenter Roeck, Arnd Bergmann, and Sachin Sant. * tag 'powerpc-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries: Fix CONFIG_DTL=n build powerpc/64s/interrupt: Fix lost interrupts when returning to soft-masked context powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs
2022-10-14parisc: Fix userspace graphics card breakage due to pgtable special bitHelge Deller2-1/+14
Commit df24e1783e6e ("parisc: Add vDSO support") introduced the vDSO support, for which a _PAGE_SPECIAL page table flag was needed. Since we wanted to keep every page table entry in 32-bits, this patch re-used the existing - but yet unused - _PAGE_DMB flag (which triggers a hardware break if a page is accessed) to store the special bit. But when graphics card memory is mmapped into userspace, the kernel uses vm_iomap_memory() which sets the the special flag. So, with the DMB bit set, every access to the graphics memory now triggered a hardware exception and segfaulted the userspace program. Fix this breakage by dropping the DMB bit when writing the page protection bits to the CPU TLB. In addition this patch adds a small optimization: if huge pages aren't configured (which is at least the case for 32-bit kernels), then the special bit is stored in the hpage (HUGE PAGE) bit instead. That way we can skip to reset the DMB bit. Fixes: df24e1783e6e ("parisc: Add vDSO support") Cc: <stable@vger.kernel.org> # 5.18+ Signed-off-by: Helge Deller <deller@gmx.de>
2022-10-14RISC-V: Make port I/O string accessors actually workMaciej W. Rozycki1-8/+8
Fix port I/O string accessors such as `insb', `outsb', etc. which use the physical PCI port I/O address rather than the corresponding memory mapping to get at the requested location, which in turn breaks at least accesses made by our parport driver to a PCIe parallel port such as: PCI parallel port detected: 1415:c118, I/O at 0x1000(0x1008), IRQ 20 parport0: PC-style at 0x1000 (0x1008), irq 20, using FIFO [PCSPP,TRISTATE,COMPAT,EPP,ECP] causing a memory access fault: Unable to handle kernel access to user memory without uaccess routines at virtual address 0000000000001008 Oops [#1] Modules linked in: CPU: 1 PID: 350 Comm: cat Not tainted 6.0.0-rc2-00283-g10d4879f9ef0-dirty #23 Hardware name: SiFive HiFive Unmatched A00 (DT) epc : parport_pc_fifo_write_block_pio+0x266/0x416 ra : parport_pc_fifo_write_block_pio+0xb4/0x416 epc : ffffffff80542c3e ra : ffffffff80542a8c sp : ffffffd88899fc60 gp : ffffffff80fa2700 tp : ffffffd882b1e900 t0 : ffffffd883d0b000 t1 : ffffffffff000002 t2 : 4646393043330a38 s0 : ffffffd88899fcf0 s1 : 0000000000001000 a0 : 0000000000000010 a1 : 0000000000000000 a2 : ffffffd883d0a010 a3 : 0000000000000023 a4 : 00000000ffff8fbb a5 : ffffffd883d0a001 a6 : 0000000100000000 a7 : ffffffc800000000 s2 : ffffffffff000002 s3 : ffffffff80d28880 s4 : ffffffff80fa1f50 s5 : 0000000000001008 s6 : 0000000000000008 s7 : ffffffd883d0a000 s8 : 0004000000000000 s9 : ffffffff80dc1d80 s10: ffffffd8807e4000 s11: 0000000000000000 t3 : 00000000000000ff t4 : 393044410a303930 t5 : 0000000000001000 t6 : 0000000000040000 status: 0000000200000120 badaddr: 0000000000001008 cause: 000000000000000f [<ffffffff80543212>] parport_pc_compat_write_block_pio+0xfe/0x200 [<ffffffff8053bbc0>] parport_write+0x46/0xf8 [<ffffffff8050530e>] lp_write+0x158/0x2d2 [<ffffffff80185716>] vfs_write+0x8e/0x2c2 [<ffffffff80185a74>] ksys_write+0x52/0xc2 [<ffffffff80185af2>] sys_write+0xe/0x16 [<ffffffff80003770>] ret_from_syscall+0x0/0x2 ---[ end trace 0000000000000000 ]--- For simplicity address the problem by adding PCI_IOBASE to the physical address requested in the respective wrapper macros only, observing that the raw accessors such as `__insb', `__outsb', etc. are not supposed to be used other than by said macros. Remove the cast to `long' that is no longer needed on `addr' now that it is used as an offset from PCI_IOBASE and add parentheses around `addr' needed for predictable evaluation in macro expansion. No need to make said adjustments in separate changes given that current code is gravely broken and does not ever work. Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Fixes: fab957c11efe2 ("RISC-V: Atomic and Locking Code") Cc: stable@vger.kernel.org # v4.15+ Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/alpine.DEB.2.21.2209220223080.29493@angie.orcam.me.uk Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-14RISC-V: Add mvendorid, marchid, and mimpid to /proc/cpuinfo outputPalmer Dabbelt1-0/+51
I'm merging this in as a single commit as it's a dependency for some other work. * commit '3baca1a4d490484fcd555413f1fec85b2e071912': RISC-V: Add mvendorid, marchid, and mimpid to /proc/cpuinfo output
2022-10-13RISC-V: Make mmap() with PROT_WRITE imply PROT_READPalmer Dabbelt2-4/+2
Commit 2139619bcad7 ("riscv: mmap with PROT_WRITE but no PROT_READ is invalid") made mmap() reject mappings with only PROT_WRITE set in an attempt to fix an observed inconsistency in behavior when attempting to read from a PROT_WRITE-only mapping. The root cause of this behavior was actually that while RISC-V's protection_map maps VM_WRITE to readable PTE permissions (since write-only PTEs are considered reserved by the privileged spec), the page fault handler considered loads from VM_WRITE-only VMAs illegal accesses. Fix the underlying cause by handling faults in VM_WRITE-only VMAs (patch 1) and then re-enable use of mmap(PROT_WRITE) (patch 2), making RISC-V's behavior consistent with all other architectures that don't support write-only PTEs. * remotes/palmer/riscv-wonly: riscv: Allow PROT_WRITE-only mmap() riscv: Make VM_WRITE imply VM_READ Link: https://lore.kernel.org/r/20220915193702.2201018-1-abrestic@rivosinc.com/ Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13riscv: enable software resend of irqsConor Dooley1-0/+1
The PLIC specification does not describe the interrupt pendings bits as read-write, only that they "can be read". To allow for retriggering of interrupts (and the use of the irq debugfs interface) enable HARDIRQS_SW_RESEND for RISC-V. Link: https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc#interrupt-pending-bits Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Tested-by: Palmer Dabbelt <palmer@rivosinc.com> # on QEMU Reviewed-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20220729111116.259146-1-conor.dooley@microchip.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13riscv: vdso: fix NULL deference in vdso_join_timens() when vforkJisheng Zhang2-4/+10
Testing tools/testing/selftests/timens/vfork_exec.c got below kernel log: [ 6.838454] Unable to handle kernel access to user memory without uaccess routines at virtual address 0000000000000020 [ 6.842255] Oops [#1] [ 6.842871] Modules linked in: [ 6.844249] CPU: 1 PID: 64 Comm: vfork_exec Not tainted 6.0.0-rc3-rt15+ #8 [ 6.845861] Hardware name: riscv-virtio,qemu (DT) [ 6.848009] epc : vdso_join_timens+0xd2/0x110 [ 6.850097] ra : vdso_join_timens+0xd2/0x110 [ 6.851164] epc : ffffffff8000635c ra : ffffffff8000635c sp : ff6000000181fbf0 [ 6.852562] gp : ffffffff80cff648 tp : ff60000000fdb700 t0 : 3030303030303030 [ 6.853852] t1 : 0000000000000030 t2 : 3030303030303030 s0 : ff6000000181fc40 [ 6.854984] s1 : ff60000001e6c000 a0 : 0000000000000010 a1 : ffffffff8005654c [ 6.856221] a2 : 00000000ffffefff a3 : 0000000000000000 a4 : 0000000000000000 [ 6.858114] a5 : 0000000000000000 a6 : 0000000000000008 a7 : 0000000000000038 [ 6.859484] s2 : ff60000001e6c068 s3 : ff6000000108abb0 s4 : 0000000000000000 [ 6.860751] s5 : 0000000000001000 s6 : ffffffff8089dc40 s7 : ffffffff8089dc38 [ 6.862029] s8 : ffffffff8089dc30 s9 : ff60000000fdbe38 s10: 000000000000005e [ 6.863304] s11: ffffffff80cc3510 t3 : ffffffff80d1112f t4 : ffffffff80d1112f [ 6.864565] t5 : ffffffff80d11130 t6 : ff6000000181fa00 [ 6.865561] status: 0000000000000120 badaddr: 0000000000000020 cause: 000000000000000d [ 6.868046] [<ffffffff8008dc94>] timens_commit+0x38/0x11a [ 6.869089] [<ffffffff8008dde8>] timens_on_fork+0x72/0xb4 [ 6.870055] [<ffffffff80190096>] begin_new_exec+0x3c6/0x9f0 [ 6.871231] [<ffffffff801d826c>] load_elf_binary+0x628/0x1214 [ 6.872304] [<ffffffff8018ee7a>] bprm_execve+0x1f2/0x4e4 [ 6.873243] [<ffffffff8018f90c>] do_execveat_common+0x16e/0x1ee [ 6.874258] [<ffffffff8018f9c8>] sys_execve+0x3c/0x48 [ 6.875162] [<ffffffff80003556>] ret_from_syscall+0x0/0x2 [ 6.877484] ---[ end trace 0000000000000000 ]--- This is because the mm->context.vdso_info is NULL in vfork case. From another side, mm->context.vdso_info either points to vdso info for RV64 or vdso info for compat, there's no need to bloat riscv's mm_context_t, we can handle the difference when setup the additional page for vdso. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Suggested-by: Palmer Dabbelt <palmer@rivosinc.com> Fixes: 3092eb456375 ("riscv: compat: vdso: Add setup additional pages implementation") Link: https://lore.kernel.org/r/20220924070737.3048-1-jszhang@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13Merge patch series "Use composable cache instead of L2 cache"Palmer Dabbelt2-1/+7
Zong Li <zong.li@sifive.com> says: Since composable cache may be L3 cache if private L2 cache exists, we should use its original name "composable cache" to prevent confusion. This patchset contains the modification which is related to ccache, such as DT binding and EDAC driver. * b4-shazam-merge: riscv: Add cache information in AUX vector soc: sifive: ccache: define the macro for the register shifts soc: sifive: ccache: use pr_fmt() to remove CCACHE: prefixes soc: sifive: ccache: reduce printing on init soc: sifive: ccache: determine the cache level from dts soc: sifive: ccache: Rename SiFive L2 cache to Composable cache. dt-bindings: sifive-ccache: change Sifive L2 cache to Composable cache Link: https://lore.kernel.org/r/20220913061817.22564-1-zong.li@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13riscv: Add cache information in AUX vectorGreentime Hu2-1/+7
There are no standard CSR registers to provide cache information, the way for RISC-V is to get this information from DT. sysconf syscall could use them to get information of cache through AUX vector. The result of 'getconf -a|grep -i cache' as follows: LEVEL1_ICACHE_SIZE 32768 LEVEL1_ICACHE_ASSOC 2 LEVEL1_ICACHE_LINESIZE 64 LEVEL1_DCACHE_SIZE 32768 LEVEL1_DCACHE_ASSOC 4 LEVEL1_DCACHE_LINESIZE 64 LEVEL2_CACHE_SIZE 524288 LEVEL2_CACHE_ASSOC 8 LEVEL2_CACHE_LINESIZE 64 LEVEL3_CACHE_SIZE 4194304 LEVEL3_CACHE_ASSOC 16 LEVEL3_CACHE_LINESIZE 64 LEVEL4_CACHE_SIZE 0 LEVEL4_CACHE_ASSOC 0 LEVEL4_CACHE_LINESIZE 0 Signed-off-by: Greentime Hu <greentime.hu@sifive.com> Signed-off-by: Zong Li <zong.li@sifive.com> Suggested-by: Zong Li <zong.li@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20220913061817.22564-8-zong.li@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13Merge patch series "Some style cleanups for recent extension additions"Palmer Dabbelt3-29/+26
Heiko Stuebner <heiko@sntech.de> says: As noted by some people, some parts of the recently added extensions (svpbmt, zicbom) + t-head errata could use some styling upgrades. So this series provides these. changes in v2: - add patch also converting cpufeature probe to BIT() - update commit message in patch1 (Conor) Heiko Stuebner (5): riscv: cleanup svpbmt cpufeature probing riscv: drop some idefs from CMO initialization riscv: use BIT() macros in t-head errata init riscv: use BIT() marco for cpufeature probing riscv: check for kernel config option in t-head memory types errata arch/riscv/errata/thead/errata.c | 14 ++++++----- arch/riscv/include/asm/cacheflush.h | 2 ++ arch/riscv/kernel/cpufeature.c | 39 ++++++++++++----------------- 3 files changed, 26 insertions(+), 29 deletions(-) Link: https://lore.kernel.org/r/20220905111027.2463297-1-heiko@sntech.de * b4-shazam-merge: riscv: check for kernel config option in t-head memory types errata riscv: use BIT() marco for cpufeature probing riscv: use BIT() macros in t-head errata init riscv: drop some idefs from CMO initialization riscv: cleanup svpbmt cpufeature probing Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13riscv: check for kernel config option in t-head memory types errataHeiko Stuebner1-0/+3
The t-head variant of page-based memory types should also check first for the enabled kernel config option. Fixes: a35707c3d850 ("riscv: add memory-type errata for T-Head") Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20220905111027.2463297-6-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13riscv: use BIT() marco for cpufeature probingHeiko Stuebner1-2/+2
Using the appropriate BIT macro makes the code better readable. Suggested-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20220905111027.2463297-5-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13riscv: use BIT() macros in t-head errata initHeiko Stuebner1-2/+2
Using the appropriate BIT macro makes the code better readable. Suggested-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20220905111027.2463297-4-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13riscv: drop some idefs from CMO initializationHeiko Stuebner3-17/+14
Wrapping things in #ifdefs makes the code harder to read while we also have IS_ENABLED() macros to do this in regular code and the extension detection is not _that_ runtime critical. So define a stub for riscv_noncoherent_supported() in the non-CONFIG_RISCV_DMA_NONCOHERENT case and move the code to us IS_ENABLED. Suggested-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20220905111027.2463297-3-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13riscv: cleanup svpbmt cpufeature probingHeiko Stuebner1-8/+5
For better readability (and compile time coverage) use IS_ENABLED instead of ifdef and drop the new unneeded switch statement. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20220905111027.2463297-2-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13riscv: Pass -mno-relax only on lld < 15.0.0Fangrui Song1-0/+2
lld since llvm:6611d58f5bbc ("[ELF] Relax R_RISCV_ALIGN"), which will be included in the 15.0.0 release, has implemented some RISC-V linker relaxation. -mno-relax is no longer needed in KBUILD_CFLAGS/KBUILD_AFLAGS to suppress R_RISCV_ALIGN which older lld can not handle: ld.lld: error: capability.c:(.fixup+0x0): relocation R_RISCV_ALIGN requires unimplemented linker relaxation; recompile with -mno-relax but the .o is already compiled with -mno-relax Signed-off-by: Fangrui Song <maskray@google.com> Link: https://lore.kernel.org/r/20220710071117.446112-1-maskray@google.com/ Link: https://lore.kernel.org/r/20220918092933.19943-1-palmer@rivosinc.com Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Conor Dooley <conor.dooley@microchip.com> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13powerpc/pseries: Fix CONFIG_DTL=n buildNicholas Piggin2-74/+80
The recently moved dtl code must be compiled-in if CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y even if CONFIG_DTL=n. Fixes: 6ba5aa541aaa0 ("powerpc/pseries: Move dtl scanning and steal time accounting to pseries platform") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221013073131.1485742-1-npiggin@gmail.com
2022-10-13powerpc/64s/interrupt: Fix lost interrupts when returning to soft-masked contextNicholas Piggin1-2/+13
It's possible for an interrupt returning to an irqs-disabled context to lose a pending soft-masked irq because it branches to part of the exit code for irqs-enabled contexts, which is meant to clear only the PACA_IRQS_HARD_DIS flag from PACAIRQHAPPENED by zeroing the byte. This just looks like a simple thinko from a recent commit (if there was no hard mask pending, there would be no reason to clear it anyway). This also adds comment to the code that actually does need to clear the flag. Fixes: e485f6c751e0a ("powerpc/64/interrupt: Fix return to masked context after hard-mask irq becomes pending") Reported-by: Sachin Sant <sachinp@linux.ibm.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221013064418.1311104-1-npiggin@gmail.com
2022-10-13RISC-V: Avoid dereferening NULL regs in die()Palmer Dabbelt1-3/+6
I don't think we can actually die() without a regs pointer, but the compiler was warning about a NULL check after a dereference. It seems prudent to just avoid the possibly-NULL dereference, given that when die()ing the system is already toast so who knows how we got there. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20220920200037.6727-1-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-13LoongArch: update local TLB if PTE entry existsQi Zheng1-0/+3
Currently, the implementation of update_mmu_tlb() is empty if __HAVE_ARCH_UPDATE_MMU_TLB is not defined. Then if two threads concurrently fault at the same page, the second thread that did not win the race will give up and do nothing. In the LoongArch architecture, this second thread will trigger another fault, and only updates its local TLB. Instead of triggering another fault, it's better to implement update_mmu_tlb() to directly update the local TLB of the second thread. Just do it. Link: https://lkml.kernel.org/r/20220929112318.32393-3-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Suggested-by: Bibo Mao <maobibo@loongson.cn> Acked-by: Huacai Chen <chenhuacai@loongson.cn> Cc: Chris Zankel <chris@zankel.net> Cc: David Hildenbrand <david@redhat.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-13mm: free device private pages have zero refcountAlistair Popple1-1/+1
Since 27674ef6c73f ("mm: remove the extra ZONE_DEVICE struct page refcount") device private pages have no longer had an extra reference count when the page is in use. However before handing them back to the owning device driver we add an extra reference count such that free pages have a reference count of one. This makes it difficult to tell if a page is free or not because both free and in use pages will have a non-zero refcount. Instead we should return pages to the drivers page allocator with a zero reference count. Kernel code can then safely use kernel functions such as get_page_unless_zero(). Link: https://lkml.kernel.org/r/cf70cf6f8c0bdb8aaebdbfb0d790aea4c683c3c6.1664366292.git-series.apopple@nvidia.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Lyude Paul <lyude@redhat.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Alex Sierra <alex.sierra@amd.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Yang Shi <shy828301@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-13mm/memory.c: fix race when faulting a device private pageAlistair Popple1-8/+11
Patch series "Fix several device private page reference counting issues", v2 This series aims to fix a number of page reference counting issues in drivers dealing with device private ZONE_DEVICE pages. These result in use-after-free type bugs, either from accessing a struct page which no longer exists because it has been removed or accessing fields within the struct page which are no longer valid because the page has been freed. During normal usage it is unlikely these will cause any problems. However without these fixes it is possible to crash the kernel from userspace. These crashes can be triggered either by unloading the kernel module or unbinding the device from the driver prior to a userspace task exiting. In modules such as Nouveau it is also possible to trigger some of these issues by explicitly closing the device file-descriptor prior to the task exiting and then accessing device private memory. This involves some minor changes to both PowerPC and AMD GPU code. Unfortunately I lack hardware to test either of those so any help there would be appreciated. The changes mimic what is done in for both Nouveau and hmm-tests though so I doubt they will cause problems. This patch (of 8): When the CPU tries to access a device private page the migrate_to_ram() callback associated with the pgmap for the page is called. However no reference is taken on the faulting page. Therefore a concurrent migration of the device private page can free the page and possibly the underlying pgmap. This results in a race which can crash the kernel due to the migrate_to_ram() function pointer becoming invalid. It also means drivers can't reliably read the zone_device_data field because the page may have been freed with memunmap_pages(). Close the race by getting a reference on the page while holding the ptl to ensure it has not been freed. Unfortunately the elevated reference count will cause the migration required to handle the fault to fail. To avoid this failure pass the faulting page into the migrate_vma functions so that if an elevated reference count is found it can be checked to see if it's expected or not. [mpe@ellerman.id.au: fix build] Link: https://lkml.kernel.org/r/87fsgbf3gh.fsf@mpe.ellerman.id.au Link: https://lkml.kernel.org/r/cover.60659b549d8509ddecafad4f498ee7f03bb23c69.1664366292.git-series.apopple@nvidia.com Link: https://lkml.kernel.org/r/d3e813178a59e565e8d78d9b9a4e2562f6494f90.1664366292.git-series.apopple@nvidia.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Lyude Paul <lyude@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Alex Sierra <alex.sierra@amd.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Yang Shi <shy828301@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-13Merge tag 'dt-for-palmer-v6.1-mw1' of ↵Palmer Dabbelt9-39/+498
git://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into for-next Microchip RISC-V devicetrees for v6.1 Fixups, reference design changes and new boards: - The addition of QSPI support for mpfs had a corresponding change to the devicetree node. - The v2022.{09,10} reference designs brought with them several memory map changes which are not backwards compatible. The old devicetrees from the v2022.08 and earlier releases still work with current kernels. - Two new devicetrees for a first-party development kit and for the Aries Embedded M100FPSEVP kit. - Corresponding dt-bindings changes for the above. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> * tag 'dt-for-palmer-v6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/conor/linux: riscv: dts: microchip: fix fabric i2c reg size riscv: dts: microchip: update memory configuration for v2022.10 riscv: dts: microchip: add a devicetree for aries' m100pfsevp riscv: dts: microchip: add sevkit device tree riscv: dts: microchip: reduce the fic3 clock rate riscv: dts: microchip: icicle: re-jig fabric peripheral addresses riscv: dts: microchip: icicle: update pci address properties riscv: dts: microchip: move the mpfs' pci node to -fabric.dtsi riscv: dts: microchip: add pci dma ranges for the icicle kit dt-bindings: riscv: microchip: document the sev kit dt-bindings: riscv: microchip: document the aries m100pfsevp dt-bindings: riscv: microchip: document icicle reference design riscv: dts: microchip: add qspi compatible fallback Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>