summaryrefslogtreecommitdiff
path: root/arch/riscv/kernel
AgeCommit message (Collapse)AuthorFilesLines
2021-11-17perf: Protect perf_guest_cbs with RCUSean Christopherson1-2/+5
Protect perf_guest_cbs with RCU to fix multiple possible errors. Luckily, all paths that read perf_guest_cbs already require RCU protection, e.g. to protect the callback chains, so only the direct perf_guest_cbs touchpoints need to be modified. Bug #1 is a simple lack of WRITE_ONCE/READ_ONCE behavior to ensure perf_guest_cbs isn't reloaded between a !NULL check and a dereference. Fixed via the READ_ONCE() in rcu_dereference(). Bug #2 is that on weakly-ordered architectures, updates to the callbacks themselves are not guaranteed to be visible before the pointer is made visible to readers. Fixed by the smp_store_release() in rcu_assign_pointer() when the new pointer is non-NULL. Bug #3 is that, because the callbacks are global, it's possible for readers to run in parallel with an unregisters, and thus a module implementing the callbacks can be unloaded while readers are in flight, resulting in a use-after-free. Fixed by a synchronize_rcu() call when unregistering callbacks. Bug #1 escaped notice because it's extremely unlikely a compiler will reload perf_guest_cbs in this sequence. perf_guest_cbs does get reloaded for future derefs, e.g. for ->is_user_mode(), but the ->is_in_guest() guard all but guarantees the consumer will win the race, e.g. to nullify perf_guest_cbs, KVM has to completely exit the guest and teardown down all VMs before KVM start its module unload / unregister sequence. This also makes it all but impossible to encounter bug #3. Bug #2 has not been a problem because all architectures that register callbacks are strongly ordered and/or have a static set of callbacks. But with help, unloading kvm_intel can trigger bug #1 e.g. wrapping perf_guest_cbs with READ_ONCE in perf_misc_flags() while spamming kvm_intel module load/unload leads to: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 6 PID: 1825 Comm: stress Not tainted 5.14.0-rc2+ #459 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:perf_misc_flags+0x1c/0x70 Call Trace: perf_prepare_sample+0x53/0x6b0 perf_event_output_forward+0x67/0x160 __perf_event_overflow+0x52/0xf0 handle_pmi_common+0x207/0x300 intel_pmu_handle_irq+0xcf/0x410 perf_event_nmi_handler+0x28/0x50 nmi_handle+0xc7/0x260 default_do_nmi+0x6b/0x170 exc_nmi+0x103/0x130 asm_exc_nmi+0x76/0xbf Fixes: 39447b386c84 ("perf: Enhance perf to allow for guest statistic collection from host") Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211111020738.2512932-2-seanjc@google.com
2021-11-13Merge tag 'riscv-for-linus-5.16-mw1' of ↵Linus Torvalds5-59/+228
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for time namespaces in the VDSO, along with some associated cleanups. - Support for building rv32 randconfigs. - Improvements to the XIP port that allow larger kernels to function - Various device tree cleanups for both the SiFive and Microchip boards - A handful of defconfig updates, including enabling Nouveau. There are also various small cleanups. * tag 'riscv-for-linus-5.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: defconfig: enable DRM_NOUVEAU riscv/vdso: Drop unneeded part due to merge issue riscv: remove .text section size limitation for XIP riscv: dts: sifive: add missing compatible for plic riscv: dts: microchip: add missing compatibles for clint and plic riscv: dts: sifive: drop duplicated nodes and properties in sifive riscv: dts: sifive: fix Unleashed board compatible riscv: dts: sifive: use only generic JEDEC SPI NOR flash compatible riscv: dts: microchip: use vendor compatible for Cadence SD4HC riscv: dts: microchip: drop unused pinctrl-names riscv: dts: microchip: drop duplicated MMC/SDHC node riscv: dts: microchip: fix board compatible riscv: dts: microchip: drop duplicated nodes dt-bindings: mmc: cdns: document Microchip MPFS MMC/SDHCI controller riscv: add rv32 and rv64 randconfig build targets riscv: mm: don't advertise 1 num_asid for 0 asid bits riscv: set default pm_power_off to NULL riscv/vdso: Add support for time namespaces
2021-11-07Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-2/+2
Merge misc updates from Andrew Morton: "257 patches. Subsystems affected by this patch series: scripts, ocfs2, vfs, and mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache, gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc, pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools, memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm, vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram, cleanups, kfence, and damon)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (257 commits) mm/damon: remove return value from before_terminate callback mm/damon: fix a few spelling mistakes in comments and a pr_debug message mm/damon: simplify stop mechanism Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions Docs/admin-guide/mm/damon/start: simplify the content Docs/admin-guide/mm/damon/start: fix a wrong link Docs/admin-guide/mm/damon/start: fix wrong example commands mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on mm/damon: remove unnecessary variable initialization Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM) selftests/damon: support watermarks mm/damon/dbgfs: support watermarks mm/damon/schemes: activate schemes based on a watermarks mechanism tools/selftests/damon: update for regions prioritization of schemes mm/damon/dbgfs: support prioritization weights mm/damon/vaddr,paddr: support pageout prioritization mm/damon/schemes: prioritize regions within the quotas mm/damon/selftests: support schemes quotas mm/damon/dbgfs: support quotas of schemes ...
2021-11-06memblock: use memblock_free for freeing virtual pointersMike Rapoport1-3/+2
Rename memblock_free_ptr() to memblock_free() and use memblock_free() when freeing a virtual pointer so that memblock_free() will be a counterpart of memblock_alloc() The callers are updated with the below semantic patch and manual addition of (void *) casting to pointers that are represented by unsigned long variables. @@ identifier vaddr; expression size; @@ ( - memblock_phys_free(__pa(vaddr), size); + memblock_free(vaddr, size); | - memblock_free_ptr(vaddr, size); + memblock_free(vaddr, size); ) [sfr@canb.auug.org.au: fixup] Link: https://lkml.kernel.org/r/20211018192940.3d1d532f@canb.auug.org.au Link: https://lkml.kernel.org/r/20210930185031.18648-7-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Juergen Gross <jgross@suse.com> Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06memblock: rename memblock_free to memblock_phys_freeMike Rapoport1-2/+3
Since memblock_free() operates on a physical range, make its name reflect it and rename it to memblock_phys_free(), so it will be a logical counterpart to memblock_phys_alloc(). The callers are updated with the below semantic patch: @@ expression addr; expression size; @@ - memblock_free(addr, size); + memblock_phys_free(addr, size); Link: https://lkml.kernel.org/r/20210930185031.18648-6-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Juergen Gross <jgross@suse.com> Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-03Merge tag 'devicetree-for-5.16' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree updates from Rob Herring: - Convert /reserved-memory bindings to schemas - Convert a bunch of NFC bindings to schemas - Convert bindings to schema: Xilinx USB, Freescale DDR controller, Arm CCI-400, UBlox Neo-6M, 1-Wire GPIO, MSI controller, ASpeed LPC, OMAP and Inside-Secure HWRNG, register-bit-led, OV5640, Silead GSL1680, Elan ekth3000, Marvell bluetooth, TI wlcore, TI bluetooth, ESP ESP8089, tlm,trusted-foundations, Microchip cap11xx, Ralink SoCs and boards, and TI sysc - New binding schemas for: msi-ranges, Aspeed UART routing controller, palmbus, Xylon LogiCVC display controller, Mediatek's MT7621 SDRAM memory controller, and Apple M1 PCIe host - Run schema checks for %.dtb targets - Improve build time when using DT_SCHEMA_FILES - Improve error message when dtschema is not found - Various doc reference fixes in MAINTAINERS - Convert architectures to common CPU h/w ID parsing function of_get_cpu_hwid(). - Allow for empty NUMA node IDs which may be hotplugged - Cleanup of __fdt_scan_reserved_mem() - Constify device_node parameters - Update dtc to upstream v1.6.1-19-g0a3a9d3449c8. Adds new checks 'node_name_vs_property_name' and 'interrupt_map'. - Enable dtc 'unit_address_format' warning by default - Fix unittest EXPECT text for gpio hog errors * tag 'devicetree-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (97 commits) dt-bindings: net: ti,bluetooth: Document default max-speed dt-bindings: pci: rcar-pci-ep: Document r8a7795 dt-bindings: net: qcom,ipa: IPA does support up to two iommus of/fdt: Remove of_scan_flat_dt() usage for __fdt_scan_reserved_mem() of: unittest: document intentional interrupt-map provider build warning of: unittest: fix EXPECT text for gpio hog errors of/unittest: Disable new dtc node_name_vs_property_name and interrupt_map warnings scripts/dtc: Update to upstream version v1.6.1-19-g0a3a9d3449c8 dt-bindings: arm: firmware: tlm,trusted-foundations: Convert txt bindings to yaml dt-bindings: display: tilcd: Fix endpoint addressing in example dt-bindings: input: microchip,cap11xx: Convert txt bindings to yaml dt-bindings: ufs: exynos-ufs: add exynosautov9 compatible dt-bindings: ufs: exynos-ufs: add io-coherency property dt-bindings: mips: convert Ralink SoCs and boards to schema dt-bindings: display: xilinx: Fix example with psgtr dt-bindings: net: nfc: nxp,pn544: Convert txt bindings to yaml dt-bindings: Add a help message when dtschema tools are missing dt-bindings: bus: ti-sysc: Update to use yaml binding dt-bindings: sram: Allow numbers in sram region node name dt-bindings: display: Document the Xylon LogiCVC display controller ...
2021-11-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+156
Pull KVM updates from Paolo Bonzini: "ARM: - More progress on the protected VM front, now with the full fixed feature set as well as the limitation of some hypercalls after initialisation. - Cleanup of the RAZ/WI sysreg handling, which was pointlessly complicated - Fixes for the vgic placement in the IPA space, together with a bunch of selftests - More memcg accounting of the memory allocated on behalf of a guest - Timer and vgic selftests - Workarounds for the Apple M1 broken vgic implementation - KConfig cleanups - New kvmarm.mode=none option, for those who really dislike us RISC-V: - New KVM port. x86: - New API to control TSC offset from userspace - TSC scaling for nested hypervisors on SVM - Switch masterclock protection from raw_spin_lock to seqcount - Clean up function prototypes in the page fault code and avoid repeated memslot lookups - Convey the exit reason to userspace on emulation failure - Configure time between NX page recovery iterations - Expose Predictive Store Forwarding Disable CPUID leaf - Allocate page tracking data structures lazily (if the i915 KVM-GT functionality is not compiled in) - Cleanups, fixes and optimizations for the shadow MMU code s390: - SIGP Fixes - initial preparations for lazy destroy of secure VMs - storage key improvements/fixes - Log the guest CPNC Starting from this release, KVM-PPC patches will come from Michael Ellerman's PPC tree" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) RISC-V: KVM: fix boolreturn.cocci warnings RISC-V: KVM: remove unneeded semicolon RISC-V: KVM: Fix GPA passed to __kvm_riscv_hfence_gvma_xyz() functions RISC-V: KVM: Factor-out FP virtualization into separate sources KVM: s390: add debug statement for diag 318 CPNC data KVM: s390: pv: properly handle page flags for protected guests KVM: s390: Fix handle_sske page fault handling KVM: x86: SGX must obey the KVM_INTERNAL_ERROR_EMULATION protocol KVM: x86: On emulation failure, convey the exit reason, etc. to userspace KVM: x86: Get exit_reason as part of kvm_x86_ops.get_exit_info KVM: x86: Clarify the kvm_run.emulation_failure structure layout KVM: s390: Add a routine for setting userspace CPU state KVM: s390: Simplify SIGP Set Arch handling KVM: s390: pv: avoid stalls when making pages secure KVM: s390: pv: avoid stalls for kvm_s390_pv_init_vm KVM: s390: pv: avoid double free of sida page KVM: s390: pv: add macros for UVC CC values s390/mm: optimize reset_guest_reference_bit() s390/mm: optimize set_guest_storage_key() s390/mm: no need for pte_alloc_map_lock() if we know the pmd is present ...
2021-11-02Merge tag 'trace-v5.16' of ↵Linus Torvalds4-17/+9
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - kprobes: Restructured stack unwinder to show properly on x86 when a stack dump happens from a kretprobe callback. - Fix to bootconfig parsing - Have tracefs allow owner and group permissions by default (only denying others). There's been pressure to allow non root to tracefs in a controlled fashion, and using groups is probably the safest. - Bootconfig memory managament updates. - Bootconfig clean up to have the tools directory be less dependent on changes in the kernel tree. - Allow perf to be traced by function tracer. - Rewrite of function graph tracer to be a callback from the function tracer instead of having its own trampoline (this change will happen on an arch by arch basis, and currently only x86_64 implements it). - Allow multiple direct trampolines (bpf hooks to functions) be batched together in one synchronization. - Allow histogram triggers to add variables that can perform calculations against the event's fields. - Use the linker to determine architecture callbacks from the ftrace trampoline to allow for proper parameter prototypes and prevent warnings from the compiler. - Extend histogram triggers to key off of variables. - Have trace recursion use bit magic to determine preempt context over if branches. - Have trace recursion disable preemption as all use cases do anyway. - Added testing for verification of tracing utilities. - Various small clean ups and fixes. * tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (101 commits) tracing/histogram: Fix semicolon.cocci warnings tracing/histogram: Fix documentation inline emphasis warning tracing: Increase PERF_MAX_TRACE_SIZE to handle Sentinel1 and docker together tracing: Show size of requested perf buffer bootconfig: Initialize ret in xbc_parse_tree() ftrace: do CPU checking after preemption disabled ftrace: disable preemption when recursion locked tracing/histogram: Document expression arithmetic and constants tracing/histogram: Optimize division by a power of 2 tracing/histogram: Covert expr to const if both operands are constants tracing/histogram: Simplify handling of .sym-offset in expressions tracing: Fix operator precedence for hist triggers expression tracing: Add division and multiplication support for hist triggers tracing: Add support for creating hist trigger variables from literal selftests/ftrace: Stop tracing while reading the trace file by default MAINTAINERS: Update KPROBES and TRACING entries test_kprobes: Move it from kernel/ to lib/ docs, kprobes: Remove invalid URL and add new reference samples/kretprobes: Fix return value if register_kretprobe() failed lib/bootconfig: Fix the xbc_get_info kerneldoc ...
2021-11-02Merge tag 'cpu-to-thread_info-v5.16-rc1' of ↵Linus Torvalds3-7/+0
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull thread_info update to move 'cpu' back from task_struct from Kees Cook: "Cross-architecture update to move task_struct::cpu back into thread_info on arm64, x86, s390, powerpc, and riscv. All Acked by arch maintainers. Quoting Ard Biesheuvel: 'Move task_struct::cpu back into thread_info Keeping CPU in task_struct is problematic for architectures that define raw_smp_processor_id() in terms of this field, as it requires linux/sched.h to be included, which causes a lot of pain in terms of circular dependencies (aka 'header soup') This series moves it back into thread_info (where it came from) for all architectures that enable THREAD_INFO_IN_TASK, addressing the header soup issue as well as some pointless differences in the implementations of task_cpu() and set_task_cpu()'" * tag 'cpu-to-thread_info-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: riscv: rely on core code to keep thread_info::cpu updated powerpc: smp: remove hack to obtain offset of task_struct::cpu sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y powerpc: add CPU field to struct thread_info s390: add CPU field to struct thread_info x86: add CPU field to struct thread_info arm64: add CPU field to struct thread_info
2021-11-01Merge tag 'sched-core-2021-11-01' of ↵Linus Torvalds1-7/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Thomas Gleixner: - Revert the printk format based wchan() symbol resolution as it can leak the raw value in case that the symbol is not resolvable. - Make wchan() more robust and work with all kind of unwinders by enforcing that the task stays blocked while unwinding is in progress. - Prevent sched_fork() from accessing an invalid sched_task_group - Improve asymmetric packing logic - Extend scheduler statistics to RT and DL scheduling classes and add statistics for bandwith burst to the SCHED_FAIR class. - Properly account SCHED_IDLE entities - Prevent a potential deadlock when initial priority is assigned to a newly created kthread. A recent change to plug a race between cpuset and __sched_setscheduler() introduced a new lock dependency which is now triggered. Break the lock dependency chain by moving the priority assignment to the thread function. - Fix the idle time reporting in /proc/uptime for NOHZ enabled systems. - Improve idle balancing in general and especially for NOHZ enabled systems. - Provide proper interfaces for live patching so it does not have to fiddle with scheduler internals. - Add cluster aware scheduling support. - A small set of tweaks for RT (irqwork, wait_task_inactive(), various scheduler options and delaying mmdrop) - The usual small tweaks and improvements all over the place * tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (69 commits) sched/fair: Cleanup newidle_balance sched/fair: Remove sysctl_sched_migration_cost condition sched/fair: Wait before decaying max_newidle_lb_cost sched/fair: Skip update_blocked_averages if we are defering load balance sched/fair: Account update_blocked_averages in newidle_balance cost x86: Fix __get_wchan() for !STACKTRACE sched,x86: Fix L2 cache mask sched/core: Remove rq_relock() sched: Improve wake_up_all_idle_cpus() take #2 irq_work: Also rcuwait for !IRQ_WORK_HARD_IRQ on PREEMPT_RT irq_work: Handle some irq_work in a per-CPU thread on PREEMPT_RT irq_work: Allow irq_work_sync() to sleep if irq_work() no IRQ support. sched/rt: Annotate the RT balancing logic irqwork as IRQ_WORK_HARD_IRQ sched: Add cluster scheduler level for x86 sched: Add cluster scheduler level in core and related Kconfig for ARM64 topology: Represent clusters of CPUs within a die sched: Disable -Wunused-but-set-variable sched: Add wrapper for get_wchan() to keep task blocked x86: Fix get_wchan() to support the ORC unwinder proc: Use task_is_running() for wchan in /proc/$pid/stat ...
2021-11-01Merge tag 'irq-core-2021-10-31' of ↵Linus Torvalds2-10/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Updates for the interrupt subsystem: Core changes: - Prevent a potential deadlock when initial priority is assigned to a newly created interrupt thread. A recent change to plug a race between cpuset and __sched_setscheduler() introduced a new lock dependency which is now triggered. Break the lock dependency chain by moving the priority assignment to the thread function. - A couple of small updates to make the irq core RT safe. - Confine the irq_cpu_online/offline() API to the only left unfixable user Cavium Octeon so that it does not grow new usage. - A small documentation update Driver changes: - A large cross architecture rework to move irq_enter/exit() into the architecture code to make addressing the NOHZ_FULL/RCU issues simpler. - The obligatory new irq chip driver for Microchip EIC - Modularize a few irq chip drivers - Expand usage of devm_*() helpers throughout the driver code - The usual small fixes and improvements all over the place" * tag 'irq-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits) h8300: Fix linux/irqchip.h include mess dt-bindings: irqchip: renesas-irqc: Document r8a774e1 bindings MIPS: irq: Avoid an unused-variable error genirq: Hide irq_cpu_{on,off}line() behind a deprecated option irqchip/mips-gic: Get rid of the reliance on irq_cpu_online() MIPS: loongson64: Drop call to irq_cpu_offline() irq: remove handle_domain_{irq,nmi}() irq: remove CONFIG_HANDLE_DOMAIN_IRQ_IRQENTRY irq: riscv: perform irqentry in entry code irq: openrisc: perform irqentry in entry code irq: csky: perform irqentry in entry code irq: arm64: perform irqentry in entry code irq: arm: perform irqentry in entry code irq: add a (temporary) CONFIG_HANDLE_DOMAIN_IRQ_IRQENTRY irq: nds32: avoid CONFIG_HANDLE_DOMAIN_IRQ irq: arc: avoid CONFIG_HANDLE_DOMAIN_IRQ irq: add generic_handle_arch_irq() irq: unexport handle_irq_desc() irq: simplify handle_domain_{irq,nmi}() irq: mips: simplify do_domain_IRQ() ...
2021-10-27riscv: fix misalgned trap vector base addressChen Lu1-0/+1
The trap vector marked by label .Lsecondary_park must align on a 4-byte boundary, as the {m,s}tvec is defined to require 4-byte alignment. Signed-off-by: Chen Lu <181250012@smail.nju.edu.cn> Reviewed-by: Anup Patel <anup.patel@wdc.com> Fixes: e011995e826f ("RISC-V: Move relocate and few other functions out of __init") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-27ftrace: disable preemption when recursion locked王贇1-2/+0
As the documentation explained, ftrace_test_recursion_trylock() and ftrace_test_recursion_unlock() were supposed to disable and enable preemption properly, however currently this work is done outside of the function, which could be missing by mistake. And since the internal using of trace_test_and_set_recursion() and trace_clear_recursion() also require preemption disabled, we can just merge the logical. This patch will make sure the preemption has been disabled when trace_test_and_set_recursion() return bit >= 0, and trace_clear_recursion() will enable the preemption if previously enabled. Link: https://lkml.kernel.org/r/13bde807-779c-aa4c-0672-20515ae365ea@linux.alibaba.com CC: Petr Mladek <pmladek@suse.com> Cc: Guo Ren <guoren@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Jisheng Zhang <jszhang@kernel.org> CC: Steven Rostedt <rostedt@goodmis.org> CC: Miroslav Benes <mbenes@suse.cz> Reported-by: Abaci <abaci@linux.alibaba.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com> [ Removed extra line in comment - SDR ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-27riscv: remove .text section size limitation for XIPVitaly Wool2-3/+19
Currently there's a limit of 8MB for the .text section of a RISC-V image in the XIP case. This breaks compilation of many automatic builds and is generally inconvenient. This patch removes that limitation and optimizes XIP image file size at the same time. Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-26irq: riscv: perform irqentry in entry codeMark Rutland2-10/+2
In preparation for removing HANDLE_DOMAIN_IRQ_IRQENTRY, have arch/riscv perform all the irqentry accounting in its entry code. As arch/riscv uses GENERIC_IRQ_MULTI_HANDLER, we can use generic_handle_arch_irq() to do so. Since generic_handle_arch_irq() handles the irq entry and setting the irq regs, and happens before the irqchip code calls handle_IPI(), we can remove the redundant irq entry and irq regs manipulation from handle_IPI(). There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Guo Ren <guoren@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Thomas Gleixner <tglx@linutronix.de>
2021-10-20riscv: Use of_get_cpu_hwid()Rob Herring1-1/+2
Replace open coded parsing of CPU nodes' 'reg' property with of_get_cpu_hwid(). Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: linux-riscv@lists.infradead.org Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20211006164332.1981454-9-robh@kernel.org
2021-10-15sched: Add wrapper for get_wchan() to keep task blockedKees Cook1-7/+5
Having a stable wchan means the process must be blocked and for it to stay that way while performing stack unwinding. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm] Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64] Link: https://lkml.kernel.org/r/20211008111626.332092234@infradead.org
2021-10-09ftrace: Cleanup ftrace_dyn_arch_init()Weizhao Ouyang1-5/+0
Most of ARCHs use empty ftrace_dyn_arch_init(), introduce a weak common ftrace_dyn_arch_init() to cleanup them. Link: https://lkml.kernel.org/r/20210909090216.1955240-1-o451686892@gmail.com Acked-by: Heiko Carstens <hca@linux.ibm.com> (s390) Acked-by: Helge Deller <deller@gmx.de> (parisc) Signed-off-by: Weizhao Ouyang <o451686892@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-05riscv: set default pm_power_off to NULLDimitri John Ledkov1-3/+9
Set pm_power_off to NULL like on all other architectures, check if it is set in machine_halt() and machine_power_off() and fallback to default_power_off if no other power driver got registered. This brings riscv architecture inline with all other architectures, and allows to reuse exiting power drivers unmodified. Kernels without legacy SBI v0.1 extensions (CONFIG_RISCV_SBI_V01 is not set), do not set pm_power_off to sbi_shutdown(). There is no support for SBI v0.3 system reset extension either. This prevents using gpio_poweroff on SiFive HiFive Unmatched. Tested on SiFive HiFive unmatched, with a dtb specifying gpio-poweroff node and kernel complied without CONFIG_RISCV_SBI_V01. BugLink: https://bugs.launchpad.net/bugs/1942806 Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com> Reviewed-by: Anup Patel <anup@brainfault.org> Tested-by: Ron Economos <w6rz@comcast.net> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-05riscv/vdso: Add support for time namespacesTong Tiangen2-53/+200
Implement generic vdso time namespace support which also enables time namespaces for riscv. This is quite similar to what arm64 does. selftest/timens test result: 1..10 ok 1 Passed for CLOCK_BOOTTIME (syscall) ok 2 Passed for CLOCK_BOOTTIME (vdso) ok 3 # SKIP CLOCK_BOOTTIME_ALARM isn't supported ok 4 # SKIP CLOCK_BOOTTIME_ALARM isn't supported ok 5 Passed for CLOCK_MONOTONIC (syscall) ok 6 Passed for CLOCK_MONOTONIC (vdso) ok 7 Passed for CLOCK_MONOTONIC_COARSE (syscall) ok 8 Passed for CLOCK_MONOTONIC_COARSE (vdso) ok 9 Passed for CLOCK_MONOTONIC_RAW (syscall) ok 10 Passed for CLOCK_MONOTONIC_RAW (vdso) # Totals: pass:8 fail:0 xfail:0 xpass:0 skip:2 error:0 Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04RISC-V: KVM: FP lazy save/restoreAtish Patra1-0/+72
This patch adds floating point (F and D extension) context save/restore for guest VCPUs. The FP context is saved and restored lazily only when kernel enter/exits the in-kernel run loop and not during the KVM world switch. This way FP save/restore has minimal impact on KVM performance. Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Anup Patel <anup.patel@wdc.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alexander Graf <graf@amazon.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04RISC-V: KVM: Handle MMIO exits for VCPUAnup Patel1-0/+6
We will get stage2 page faults whenever Guest/VM access SW emulated MMIO device or unmapped Guest RAM. This patch implements MMIO read/write emulation by extracting MMIO details from the trapped load/store instruction and forwarding the MMIO read/write to user-space. The actual MMIO emulation will happen in user-space and KVM kernel module will only take care of register updates before resuming the trapped VCPU. The handling for stage2 page faults for unmapped Guest RAM will be implemeted by a separate patch later. [jiangyifei: ioeventfd and in-kernel mmio device support] Signed-off-by: Yifei Jiang <jiangyifei@huawei.com> Signed-off-by: Anup Patel <anup.patel@wdc.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alexander Graf <graf@amazon.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04RISC-V: KVM: Implement VCPU world-switchAnup Patel1-0/+78
This patch implements the VCPU world-switch for KVM RISC-V. The KVM RISC-V world-switch (i.e. __kvm_riscv_switch_to()) mostly switches general purpose registers, SSTATUS, STVEC, SSCRATCH and HSTATUS CSRs. Other CSRs are switched via vcpu_load() and vcpu_put() interface in kvm_arch_vcpu_load() and kvm_arch_vcpu_put() functions respectively. Signed-off-by: Anup Patel <anup.patel@wdc.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alexander Graf <graf@amazon.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-02riscv/vdso: make arch_setup_additional_pages wait for mmap_sem for write ↵Tong Tiangen1-1/+3
killable riscv architectures relying on mmap_sem for write in their arch_setup_additional_pages. If the waiting task gets killed by the oom killer it would block oom_reaper from asynchronous address space reclaim and reduce the chances of timely OOM resolving. Wait for the lock in the killable mode and return with EINTR if the task got killed while waiting. Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code") Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-02riscv/vdso: Move vdso data page up frontTong Tiangen2-19/+28
As commit 601255ae3c98 ("arm64: vdso: move data page before code pages"), the same issue exists on riscv, testcase is shown below, make sure that vdso.so is bigger than page size, struct timespec tp; clock_gettime(5, &tp); printf("tv_sec: %ld, tv_nsec: %ld\n", tp.tv_sec, tp.tv_nsec); without this patch, test result : tv_sec: 0, tv_nsec: 0 with this patch, test result : tv_sec: 1629271537, tv_nsec: 748000000 Move the vdso data page in front of the VDSO area to fix the issue. Fixes: ad5d1122b82fb ("riscv: use vDSO common flow to reduce the latency of the time-related functions") Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-02riscv/vdso: Refactor asm/vdso.hTong Tiangen2-2/+4
The asm/vdso.h will be included in vdso.lds.S in the next patch, the following cleanup is needed to avoid syntax error: 1.the declaration of sys_riscv_flush_icache() is moved into asm/syscall.h. 2.the definition of struct vdso_data is moved into kernel/vdso.c. 2.the definition of VDSO_SYMBOL is placed under "#ifndef __ASSEMBLY__". Also remove the redundant linux/types.h include. Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-01kprobes: treewide: Make it harder to refer kretprobe_trampoline directlyMasami Hiramatsu2-3/+3
Since now there is kretprobe_trampoline_addr() for referring the address of kretprobe trampoline code, we don't need to access kretprobe_trampoline directly. Make it harder to refer by renaming it to __kretprobe_trampoline(). Link: https://lkml.kernel.org/r/163163045446.489837.14510577516938803097.stgit@devnote2 Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-01kprobes: treewide: Remove trampoline_address from kretprobe_trampoline_handler()Masami Hiramatsu1-1/+1
The __kretprobe_trampoline_handler() callback, called from low level arch kprobes methods, has the 'trampoline_address' parameter, which is entirely superfluous as it basically just replicates: dereference_kernel_function_descriptor(kretprobe_trampoline) In fact we had bugs in arch code where it wasn't replicated correctly. So remove this superfluous parameter and use kretprobe_trampoline_addr() instead. Link: https://lkml.kernel.org/r/163163044546.489837.13505751885476015002.stgit@devnote2 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-01kprobes: treewide: Cleanup the error messages for kprobesMasami Hiramatsu1-6/+5
This clean up the error/notification messages in kprobes related code. Basically this defines 'pr_fmt()' macros for each files and update the messages which describes - what happened, - what is the kernel going to do or not do, - is the kernel fine, - what can the user do about it. Also, if the message is not needed (e.g. the function returns unique error code, or other error message is already shown.) remove it, and replace the message with WARN_*() macros if suitable. Link: https://lkml.kernel.org/r/163163036568.489837.14085396178727185469.stgit@devnote2 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-30riscv: rely on core code to keep thread_info::cpu updatedArd Biesheuvel3-7/+0
Now that the core code switched back to using thread_info::cpu to keep a task's CPU number, we no longer need to keep it in sync explicitly. So just drop the code that does this. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com>
2021-09-12Merge tag 'smp-urgent-2021-09-12' of ↵Linus Torvalds1-5/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull CPU hotplug updates from Thomas Gleixner: "Updates for the SMP and CPU hotplug: - Remove DEFINE_SMP_CALL_CACHE_FUNCTION() which is a left over of the original hotplug code and now causing trouble with the ARM64 cache topology setup due to the pointless SMP function call. It's not longer required as the hotplug callbacks are guaranteed to be invoked on the upcoming CPU. - Remove the deprecated and now unused CPU hotplug functions - Rewrite the CPU hotplug API documentation" * tag 'smp-urgent-2021-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation: core-api/cpuhotplug: Rewrite the API section cpu/hotplug: Remove deprecated CPU-hotplug functions. thermal: Replace deprecated CPU-hotplug functions. drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION()
2021-09-12Merge tag 'riscv-for-linus-5.15-mw1' of ↵Linus Torvalds2-3/+2
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull more RISC-V updates from Palmer Dabbelt: - A pair of defconfig additions, for NVMe and the EFI filesystem localization options. - A larger address space for stack randomization. - A cleanup to our install rules. - A DTS update for the Microchip Icicle board, to fix the serial console. - Support for build-time table sorting, which allows us to have __ex_table read-only. * tag 'riscv-for-linus-5.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Move EXCEPTION_TABLE to RO_DATA segment riscv: Enable BUILDTIME_TABLE_SORT riscv: dts: microchip: mpfs-icicle: Fix serial console riscv: move the (z)install rules to arch/riscv/Makefile riscv: Improve stack randomisation on RV64 riscv: defconfig: enable NLS_CODEPAGE_437, NLS_ISO8859_1 riscv: defconfig: enable BLK_DEV_NVME
2021-09-11riscv: Move EXCEPTION_TABLE to RO_DATA segmentJisheng Zhang2-3/+2
_ex_table section is read-only, so move it to RO_DATA. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-09-11Merge branch 'linus' into smp/urgentThomas Gleixner7-29/+132
Ensure that all usage sites of get/put_online_cpus() except for the struggler in drivers/thermal are gone. So the last user and the deprecated inlines can be removed.
2021-09-08Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-5/+0
Merge more updates from Andrew Morton: "147 patches, based on 7d2a07b769330c34b4deabeed939325c77a7ec2f. Subsystems affected by this patch series: mm (memory-hotplug, rmap, ioremap, highmem, cleanups, secretmem, kfence, damon, and vmscan), alpha, percpu, procfs, misc, core-kernel, MAINTAINERS, lib, checkpatch, epoll, init, nilfs2, coredump, fork, pids, criu, kconfig, selftests, ipc, and scripts" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits) scripts: check_extable: fix typo in user error message mm/workingset: correct kernel-doc notations ipc: replace costly bailout check in sysvipc_find_ipc() selftests/memfd: remove unused variable Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH configs: remove the obsolete CONFIG_INPUT_POLLDEV prctl: allow to setup brk for et_dyn executables pid: cleanup the stale comment mentioning pidmap_init(). kernel/fork.c: unexport get_{mm,task}_exe_file coredump: fix memleak in dump_vma_snapshot() fs/coredump.c: log if a core dump is aborted due to changed file permissions nilfs2: use refcount_dec_and_lock() to fix potential UAF nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group nilfs2: fix NULL pointer in nilfs_##name##_attr_release nilfs2: fix memory leak in nilfs_sysfs_create_device_group trap: cleanup trap_init() init: move usermodehelper_enable() to populate_rootfs() ...
2021-09-08trap: cleanup trap_init()Kefeng Wang1-5/+0
There are some empty trap_init() definitions in different ARCHs, Introduce a new weak trap_init() function to clean them up. Link: https://lkml.kernel.org/r/20210812123602.76356-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm32] Acked-by: Vineet Gupta [arc] Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <palmerdabbelt@google.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-05Merge tag 'riscv-for-linus-5.15-mw0' of ↵Linus Torvalds6-24/+132
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - support PC-relative instructions (auipc and branches) in kprobes - support for forced IRQ threading - support for the hlt/nohlt kernel command line options, via the generic idle loop - show the edge/level triggered behavior of interrupts in /proc/interrupts - a handful of cleanups to our address mapping mechanisms - support for allocating gigantic hugepages via CMA - support for the undefined behavior sanitizer (UBSAN) - a handful of cleanups to the VDSO that allow the kernel to build with LLD. - support for hugepage migration * tag 'riscv-for-linus-5.15-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (21 commits) riscv: add support for hugepage migration RISC-V: Fix VDSO build for !MMU riscv: use strscpy to replace strlcpy riscv: explicitly use symbol offsets for VDSO riscv: Enable Undefined Behavior Sanitizer UBSAN riscv: Keep the riscv Kconfig selects sorted riscv: Support allocating gigantic hugepages using CMA riscv: fix the global name pfn_base confliction error riscv: Move early fdt mapping creation in its own function riscv: Simplify BUILTIN_DTB device tree mapping handling riscv: Use __maybe_unused instead of #ifdefs around variable declarations riscv: Get rid of map_size parameter to create_kernel_page_table riscv: Introduce va_kernel_pa_offset for 32-bit kernel riscv: Optimize kernel virtual address conversion macro dt-bindings: riscv: add starfive jh7100 bindings riscv: Enable GENERIC_IRQ_SHOW_LEVEL riscv: Enable idle generic idle loop riscv: Allow forced irq threading riscv: Implement thread_struct whitelist for hardened usercopy riscv: kprobes: implement the branch instructions ...
2021-09-01drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION()Thomas Gleixner1-5/+2
DEFINE_SMP_CALL_CACHE_FUNCTION() was usefel before the CPU hotplug rework to ensure that the cache related functions are called on the upcoming CPU because the notifier itself could run on any online CPU. The hotplug state machine guarantees that the callbacks are invoked on the upcoming CPU. So there is no need to have this SMP function call obfuscation. That indirection was missed when the hotplug notifiers were converted. This also solves the problem of ARM64 init_cache_level() invoking ACPI functions which take a semaphore in that context. That's invalid as SMP function calls run with interrupts disabled. Running it just from the callback in context of the CPU hotplug thread solves this. Fixes: 8571890e1513 ("arm64: Add support for ACPI based firmware tables") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/871r69ersb.ffs@tglx
2021-08-26riscv: use strscpy to replace strlcpyJason Wang1-1/+1
The strlcpy should not be used because it doesn't limit the source length. As linus says, it's a completely useless function if you can't implicitly trust the source string - but that is almost always why people think they should use it! All in all the BSD function will lead some potential bugs. But the strscpy doesn't require reading memory from the src string beyond the specified "count" bytes, and since the return value is easier to error-check than strlcpy()'s. In addition, the implementation is robust to the string changing out from underneath it, unlike the current strlcpy() implementation. Thus, We prefer using strscpy instead of strlcpy. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-25riscv: explicitly use symbol offsets for VDSOSaleem Abdulrasool3-20/+16
The current implementation of the `__rt_sigaction` reference computed an absolute offset relative to the mapped base of the VDSO. While this can be handled in the medlow model, the medany model cannot handle this as it is meant to be position independent. The current implementation relied on the BFD linker relaxing the PC-relative relocation into an absolute relocation as it was a near-zero address allowing it to be referenced relative to `zero`. We now extract the offsets and create a generated header allowing the build with LLVM and lld to succeed as we no longer depend on the linker rewriting address references near zero. This change was largely modelled after the ARM64 target which does something similar. Signed-off-by: Saleem Abdulrasool <abdulras@google.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-25riscv: Enable Undefined Behavior Sanitizer UBSANJisheng Zhang1-0/+1
Select ARCH_HAS_UBSAN_SANITIZE_ALL in order to allow the user to enable CONFIG_UBSAN_SANITIZE_ALL and instrument the entire kernel for ubsan checks. VDSO is excluded because its build doesn't include the __ubsan_handle_*() functions from lib/ubsan.c, and the VDSO has no sane way to report errors even if it has definitions of these functions. Passed lib/test_ubsan.c test. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-25riscv: Ensure the value of FP registers in the core dump file is up to dateVincent Chen1-0/+4
The value of FP registers in the core dump file comes from the thread.fstate. However, kernel saves the FP registers to the thread.fstate only before scheduling out the process. If no process switch happens during the exception handling process, kernel will not have a chance to save the latest value of FP registers to thread.fstate. It will cause the value of FP registers in the core dump file may be incorrect. To solve this problem, this patch force lets kernel save the FP register into the thread.fstate if the target task_struct equals the current. Signed-off-by: Vincent Chen <vincent.chen@sifive.com> Reviewed-by: Jisheng Zhang <jszhang@kernel.org> Fixes: b8c8a9590e4f ("RISC-V: Add FP register ptrace support for gdb.") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-20riscv: Fix a number of free'd resources in init_resources()Petr Pavlu1-2/+2
Function init_resources() allocates a boot memory block to hold an array of resources which it adds to iomem_resource. The array is filled in from its end and the function then attempts to free any unused memory at the beginning. The problem is that size of the unused memory is incorrectly calculated and this can result in releasing memory which is in use by active resources. Their data then gets corrupted later when the memory is reused by a different part of the system. Fix the size of the released memory to correctly match the number of unused resource entries. Fixes: ffe0e5261268 ("RISC-V: Improve init_resources()") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Sunil V L <sunilvl@ventanamicro.com> Acked-by: Nick Kossifidis <mick@ics.forth.gr> Tested-by: Sunil V L <sunilvl@ventanamicro.com> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-12riscv: kexec: do not add '-mno-relax' flag if compiler doesn't support itChangbin Du1-1/+1
The RISC-V special option '-mno-relax' which to disable linker relaxations is supported by GCC8+. For GCC7 and lower versions do not support this option. Fixes: fba8a8674f68 ("RISC-V: Add kexec support") Signed-off-by: Changbin Du <changbin.du@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-24riscv: stacktrace: Fix NULL pointer dereferenceJisheng Zhang1-1/+1
When CONFIG_FRAME_POINTER=y, calling dump_stack() can always trigger NULL pointer dereference panic similar as below: [ 0.396060] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc5+ #47 [ 0.396692] Hardware name: riscv-virtio,qemu (DT) [ 0.397176] Call Trace: [ 0.398191] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000960 [ 0.399487] Oops [#1] [ 0.399739] Modules linked in: [ 0.400135] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc5+ #47 [ 0.400570] Hardware name: riscv-virtio,qemu (DT) [ 0.400926] epc : walk_stackframe+0xc4/0xdc [ 0.401291] ra : dump_backtrace+0x30/0x38 [ 0.401630] epc : ffffffff80004922 ra : ffffffff8000496a sp : ffffffe000f3bd00 [ 0.402115] gp : ffffffff80cfdcb8 tp : ffffffe000f30000 t0 : ffffffff80d0b0cf [ 0.402602] t1 : ffffffff80d0b0c0 t2 : 0000000000000000 s0 : ffffffe000f3bd60 [ 0.403071] s1 : ffffffff808bc2e8 a0 : 0000000000001000 a1 : 0000000000000000 [ 0.403448] a2 : ffffffff803d7088 a3 : ffffffff808bc2e8 a4 : 6131725dbc24d400 [ 0.403820] a5 : 0000000000001000 a6 : 0000000000000002 a7 : ffffffffffffffff [ 0.404226] s2 : 0000000000000000 s3 : 0000000000000000 s4 : 0000000000000000 [ 0.404634] s5 : ffffffff803d7088 s6 : ffffffff808bc2e8 s7 : ffffffff80630650 [ 0.405085] s8 : ffffffff80912a80 s9 : 0000000000000008 s10: ffffffff804000fc [ 0.405388] s11: 0000000000000000 t3 : 0000000000000043 t4 : ffffffffffffffff [ 0.405616] t5 : 000000000000003d t6 : ffffffe000f3baa8 [ 0.405793] status: 0000000000000100 badaddr: 0000000000000960 cause: 000000000000000d [ 0.406135] [<ffffffff80004922>] walk_stackframe+0xc4/0xdc [ 0.407032] [<ffffffff8000496a>] dump_backtrace+0x30/0x38 [ 0.407797] [<ffffffff803d7100>] show_stack+0x40/0x4c [ 0.408234] [<ffffffff803d9e5c>] dump_stack+0x90/0xb6 [ 0.409019] [<ffffffff8040423e>] ptdump_init+0x20/0xc4 [ 0.409681] [<ffffffff800015b6>] do_one_initcall+0x4c/0x226 [ 0.410110] [<ffffffff80401094>] kernel_init_freeable+0x1f4/0x258 [ 0.410562] [<ffffffff803dba88>] kernel_init+0x22/0x148 [ 0.410959] [<ffffffff800029e2>] ret_from_exception+0x0/0x14 [ 0.412241] ---[ end trace b2ab92c901b96251 ]--- [ 0.413099] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b The reason is the task is NULL when we finally call walk_stackframe() the NULL is passed from __dump_stack(): |static void __dump_stack(void) |{ | dump_stack_print_info(KERN_DEFAULT); | show_stack(NULL, NULL, KERN_DEFAULT); |} Fix this issue by checking "task == NULL" case in walk_stackframe(). Fixes: eac2f3059e02 ("riscv: stacktrace: fix the riscv stacktrace when CONFIG_FRAME_POINTER enabled") Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Atish Patra <atish.patra@wdc.com> Tested-by: Wende Tan <twd2.me@gmail.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-24riscv: stacktrace: pin the task's stack in get_wchanJisheng Zhang1-1/+5
Pin the task's stack before calling walk_stackframe() in get_wchan(). This can fix the panic as reported by Andreas when CONFIG_VMAP_STACK=y: [ 65.609696] Unable to handle kernel paging request at virtual address ffffffd0003bbde8 [ 65.610460] Oops [#1] [ 65.610626] Modules linked in: virtio_blk virtio_mmio rtc_goldfish btrfs blake2b_generic libcrc32c xor raid6_pq sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua efivarfs [ 65.611670] CPU: 2 PID: 1 Comm: systemd Not tainted 5.14.0-rc1-1.g34fe32a-default #1 openSUSE Tumbleweed (unreleased) c62f7109153e5a0897ee58ba52393ad99b070fd2 [ 65.612334] Hardware name: riscv-virtio,qemu (DT) [ 65.613008] epc : get_wchan+0x5c/0x88 [ 65.613334] ra : get_wchan+0x42/0x88 [ 65.613625] epc : ffffffff800048a4 ra : ffffffff8000488a sp : ffffffd00021bb90 [ 65.614008] gp : ffffffff817709f8 tp : ffffffe07fe91b80 t0 : 00000000000001f8 [ 65.614411] t1 : 0000000000020000 t2 : 0000000000000000 s0 : ffffffd00021bbd0 [ 65.614818] s1 : ffffffd0003bbdf0 a0 : 0000000000000001 a1 : 0000000000000002 [ 65.615237] a2 : ffffffff81618008 a3 : 0000000000000000 a4 : 0000000000000000 [ 65.615637] a5 : ffffffd0003bc000 a6 : 0000000000000002 a7 : ffffffe27d370000 [ 65.616022] s2 : ffffffd0003bbd90 s3 : ffffffff8071a81e s4 : 0000000000003fff [ 65.616407] s5 : ffffffffffffc000 s6 : 0000000000000000 s7 : ffffffff81618008 [ 65.616845] s8 : 0000000000000001 s9 : 0000000180000040 s10: 0000000000000000 [ 65.617248] s11: 000000000000016b t3 : 000000ff00000000 t4 : 0c6aec92de5e3fd7 [ 65.617672] t5 : fff78f60608fcfff t6 : 0000000000000078 [ 65.618088] status: 0000000000000120 badaddr: ffffffd0003bbde8 cause: 000000000000000d [ 65.618621] [<ffffffff800048a4>] get_wchan+0x5c/0x88 [ 65.619008] [<ffffffff8022da88>] do_task_stat+0x7a2/0xa46 [ 65.619325] [<ffffffff8022e87e>] proc_tgid_stat+0xe/0x16 [ 65.619637] [<ffffffff80227dd6>] proc_single_show+0x46/0x96 [ 65.619979] [<ffffffff801ccb1e>] seq_read_iter+0x190/0x31e [ 65.620341] [<ffffffff801ccd70>] seq_read+0xc4/0x104 [ 65.620633] [<ffffffff801a6bfe>] vfs_read+0x6a/0x112 [ 65.620922] [<ffffffff801a701c>] ksys_read+0x54/0xbe [ 65.621206] [<ffffffff801a7094>] sys_read+0xe/0x16 [ 65.621474] [<ffffffff8000303e>] ret_from_syscall+0x0/0x2 [ 65.622169] ---[ end trace f24856ed2b8789c5 ]--- [ 65.622832] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-22riscv: kprobes: implement the branch instructionsChen Lifu2-2/+79
This has been tested by probing a module that contains each of the flavors of branches we have. Signed-off-by: Chen Lifu <chenlifu@huawei.com> [Palmer: commit message, fix kconfig errors] Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-22riscv: kprobes: implement the auipc instructionChen Lifu2-1/+35
This has been tested by probing a module that contains an auipc instruction. Signed-off-by: Chen Lifu <chenlifu@huawei.com> [Palmer: commit message] Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-09Merge tag 'riscv-for-linus-5.14-mw0' of ↵Linus Torvalds13-59/+169
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: "We have a handful of new features for 5.14: - Support for transparent huge pages. - Support for generic PCI resources mapping. - Support for the mem= kernel parameter. - Support for KFENCE. - A handful of fixes to avoid W+X mappings in the kernel. - Support for VMAP_STACK based overflow detection. - An optimized copy_{to,from}_user" * tag 'riscv-for-linus-5.14-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (37 commits) riscv: xip: Fix duplicate included asm/pgtable.h riscv: Fix PTDUMP output now BPF region moved back to module region riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall riscv: add VMAP_STACK overflow detection riscv: ptrace: add argn syntax riscv: mm: fix build errors caused by mk_pmd() riscv: Introduce structure that group all variables regarding kernel mapping riscv: Map the kernel with correct permissions the first time riscv: Introduce set_kernel_memory helper riscv: Enable KFENCE for riscv64 RISC-V: Use asm-generic for {in,out}{bwlq} riscv: add ASID-based tlbflushing methods riscv: pass the mm_struct to __sbi_tlb_flush_range riscv: Add mem kernel parameter support riscv: Simplify xip and !xip kernel address conversion macros riscv: Remove CONFIG_PHYS_RAM_BASE_FIXED riscv: Only initialize swiotlb when necessary riscv: fix typo in init.c riscv: Cleanup unused functions riscv: mm: Use better bitmap_zalloc() ...
2021-07-08riscv: convert to setup_initial_init_mm()Kefeng Wang1-4/+1
Use setup_initial_init_mm() helper to simplify code. Link: https://lkml.kernel.org/r/20210608083418.137226-13-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>