BMC/Intel-BMC/linux.git - Intel OpenBMC Linux kernel source tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2022-03-08	arm64: Mark start_backtrace() notrace and NOKPROBE_SYMBOL	Masami Hiramatsu	1	-1/+2
	[ Upstream commit 1e0924bd09916fab795fc2a21ec1d148f24299fd ] Mark the start_backtrace() as notrace and NOKPROBE_SYMBOL because this function is called from ftrace and lockdep to get the caller address via return_address(). The lockdep is used in kprobes, it should also be NOKPROBE_SYMBOL. Fixes: b07f3499661c ("arm64: stacktrace: Move start_backtrace() out of the header") Cc: <stable@vger.kernel.org> # 5.13.x Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/164301227374.1433152.12808232644267107415.stgit@devnote2 Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-08-03	arm64: stacktrace: avoid tracing arch_stack_walk()	Mark Rutland	1	-1/+1
	When the function_graph tracer is in use, arch_stack_walk() may unwind the stack incorrectly, erroneously reporting itself, missing the final entry which is being traced, and reporting all traced entries between these off-by-one from where they should be. When ftrace hooks a function return, the original return address is saved to the fgraph ret_stack, and the return address in the LR (or the function's frame record) is replaced with `return_to_handler`. When arm64's unwinder encounter frames returning to `return_to_handler`, it finds the associated original return address from the fgraph ret stack, assuming the most recent `ret_to_hander` entry on the stack corresponds to the most recent entry in the fgraph ret stack, and so on. When arch_stack_walk() is used to dump the current task's stack, it starts from the caller of arch_stack_walk(). However, arch_stack_walk() can be traced, and so may push an entry on to the fgraph ret stack, leaving the fgraph ret stack offset by one from the expected position. This can be seen when dumping the stack via /proc/self/stack, where enabling the graph tracer results in an unexpected `stack_trace_save_tsk` entry at the start of the trace, and `el0_svc` missing form the end of the trace. This patch fixes this by marking arch_stack_walk() as notrace, as we do for all other functions on the path to ftrace_graph_get_ret_stack(). While a few helper functions are not marked notrace, their calls/returns are balanced, and will have no observable effect when examining the fgraph ret stack. It is possible for an exeption boundary to cause a similar offset if the return address of the interrupted context was in the LR. Fixing those cases will require some more substantial rework, and is left for subsequent patches. Before: \| # cat /proc/self/stack \| [<0>] proc_pid_stack+0xc4/0x140 \| [<0>] proc_single_show+0x6c/0x120 \| [<0>] seq_read_iter+0x240/0x4e0 \| [<0>] seq_read+0xe8/0x140 \| [<0>] vfs_read+0xb8/0x1e4 \| [<0>] ksys_read+0x74/0x100 \| [<0>] __arm64_sys_read+0x28/0x3c \| [<0>] invoke_syscall+0x50/0x120 \| [<0>] el0_svc_common.constprop.0+0xc4/0xd4 \| [<0>] do_el0_svc+0x30/0x9c \| [<0>] el0_svc+0x2c/0x54 \| [<0>] el0t_64_sync_handler+0x1a8/0x1b0 \| [<0>] el0t_64_sync+0x198/0x19c \| # echo function_graph > /sys/kernel/tracing/current_tracer \| # cat /proc/self/stack \| [<0>] stack_trace_save_tsk+0xa4/0x110 \| [<0>] proc_pid_stack+0xc4/0x140 \| [<0>] proc_single_show+0x6c/0x120 \| [<0>] seq_read_iter+0x240/0x4e0 \| [<0>] seq_read+0xe8/0x140 \| [<0>] vfs_read+0xb8/0x1e4 \| [<0>] ksys_read+0x74/0x100 \| [<0>] __arm64_sys_read+0x28/0x3c \| [<0>] invoke_syscall+0x50/0x120 \| [<0>] el0_svc_common.constprop.0+0xc4/0xd4 \| [<0>] do_el0_svc+0x30/0x9c \| [<0>] el0t_64_sync_handler+0x1a8/0x1b0 \| [<0>] el0t_64_sync+0x198/0x19c After: \| # cat /proc/self/stack \| [<0>] proc_pid_stack+0xc4/0x140 \| [<0>] proc_single_show+0x6c/0x120 \| [<0>] seq_read_iter+0x240/0x4e0 \| [<0>] seq_read+0xe8/0x140 \| [<0>] vfs_read+0xb8/0x1e4 \| [<0>] ksys_read+0x74/0x100 \| [<0>] __arm64_sys_read+0x28/0x3c \| [<0>] invoke_syscall+0x50/0x120 \| [<0>] el0_svc_common.constprop.0+0xc4/0xd4 \| [<0>] do_el0_svc+0x30/0x9c \| [<0>] el0_svc+0x2c/0x54 \| [<0>] el0t_64_sync_handler+0x1a8/0x1b0 \| [<0>] el0t_64_sync+0x198/0x19c \| # echo function_graph > /sys/kernel/tracing/current_tracer \| # cat /proc/self/stack \| [<0>] proc_pid_stack+0xc4/0x140 \| [<0>] proc_single_show+0x6c/0x120 \| [<0>] seq_read_iter+0x240/0x4e0 \| [<0>] seq_read+0xe8/0x140 \| [<0>] vfs_read+0xb8/0x1e4 \| [<0>] ksys_read+0x74/0x100 \| [<0>] __arm64_sys_read+0x28/0x3c \| [<0>] invoke_syscall+0x50/0x120 \| [<0>] el0_svc_common.constprop.0+0xc4/0xd4 \| [<0>] do_el0_svc+0x30/0x9c \| [<0>] el0_svc+0x2c/0x54 \| [<0>] el0t_64_sync_handler+0x1a8/0x1b0 \| [<0>] el0t_64_sync+0x198/0x19c Cc: <stable@vger.kernel.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviwed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210802164845.45506-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-07-08	arm64: stacktrace: use %pSb for backtrace printing	Stephen Boyd	1	-1/+1
	Let's use the new printk format to print the stacktrace entry when printing a backtrace to the kernel logs. This will include any module's build ID[1] in it so that offline/crash debugging can easily locate the debuginfo for a module via something like debuginfod[2]. Link: https://lkml.kernel.org/r/20210511003845.2429846-7-swboyd@chromium.org Link: https://fedoraproject.org/wiki/Releases/FeatureBuildId [1] Link: https://sourceware.org/elfutils/Debuginfod.html [2] Signed-off-by: Stephen Boyd <swboyd@chromium.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Evan Green <evgreen@chromium.org> Cc: Hsin-Yi Wang <hsinyi@chromium.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Young <dyoung@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Sasha Levin <sashal@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-26	arm64: stacktrace: Relax frame record alignment requirement to 8 bytes	Peter Collingbourne	1	-1/+1
	The AAPCS places no requirements on the alignment of the frame record. In theory it could be placed anywhere, although it seems sensible to require it to be aligned to 8 bytes. With an upcoming enhancement to tag-based KASAN Clang will begin creating frame records located at an address that is only aligned to 8 bytes. Accommodate such frame records in the stack unwinding code. As pointed out by Mark Rutland, the userspace stack unwinding code has the same problem, so fix it there as well. Signed-off-by: Peter Collingbourne <pcc@google.com> Link: https://linux-review.googlesource.com/id/Ia22c375230e67ca055e9e4bb639383567f7ad268 Acked-by: Andrey Konovalov <andreyknvl@gmail.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210526174927.2477847-2-pcc@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-26	arm64: Change the on_*stack functions to take a size argument	Peter Collingbourne	1	-1/+1
	unwind_frame() was previously implicitly checking that the frame record is in bounds of the stack by enforcing that FP is both aligned to 16 and in bounds of the stack. Once the FP alignment requirement is relaxed to 8 this will not be sufficient because it does not account for the case where FP points to 8 bytes before the end of the stack. Make the check explicit by changing the on_*stack functions to take a size argument and adjusting the callers to pass the appropriate sizes. Signed-off-by: Peter Collingbourne <pcc@google.com> Link: https://linux-review.googlesource.com/id/Ib7a3eb3eea41b0687ffaba045ceb2012d077d8b4 Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210526174927.2477847-1-pcc@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25	arm64: Implement stack trace termination record	Madhavan T. Venkataraman	1	-9/+7
	Reliable stacktracing requires that we identify when a stacktrace is terminated early. We can do this by ensuring all tasks have a final frame record at a known location on their task stack, and checking that this is the final frame record in the chain. We'd like to use task_pt_regs(task)->stackframe as the final frame record, as this is already setup upon exception entry from EL0. For kernel tasks we need to consistently reserve the pt_regs and point x29 at this, which we can do with small changes to __primary_switched, __secondary_switched, and copy_process(). Since the final frame record must be at a specific location, we must create the final frame record in __primary_switched and __secondary_switched rather than leaving this to start_kernel and secondary_start_kernel. Thus, __primary_switched and __secondary_switched will now show up in stacktraces for the idle tasks. Since the final frame record is now identified by its location rather than by its contents, we identify it at the start of unwind_frame(), before we read any values from it. External debuggers may terminate the stack trace when FP == 0. In the pt_regs->stackframe, the PC is 0 as well. So, stack traces taken in the debugger may print an extra record 0x0 at the end. While this is not pretty, this does not do any harm. This is a small price to pay for having reliable stack trace termination in the kernel. That said, gdb does not show the extra record probably because it uses DWARF and not frame pointers for stack traces. Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Reviewed-by: Mark Brown <broonie@kernel.org> [Mark: rebase, use ASM_BUG(), update comments, update commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210510110026.18061-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-07	Merge tag 'arm64-fixes' of ↵	Linus Torvalds	1	-4/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull more arm64 updates from Catalin Marinas: "A mix of fixes and clean-ups that turned up too late for the first pull request: - Restore terminal stack frame records. Their previous removal caused traces which cross secondary_start_kernel to terminate one entry too late, with a spurious "0" entry. - Fix boot warning with pseudo-NMI due to the way we manipulate the PMR register. - ACPI fixes: avoid corruption of interrupt mappings on watchdog probe failure (GTDT), prevent unregistering of GIC SGIs. - Force SPARSEMEM_VMEMMAP as the only memory model, it saves with having to test all the other combinations. - Documentation fixes and updates: tagged address ABI exceptions on brk/mmap/mremap(), event stream frequency, update booting requirements on the configuration of traps" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: kernel: Update the stale comment arm64: Fix the documented event stream frequency arm64: entry: always set GIC_PRIO_PSR_I_SET during entry arm64: Explicitly document boot requirements for SVE arm64: Explicitly require that FPSIMD instructions do not trap arm64: Relax booting requirements for configuration of traps arm64: cpufeatures: use min and max arm64: stacktrace: restore terminal records arm64/vdso: Discard .note.gnu.property sections in vDSO arm64: doc: Add brk/mmap/mremap() to the Tagged Address ABI Exceptions psci: Remove unneeded semicolon ACPI: irq: Prevent unregistering of GIC SGIs ACPI: GTDT: Don't corrupt interrupt mappings on watchdow probe failure arm64: Show three registers per line arm64: remove HAVE_DEBUG_BUGVERBOSE arm64: alternative: simplify passing alt_region arm64: Force SPARSEMEM_VMEMMAP as the only memory management model arm64: vdso32: drop -no-integrated-as flag
2021-04-30	arm64: stacktrace: restore terminal records	Mark Rutland	1	-4/+6
	We removed the terminal frame records in commit: 6106e1112cc69a36 ("arm64: remove EL0 exception frame record") ... on the assumption that as we no longer used them to find the pt_regs at exception boundaries, they were no longer necessary. However, Leo reports that as an unintended side-effect, this causes traces which cross secondary_start_kernel to terminate one entry too late, with a spurious "0" entry. There are a few ways we could sovle this, but as we're planning to use terminal records for RELIABLE_STACKTRACE, let's revert the logic change for now, keeping the update comments and accounting for the changes in commit: 3c02600144bdb0a1 ("arm64: stacktrace: Report when we reach the end of the stack") This is effectively a partial revert of commit: 6106e1112cc69a36 ("arm64: remove EL0 exception frame record") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Fixes: 6106e1112cc6 ("arm64: remove EL0 exception frame record") Reported-by: Leo Yan <leo.yan@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Will Deacon <will@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com> Link: https://lore.kernel.org/r/20210429104813.GA33550@C02TD0UTHF1T.local Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-26	Merge tag 'arm64-upstream' of ↵	Linus Torvalds	1	-0/+24
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - MTE asynchronous support for KASan. Previously only synchronous (slower) mode was supported. Asynchronous is faster but does not allow precise identification of the illegal access. - Run kernel mode SIMD with softirqs disabled. This allows using NEON in softirq context for crypto performance improvements. The conditional yield support is modified to take softirqs into account and reduce the latency. - Preparatory patches for Apple M1: handle CPUs that only have the VHE mode available (host kernel running at EL2), add FIQ support. - arm64 perf updates: support for HiSilicon PA and SLLC PMU drivers, new functions for the HiSilicon HHA and L3C PMU, cleanups. - Re-introduce support for execute-only user permissions but only when the EPAN (Enhanced Privileged Access Never) architecture feature is available. - Disable fine-grained traps at boot and improve the documented boot requirements. - Support CONFIG_KASAN_VMALLOC on arm64 (only with KASAN_GENERIC). - Add hierarchical eXecute Never permissions for all page tables. - Add arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) allowing user programs to control which PAC keys are enabled in a particular task. - arm64 kselftests for BTI and some improvements to the MTE tests. - Minor improvements to the compat vdso and sigpage. - Miscellaneous cleanups. * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (86 commits) arm64/sve: Add compile time checks for SVE hooks in generic functions arm64/kernel/probes: Use BUG_ON instead of if condition followed by BUG. arm64: pac: Optimize kernel entry/exit key installation code paths arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) arm64: mte: make the per-task SCTLR_EL1 field usable elsewhere arm64/sve: Remove redundant system_supports_sve() tests arm64: fpsimd: run kernel mode NEON with softirqs disabled arm64: assembler: introduce wxN aliases for wN registers arm64: assembler: remove conditional NEON yield macros kasan, arm64: tests supports for HW_TAGS async mode arm64: mte: Report async tag faults before suspend arm64: mte: Enable async tag check fault arm64: mte: Conditionally compile mte_enable_kernel_*() arm64: mte: Enable TCO in functions that can read beyond buffer limits kasan: Add report for async mode arm64: mte: Drop arch_enable_tagging() kasan: Add KASAN mode kernel parameter arm64: mte: Add asynchronous mode support arm64: Get rid of CONFIG_ARM64_VHE arm64: Cope with CPUs stuck in VHE mode ...
2021-03-28	arm64: stacktrace: Move start_backtrace() out of the header	Mark Brown	1	-0/+24
	Currently start_backtrace() is a static inline function in the header. Since it really shouldn't be sufficiently performance critical that we actually need to have it inlined move it into a C file, this will save anyone else scratching their head about why it is defined in the header. As far as I can see it's only there because it was factored out of the various callers. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210319174022.33051-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-03-22	arm64: stacktrace: don't trace arch_stack_walk()	Mark Rutland	1	-4/+5
	We recently converted arm64 to use arch_stack_walk() in commit: 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") The core stacktrace code expects that (when tracing the current task) arch_stack_walk() starts a trace at its caller, and does not include itself in the trace. However, arm64's arch_stack_walk() includes itself, and so traces include one more entry than callers expect. The core stacktrace code which calls arch_stack_walk() tries to skip a number of entries to prevent itself appearing in a trace, and the additional entry prevents skipping one of the core stacktrace functions, leaving this in the trace unexpectedly. We can fix this by having arm64's arch_stack_walk() begin the trace with its caller. The first value returned by the trace will be __builtin_return_address(0), i.e. the caller of arch_stack_walk(). The first frame record to be unwound will be __builtin_frame_address(1), i.e. the caller's frame record. To prevent surprises, arch_stack_walk() is also marked noinline. While __builtin_frame_address(1) is not safe in portable code, local GCC developers have confirmed that it is safe on arm64. To find the caller's frame record, the builtin can safely dereference the current function's frame record or (in theory) could stash the original FP into another GPR at function entry time, neither of which are problematic. Prior to this patch, the tracing code would unexpectedly show up in traces of the current task, e.g. \| # cat /proc/self/stack \| [<0>] stack_trace_save_tsk+0x98/0x100 \| [<0>] proc_pid_stack+0xb4/0x130 \| [<0>] proc_single_show+0x60/0x110 \| [<0>] seq_read_iter+0x230/0x4d0 \| [<0>] seq_read+0xdc/0x130 \| [<0>] vfs_read+0xac/0x1e0 \| [<0>] ksys_read+0x6c/0xfc \| [<0>] __arm64_sys_read+0x20/0x30 \| [<0>] el0_svc_common.constprop.0+0x60/0x120 \| [<0>] do_el0_svc+0x24/0x90 \| [<0>] el0_svc+0x2c/0x54 \| [<0>] el0_sync_handler+0x1a4/0x1b0 \| [<0>] el0_sync+0x170/0x180 After this patch, the tracing code will not show up in such traces: \| # cat /proc/self/stack \| [<0>] proc_pid_stack+0xb4/0x130 \| [<0>] proc_single_show+0x60/0x110 \| [<0>] seq_read_iter+0x230/0x4d0 \| [<0>] seq_read+0xdc/0x130 \| [<0>] vfs_read+0xac/0x1e0 \| [<0>] ksys_read+0x6c/0xfc \| [<0>] __arm64_sys_read+0x20/0x30 \| [<0>] el0_svc_common.constprop.0+0x60/0x120 \| [<0>] do_el0_svc+0x24/0x90 \| [<0>] el0_svc+0x2c/0x54 \| [<0>] el0_sync_handler+0x1a4/0x1b0 \| [<0>] el0_sync+0x170/0x180 Erring on the side of caution, I've given this a spin with a bunch of toolchains, verifying the output of /proc/self/stack and checking that the assembly looked sound. For GCC (where we require version 5.1.0 or later) I tested with the kernel.org crosstool binares for versions 5.5.0, 6.4.0, 6.5.0, 7.3.0, 7.5.0, 8.1.0, 8.3.0, 8.4.0, 9.2.0, and 10.1.0. For clang (where we require version 10.0.1 or later) I tested with the llvm.org binary releases of 11.0.0, and 11.0.1. Fixes: 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Jun <chenjun102@huawei.com> Cc: Marco Elver <elver@google.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> # 5.10.x Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210319184106.5688-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-02-25	arm64: stacktrace: Report when we reach the end of the stack	Mark Brown	1	-1/+1
	Currently the arm64 unwinder code returns -EINVAL whenever it can't find the next stack frame, not distinguishing between cases where the stack has been corrupted or is otherwise in a state it shouldn't be and cases where we have reached the end of the stack. At the minute none of the callers care what error code is returned but this will be important for reliable stack trace which needs to be sure that the stack is intact. Change to return -ENOENT in the case where we reach the bottom of the stack. The error codes from this function are only used in kernel, this particular code is chosen as we are indicating that we know there is no frame there. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210224165037.24138-1-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-01-20	arm64: remove EL0 exception frame record	Mark Rutland	1	-9/+4
	When entering an exception from EL0, the entry code creates a synthetic frame record with a NULL PC. This was used by the code introduced in commit: 7326749801396105 ("arm64: unwind: reference pt_regs via embedded stack frame") ... to discover exception entries on the stack and dump the associated pt_regs. Since the NULL PC was undesirable for the stacktrace, we added a special case to unwind_frame() to prevent the NULL PC from being logged. Since commit: a25ffd3a6302a678 ("arm64: traps: Don't print stack or raw PC/LR values in backtraces") ... we no longer try to dump the pt_regs as part of a stacktrace, and hence no longer need the synthetic exception record. This patch removes the synthetic exception record and the associated special case in unwind_frame(). Instead, EL0 exceptions set the FP to NULL, as is the case for other terminal records (e.g. when a kernel thread starts). The synthetic record for exceptions from EL1 is retrained as this has useful unwind information for the interrupted context. To make the terminal case a bit clearer, an explicit check is added to the start of unwind_frame(). This would otherwise be caught implicitly by the on_accessible_stack() checks. Reported-by: Mark Brown <broonie@kernel.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210113173155.43063-1-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-21	arm64: Move console stack display code to stacktrace.c	Mark Brown	1	-0/+65
	Currently the code for displaying a stack trace on the console is located in traps.c rather than stacktrace.c, using the unwinding code that is in stacktrace.c. This can be confusing and make the code hard to find since such output is often referred to as a stack trace which might mislead the unwary. Due to this and since traps.c doesn't interact with this code except for via the public interfaces move the code to stacktrace.c to make it easier to find. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20200921122341.11280-1-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-18	arm64: stacktrace: Convert to ARCH_STACKWALK	Mark Brown	1	-69/+10
	Historically architectures have had duplicated code in their stack trace implementations for filtering what gets traced. In order to avoid this duplication some generic code has been provided using a new interface arch_stack_walk(), enabled by selecting ARCH_STACKWALK in Kconfig, which factors all this out into the generic stack trace code. Convert arm64 to use this common infrastructure. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Link: https://lore.kernel.org/r/20200914153409.25097-4-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-18	arm64: stacktrace: Make stack walk callback consistent with generic code	Mark Brown	1	-6/+5
	As with the generic arch_stack_walk() code the arm64 stack walk code takes a callback that is called per stack frame. Currently the arm64 code always passes a struct stackframe to the callback and the generic code just passes the pc, however none of the users ever reference anything in the struct other than the pc value. The arm64 code also uses a return type of int while the generic code uses a return type of bool though in both cases the return value is a boolean value and the sense is inverted between the two. In order to reduce code duplication when arm64 is converted to use arch_stack_walk() change the signature and return sense of the arm64 specific callback to match that of the generic code. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Link: https://lore.kernel.org/r/20200914153409.25097-3-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-14	arm64: stacktrace: Move export for save_stack_trace_tsk()	Mark Brown	1	-1/+1
	Due to refactoring way back in bb53c820c5b0f1 ("arm64: stacktrace: avoid listing stacktrace functions in stacktrace") the EXPORT_SYMBOL_GPL() for save_stack_trace_tsk() is at the end of __save_stack_trace() rather than the function it exports. Move it to the expected location. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200710182402.50473-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-18	arm64: unwind: strip PAC from kernel addresses	Mark Rutland	1	-1/+4
	When we enable pointer authentication in the kernel, LR values saved to the stack will have a PAC which we must strip in order to retrieve the real return address. Strip PACs when unwinding the stack in order to account for this. When function graph tracer is used with patchable-function-entry then return_to_handler will also have pac bits so strip it too. Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> [Amit: Re-position ptrauth_strip_insn_pac, comment] Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-08-01	arm64: unwind: Prohibit probing on return_address()	Masami Hiramatsu	1	-0/+3
	Prohibit probing on return_address() and subroutines which is called from return_address(), since the it is invoked from trace_hardirqs_off() which is also kprobe blacklisted. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2019-07-22	arm64: stacktrace: Better handle corrupted stacks	Mark Rutland	1	-1/+39
	The arm64 stacktrace code is careful to only dereference frame records in valid stack ranges, ensuring that a corrupted frame record won't result in a faulting access. However, it's still possible for corrupt frame records to result in infinite loops in the stacktrace code, which is also undesirable. This patch ensures that we complete a stacktrace in finite time, by keeping track of which stacks we have already completed unwinding, and verifying that if the next frame record is on the same stack, it is at a higher address. As this has turned out to be particularly subtle, comments are added to explain the procedure. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Tested-by: James Morse <james.morse@arm.com> Acked-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Tengfei Fan <tengfeif@codeaurora.org> Signed-off-by: Will Deacon <will@kernel.org>
2019-07-22	arm64: stacktrace: Factor out backtrace initialisation	Dave Martin	1	-13/+6
	Some common code is required by each stacktrace user to initialise struct stackframe before the first call to unwind_frame(). In preparation for adding to the common code, this patch factors it out into a separate function start_backtrace(), and modifies the stacktrace callers appropriately. No functional change. Signed-off-by: Dave Martin <dave.martin@arm.com> [Mark: drop tsk argument, update more callsites] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-06-19	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234	Thomas Gleixner	1	-12/+1
	Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 503 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-14	arm64/stacktrace: Remove the pointless ULONG_MAX marker	Thomas Gleixner	1	-4/+0
	Terminating the last trace entry with ULONG_MAX is a completely pointless exercise and none of the consumers can rely on it because it's inconsistently implemented across architectures. In fact quite some of the callers remove the entry and adjust stack_trace.nr_entries afterwards. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Alexander Potapenko <glider@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lkml.kernel.org/r/20190410103644.220247845@linutronix.de
2019-03-19	arm64/stacktrace: Export save_stack_trace_regs()	William Cohen	1	-0/+1
	The ARM64 implements the save_stack_trace_regs function, but it is unusable for any diagnostic tooling compiled as a kernel module due the missing EXPORT_SYMBOL_GPL for the function. Export save_stack_trace_regs() to align with other architectures such as s390, openrisc, and powerpc. This is similar to the ARM64 export of save_stack_trace_tsk() added in git commit e27c7fa015d6. Signed-off-by: William Cohen <wcohen@redhat.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-12-22	arm64: Use ftrace_graph_get_ret_stack() instead of curr_ret_stack	Steven Rostedt (VMware)	1	-5/+7
	The structure of the ret_stack array on the task struct is going to change, and accessing it directly via the curr_ret_stack index will no longer give the ret_stack entry that holds the return address. To access that, architectures must now use ftrace_graph_get_ret_stack() to get the associated ret_stack that matches the saved return address. Cc: linux-arm-kernel@lists.infradead.org Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-12-09	arm64: function_graph: Remove use of FTRACE_NOTRACE_DEPTH	Steven Rostedt (VMware)	1	-3/+0
	Functions in the set_graph_notrace no longer subtract FTRACE_NOTRACE_DEPTH from curr_ret_stack, as that is now implemented via the trace_recursion flags. Access to curr_ret_stack no longer needs to worry about checking for this. curr_ret_stack is still initialized to -1, when there's not a shadow stack allocated. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: linux-arm-kernel@lists.infradead.org Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-07-26	arm64: Add stack information to on_accessible_stack	Laura Abbott	1	-1/+1
	In preparation for enabling the stackleak plugin on arm64, we need a way to get the bounds of the current stack. Extend on_accessible_stack to get this information. Acked-by: Alexander Popov <alex.popov@linux.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Laura Abbott <labbott@redhat.com> [will: folded in fix for allmodconfig build breakage w/ sdei] Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-12	Revert "arm64: fix infinite stacktrace"	Will Deacon	1	-3/+0
	This reverts commit 7e7df71fd57ff2894d96abb0080922bf39460a79. When unwinding out of the IRQ stack and onto the interrupted EL1 stack, we cannot rely on the frame pointer being strictly increasing, as this could terminate the backtrace early depending on how the stacks have been allocated. Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-04	arm64: fix infinite stacktrace	Mikulas Patocka	1	-0/+3
	I've got this infinite stacktrace when debugging another problem: [ 908.795225] INFO: rcu_preempt detected stalls on CPUs/tasks: [ 908.796176] 1-...!: (1 GPs behind) idle=952/1/4611686018427387904 softirq=1462/1462 fqs=355 [ 908.797692] 2-...!: (1 GPs behind) idle=f42/1/4611686018427387904 softirq=1550/1551 fqs=355 [ 908.799189] (detected by 0, t=2109 jiffies, g=130, c=129, q=235) [ 908.800284] Task dump for CPU 1: [ 908.800871] kworker/1:1 R running task 0 32 2 0x00000022 [ 908.802127] Workqueue: writecache-writeabck writecache_writeback [dm_writecache] [ 908.820285] Call trace: [ 908.824785] __switch_to+0x68/0x90 [ 908.837661] 0xfffffe00603afd90 [ 908.844119] 0xfffffe00603afd90 [ 908.850091] 0xfffffe00603afd90 [ 908.854285] 0xfffffe00603afd90 [ 908.863538] 0xfffffe00603afd90 [ 908.865523] 0xfffffe00603afd90 The machine just locked up and kept on printing the same line over and over again. This patch fixes it. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-02-23	arm64: fix unwind_frame() for filtered out fn for function graph tracing	Pratyush Anand	1	-0/+5
	do_task_stat() calls get_wchan(), which further does unwind_frame(). unwind_frame() restores frame->pc to original value in case function graph tracer has modified a return address (LR) in a stack frame to hook a function return. However, if function graph tracer has hit a filtered function, then we can't unwind it as ftrace_push_return_trace() has biased the index(frame->graph) with a 'huge negative' offset(-FTRACE_NOTRACE_DEPTH). Moreover, arm64 stack walker defines index(frame->graph) as unsigned int, which can not compare a -ve number. Similar problem we can have with calling of walk_stackframe() from save_stack_trace_tsk() or dump_backtrace(). This patch fixes unwind_frame() to test the index for -ve value and restore index accordingly before we can restore frame->pc. Reproducer: cd /sys/kernel/debug/tracing/ echo schedule > set_graph_notrace echo 1 > options/display-graph echo wakeup > current_tracer ps -ef \| grep -i agent Above commands result in: Unable to handle kernel paging request at virtual address ffff801bd3d1e000 pgd = ffff8003cbe97c00 [ffff801bd3d1e000] pgd=0000000000000000, pud=0000000000000000 Internal error: Oops: 96000006 [#1] SMP [...] CPU: 5 PID: 11696 Comm: ps Not tainted 4.11.0+ #33 [...] task: ffff8003c21ba000 task.stack: ffff8003cc6c0000 PC is at unwind_frame+0x12c/0x180 LR is at get_wchan+0xd4/0x134 pc : [<ffff00000808892c>] lr : [<ffff0000080860b8>] pstate: 60000145 sp : ffff8003cc6c3ab0 x29: ffff8003cc6c3ab0 x28: 0000000000000001 x27: 0000000000000026 x26: 0000000000000026 x25: 00000000000012d8 x24: 0000000000000000 x23: ffff8003c1c04000 x22: ffff000008c83000 x21: ffff8003c1c00000 x20: 000000000000000f x19: ffff8003c1bc0000 x18: 0000fffffc593690 x17: 0000000000000000 x16: 0000000000000001 x15: 0000b855670e2b60 x14: 0003e97f22cf1d0f x13: 0000000000000001 x12: 0000000000000000 x11: 00000000e8f4883e x10: 0000000154f47ec8 x9 : 0000000070f367c0 x8 : 0000000000000000 x7 : 00008003f7290000 x6 : 0000000000000018 x5 : 0000000000000000 x4 : ffff8003c1c03cb0 x3 : ffff8003c1c03ca0 x2 : 00000017ffe80000 x1 : ffff8003cc6c3af8 x0 : ffff8003d3e9e000 Process ps (pid: 11696, stack limit = 0xffff8003cc6c0000) Stack: (0xffff8003cc6c3ab0 to 0xffff8003cc6c4000) [...] [<ffff00000808892c>] unwind_frame+0x12c/0x180 [<ffff000008305008>] do_task_stat+0x864/0x870 [<ffff000008305c44>] proc_tgid_stat+0x3c/0x48 [<ffff0000082fde0c>] proc_single_show+0x5c/0xb8 [<ffff0000082b27e0>] seq_read+0x160/0x414 [<ffff000008289e6c>] __vfs_read+0x58/0x164 [<ffff00000828b164>] vfs_read+0x88/0x144 [<ffff00000828c2e8>] SyS_read+0x60/0xc0 [<ffff0000080834a0>] __sys_trace_return+0x0/0x4 Fixes: 20380bb390a4 (arm64: ftrace: fix a stack tracer's output under function graph tracer) Signed-off-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Jerome Marchand <jmarchan@redhat.com> [catalin.marinas@arm.com: replace WARN_ON with WARN_ON_ONCE] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-09-14	arm64: stacktrace: avoid listing stacktrace functions in stacktrace	Prakash Gupta	1	-5/+13
	The stacktraces always begin as follows: [<c00117b4>] save_stack_trace_tsk+0x0/0x98 [<c0011870>] save_stack_trace+0x24/0x28 ... This is because the stack trace code includes the stack frames for itself. This is incorrect behaviour, and also leads to "skip" doing the wrong thing (which is the number of stack frames to avoid recording.) Perversely, it does the right thing when passed a non-current thread. Fix this by ensuring that we have a known constant number of frames above the main stack trace function, and always skip these. This was fixed for arch arm by commit 3683f44c42e9 ("ARM: stacktrace: avoid listing stacktrace functions in stacktrace") Link: http://lkml.kernel.org/r/1504078343-28754-1-git-send-email-guptap@codeaurora.org Signed-off-by: Prakash Gupta <guptap@codeaurora.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-15	arm64: add on_accessible_stack()	Mark Rutland	1	-6/+1
	Both unwind_frame() and dump_backtrace() try to check whether a stack address is sane to access, with very similar logic. Both will need updating in order to handle overflow stacks. Factor out this logic into a helper, so that we can avoid further duplication when we add overflow stacks. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com>
2017-08-09	arm64: unwind: remove sp from struct stackframe	Ard Biesheuvel	1	-4/+0
	The unwind code sets the sp member of struct stackframe to 'frame pointer + 0x10' unconditionally, without regard for whether doing so produces a legal value. So let's simply remove it now that we have stopped using it anyway. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
2017-08-09	arm64: unwind: reference pt_regs via embedded stack frame	Ard Biesheuvel	1	-27/+6
	As it turns out, the unwind code is slightly broken, and probably has been for a while. The problem is in the dumping of the exception stack, which is intended to dump the contents of the pt_regs struct at each level in the call stack where an exception was taken and routed to a routine marked as __exception (which means its stack frame is right below the pt_regs struct on the stack). 'Right below the pt_regs struct' is ill defined, though: the unwind code assigns 'frame pointer + 0x10' to the .sp member of the stackframe struct at each level, and dump_backtrace() happily dereferences that as the pt_regs pointer when encountering an __exception routine. However, the actual size of the stack frame created by this routine (which could be one of many __exception routines we have in the kernel) is not known, and so frame.sp is pretty useless to figure out where struct pt_regs really is. So it seems the only way to ensure that we can find our struct pt_regs when walking the stack frames is to put it at a known fixed offset of the stack frame pointer that is passed to such __exception routines. The simplest way to do that is to put it inside pt_regs itself, which is the main change implemented by this patch. As a bonus, doing this allows us to get rid of a fair amount of cruft related to walking from one stack to the other, which is especially nice since we intend to introduce yet another stack for overflow handling once we add support for vmapped stacks. It also fixes an inconsistency where we only add a stack frame pointing to ELR_EL1 if we are executing from the IRQ stack but not when we are executing from the task stack. To consistly identify exceptions regs even in the presence of exceptions taken from entry code, we must check whether the next frame was created by entry text, rather than whether the current frame was crated by exception text. To avoid backtracing using PCs that fall in the idmap, or are controlled by userspace, we must explcitly zero the FP and LR in startup paths, and must ensure that the frame embedded in pt_regs is zeroed upon entry from EL0. To avoid these NULL entries showin in the backtrace, unwind_frame() is updated to avoid them. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [Mark: compare current frame against .entry.text, avoid bogus PCs] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
2017-08-08	arm64: unwind: disregard frame.sp when validating frame pointer	Ard Biesheuvel	1	-17/+7
	Currently, when unwinding the call stack, we validate the frame pointer of each frame against frame.sp, whose value is not clearly defined, and which makes it more difficult to link stack frames together across different stacks. It is far better to simply check whether the frame pointer itself points into a valid stack. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
2017-08-08	arm64: unwind: avoid percpu indirection for irq stack	Mark Rutland	1	-2/+2
	Our IRQ_STACK_PTR() and on_irq_stack() helpers both take a cpu argument, used to generate a percpu address. In all cases, they are passed {raw_,}smp_processor_id(), so this parameter is redundant. Since {raw_,}smp_processor_id() use a percpu variable internally, this approach means we generate a percpu offset to find the current cpu, then use this to index an array of percpu offsets, which we then use to find the current CPU's IRQ stack pointer. Thus, most of the work is redundant. Instead, we can consistently use raw_cpu_ptr() to generate the CPU's irq_stack pointer by simply adding the percpu offset to the irq_stack address, which is simpler in both respects. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
2017-06-15	arm64: Export save_stack_trace_tsk()	Dustin Brown	1	-0/+1
	The kernel watchdog is a great debugging tool for finding tasks that consume a disproportionate amount of CPU time in contiguous chunks. One can imagine building a similar watchdog for arbitrary driver threads using save_stack_trace_tsk() and print_stack_trace(). However, this is not viable for dynamically loaded driver modules on ARM platforms because save_stack_trace_tsk() is not exported for those architectures. Export save_stack_trace_tsk() for the ARM64 architecture to align with x86 and support various debugging use cases such as arbitrary driver thread watchdog timers. Signed-off-by: Dustin Brown <dustinb@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-03-02	sched/headers: Prepare for new header dependencies before moving code to ↵	Ingo Molnar	1	-0/+1
	<linux/sched/task_stack.h> We are going to split <linux/sched/task_stack.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/task_stack.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02	sched/headers: Prepare for new header dependencies before moving code to ↵	Ingo Molnar	1	-0/+1
	<linux/sched/debug.h> We are going to split <linux/sched/debug.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/debug.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-11	arm64: prep stack walkers for THREAD_INFO_IN_TASK	Mark Rutland	1	-0/+5
	When CONFIG_THREAD_INFO_IN_TASK is selected, task stacks may be freed before a task is destroyed. To account for this, the stacks are refcounted, and when manipulating the stack of another task, it is necessary to get/put the stack to ensure it isn't freed and/or re-used while we do so. This patch reworks the arm64 stack walking code to account for this. When CONFIG_THREAD_INFO_IN_TASK is not selected these perform no refcounting, and this should only be a structural change that does not affect behaviour. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-11	arm64: unexport walk_stackframe	Mark Rutland	1	-1/+0
	The walk_stackframe functions is architecture-specific, with a varying prototype, and common code should not use it directly. None of its current users can be built as modules. With THREAD_INFO_IN_TASK, users will also need to hold a stack reference before calling it. There's no reason for it to be exported, and it's very easy to misuse, so unexport it for now. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-11	arm64: factor out current_stack_pointer	Mark Rutland	1	-0/+1
	We define current_stack_pointer in <asm/thread_info.h>, though other files and header relying upon it do not have this necessary include, and are thus fragile to changes in the header soup. Subsequent patches will affect the header soup such that directly including <asm/thread_info.h> may result in a circular header include in some of these cases, so we can't simply include <asm/thread_info.h>. Instead, factor current_thread_info into its own header, and have all existing users include this explicitly. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-09-26	arm64: fix dump_backtrace/unwind_frame with NULL tsk	Mark Rutland	1	-1/+4
	In some places, dump_backtrace() is called with a NULL tsk parameter, e.g. in bug_handler() in arch/arm64, or indirectly via show_stack() in core code. The expectation is that this is treated as if current were passed instead of NULL. Similar is true of unwind_frame(). Commit a80a0eb70c358f8c ("arm64: make irq_stack_ptr more robust") didn't take this into account. In dump_backtrace() it compares tsk against current before we check if tsk is NULL, and in unwind_frame() we never set tsk if it is NULL. Due to this, we won't initialise irq_stack_ptr in either function. In dump_backtrace() this results in calling dump_mem() for memory immediately above the IRQ stack range, rather than for the relevant range on the task stack. In unwind_frame we'll reject unwinding frames on the IRQ stack. In either case this results in incomplete or misleading backtrace information, but is not otherwise problematic. The initial percpu areas (including the IRQ stacks) are allocated in the linear map, and dump_mem uses __get_user(), so we shouldn't access anything with side-effects, and will handle holes safely. This patch fixes the issue by having both functions handle the NULL tsk case before doing anything else with tsk. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Fixes: a80a0eb70c358f8c ("arm64: make irq_stack_ptr more robust") Acked-by: James Morse <james.morse@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yang Shi <yang.shi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-05	arm64: ftrace: add save_stack_trace_regs()	Pratyush Anand	1	-0/+21
	Currently, enabling stacktrace of a kprobe events generates warning: echo stacktrace > /sys/kernel/debug/tracing/trace_options echo "p xhci_irq" > /sys/kernel/debug/tracing/kprobe_events echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable save_stack_trace_regs() not implemented yet. ------------[ cut here ]------------ WARNING: CPU: 1 PID: 0 at ../kernel/stacktrace.c:74 save_stack_trace_regs+0x3c/0x48 Modules linked in: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0-rc4-dirty #5128 Hardware name: ARM Juno development board (r1) (DT) task: ffff800975dd1900 task.stack: ffff800975ddc000 PC is at save_stack_trace_regs+0x3c/0x48 LR is at save_stack_trace_regs+0x3c/0x48 pc : [<ffff000008126c64>] lr : [<ffff000008126c64>] pstate: 600003c5 sp : ffff80097ef52c00 Call trace: save_stack_trace_regs+0x3c/0x48 __ftrace_trace_stack+0x168/0x208 trace_buffer_unlock_commit_regs+0x5c/0x7c kprobe_trace_func+0x308/0x3d8 kprobe_dispatcher+0x58/0x60 kprobe_breakpoint_handler+0xbc/0x18c brk_handler+0x50/0x90 do_debug_exception+0x50/0xbc This patch implements save_stack_trace_regs(), so that stacktrace of a kprobe events can be obtained. After this patch, there is no warning and we can see the stacktrace for kprobe events in trace buffer. more /sys/kernel/debug/tracing/trace <idle>-0 [004] d.h. 1356.000496: p_xhci_irq_0:(xhci_irq+0x0/0x9ac) <idle>-0 [004] d.h. 1356.000497: <stack trace> => xhci_irq => __handle_irq_event_percpu => handle_irq_event_percpu => handle_irq_event => handle_fasteoi_irq => generic_handle_irq => __handle_domain_irq => gic_handle_irq => el1_irq => arch_cpu_idle => default_idle_call => cpu_startup_entry => secondary_start_kernel => Tested-by: David A. Long <dave.long@linaro.org> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-02-12	arm64: make irq_stack_ptr more robust	Yang Shi	1	-7/+6
	Switching between stacks is only valid if we are tracing ourselves while on the irq_stack, so it is only valid when in current and non-preemptible context, otherwise is is just zeroed off. Fixes: 132cd887b5c5 ("arm64: Modify stack trace and dump for use with irq_stack") Acked-by: James Morse <james.morse@arm.com> Tested-by: James Morse <james.morse@arm.com> Signed-off-by: Yang Shi <yang.shi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-02-09	arm64: disable kasan when accessing frame->fp in unwind_frame	Yang Shi	1	-2/+2
	When boot arm64 kernel with KASAN enabled, the below error is reported by kasan: BUG: KASAN: out-of-bounds in unwind_frame+0xec/0x260 at addr ffffffc064d57ba0 Read of size 8 by task pidof/499 page:ffffffbdc39355c0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x0() page dumped because: kasan: bad access detected CPU: 2 PID: 499 Comm: pidof Not tainted 4.5.0-rc1 #119 Hardware name: Freescale Layerscape 2085a RDB Board (DT) Call trace: [<ffffffc00008d078>] dump_backtrace+0x0/0x290 [<ffffffc00008d32c>] show_stack+0x24/0x30 [<ffffffc0006a981c>] dump_stack+0x8c/0xd8 [<ffffffc0002e4400>] kasan_report_error+0x558/0x588 [<ffffffc0002e4958>] kasan_report+0x60/0x70 [<ffffffc0002e3188>] __asan_load8+0x60/0x78 [<ffffffc00008c92c>] unwind_frame+0xec/0x260 [<ffffffc000087e60>] get_wchan+0x110/0x160 [<ffffffc0003b647c>] do_task_stat+0xb44/0xb68 [<ffffffc0003b7730>] proc_tgid_stat+0x40/0x50 [<ffffffc0003ac840>] proc_single_show+0x88/0xd8 [<ffffffc000345be8>] seq_read+0x370/0x770 [<ffffffc00030aba0>] __vfs_read+0xc8/0x1d8 [<ffffffc00030c0ec>] vfs_read+0x94/0x168 [<ffffffc00030d458>] SyS_read+0xb8/0x128 [<ffffffc000086530>] el0_svc_naked+0x24/0x28 Memory state around the buggy address: ffffffc064d57a80: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 f4 f4 ffffffc064d57b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffffffc064d57b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ ffffffc064d57c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffffffc064d57c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Since the shadow byte pointed by the report is 0, so it may mean it is just hit oob in non-current task. So, disable the instrumentation to silence these warnings. Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Yang Shi <yang.shi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-21	arm64: ftrace: fix a stack tracer's output under function graph tracer	AKASHI Takahiro	1	-0/+17
	Function graph tracer modifies a return address (LR) in a stack frame to hook a function return. This will result in many useless entries (return_to_handler) showing up in a) a stack tracer's output b) perf call graph (with perf record -g) c) dump_backtrace (at panic et al.) For example, in case of a), $ echo function_graph > /sys/kernel/debug/tracing/current_tracer $ echo 1 > /proc/sys/kernel/stack_trace_enabled $ cat /sys/kernel/debug/tracing/stack_trace Depth Size Location (54 entries) ----- ---- -------- 0) 4504 16 gic_raise_softirq+0x28/0x150 1) 4488 80 smp_cross_call+0x38/0xb8 2) 4408 48 return_to_handler+0x0/0x40 3) 4360 32 return_to_handler+0x0/0x40 ... In case of b), $ echo function_graph > /sys/kernel/debug/tracing/current_tracer $ perf record -e mem:XXX:x -ag -- sleep 10 $ perf report ... \| \| \|--0.22%-- 0x550f8 \| \| \| 0x10888 \| \| \| el0_svc_naked \| \| \| sys_openat \| \| \| return_to_handler \| \| \| return_to_handler ... In case of c), $ echo function_graph > /sys/kernel/debug/tracing/current_tracer $ echo c > /proc/sysrq-trigger ... Call trace: [<ffffffc00044d3ac>] sysrq_handle_crash+0x24/0x30 [<ffffffc000092250>] return_to_handler+0x0/0x40 [<ffffffc000092250>] return_to_handler+0x0/0x40 ... This patch replaces such entries with real addresses preserved in current->ret_stack[] at unwind_frame(). This way, we can cover all the cases. Reviewed-by: Jungseok Lee <jungseoklee85@gmail.com> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> [will: fixed minor context changes conflicting with irq stack bits] Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-21	arm64: pass a task parameter to unwind_frame()	AKASHI Takahiro	1	-4/+4
	Function graph tracer modifies a return address (LR) in a stack frame to hook a function's return. This will result in many useless entries (return_to_handler) showing up in a call stack list. We will fix this problem in a later patch ("arm64: ftrace: fix a stack tracer's output under function graph tracer"). But since real return addresses are saved in ret_stack[] array in struct task_struct, unwind functions need to be notified of, in addition to a stack pointer address, which task is being traced in order to find out real return addresses. This patch extends unwind functions' interfaces by adding an extra argument of a pointer to task_struct. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-15	arm64: reduce stack use in irq_handler	James Morse	1	-3/+16
	The code for switching to irq_stack stores three pieces of information on the stack, fp+lr, as a fake stack frame (that lets us walk back onto the interrupted tasks stack frame), and the address of the struct pt_regs that contains the register values from kernel entry. (which dump_backtrace() will print in any stack trace). To reduce this, we store fp, and the pointer to the struct pt_regs. unwind_frame() can recognise this as the irq_stack dummy frame, (as it only appears at the top of the irq_stack), and use the struct pt_regs values to find the missing interrupted link-register. Suggested-by: Will Deacon <will.deacon@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-12-10	arm64: when walking onto the task stack, check sp & fp are in current->stack	James Morse	1	-2/+10
	When unwind_frame() reaches the bottom of the irq_stack, the last fp points to the original task stack. unwind_frame() uses IRQ_STACK_TO_TASK_STACK() to find the sp value. If either values is wrong, we may end up walking a corrupt stack. Check these values are sane by testing if they are both on the stack pointed to by current->stack. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>