starfive-tech/linux.git - StarFive Tech Linux Kernel for VisionFive (JH7110) boards (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2020-12-02	entry: Add syscall_exit_to_user_mode_work()	Sven Schnelle	2	-2/+32
	This is the same as syscall_exit_to_user_mode() but without calling exit_to_user_mode(). This can be used if there is an architectural reason to avoid the combo function, e.g. restarting a syscall without returning to userspace. Before returning to user space the caller has to invoke exit_to_user_mode(). [ tglx: Amended comments ] Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201201142755.31931-6-svens@linux.ibm.com
2020-12-02	entry: Add exit_to_user_mode() wrapper	Sven Schnelle	2	-14/+27
	Called from architecture specific code when syscall_exit_to_user_mode() is not suitable. It simply calls __exit_to_user_mode(). This way __exit_to_user_mode() can still be inlined because it is declared static __always_inline. [ tglx: Amended comments and moved it to a different place in the header ] Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201201142755.31931-5-svens@linux.ibm.com
2020-12-02	entry_Add_enter_from_user_mode_wrapper	Sven Schnelle	2	-11/+29
	To be called from architecture specific code if the combo interfaces are not suitable. It simply calls __enter_from_user_mode(). This way __enter_from_user_mode will still be inlined because it is declared static __always_inline. [ tglx: Amend comments and move it to a different location in the header ] Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201201142755.31931-4-svens@linux.ibm.com
2020-12-02	entry: Rename exit_to_user_mode()	Sven Schnelle	1	-4/+4
	In order to make this function publicly available rename it so it can still be inlined. An additional exit_to_user_mode() function will be added with a later commit. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201201142755.31931-3-svens@linux.ibm.com
2020-12-02	entry: Rename enter_from_user_mode()	Sven Schnelle	1	-5/+5
	In order to make this function publicly available rename it so it can still be inlined. An additional enter_from_user_mode() function will be added with a later commit. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201201142755.31931-2-svens@linux.ibm.com
2020-12-02	docs: Document Syscall User Dispatch	Gabriel Krisman Bertazi	2	-0/+91
	Explain the interface, provide some background and security notes. [ tglx: Add note about non-visibility, add it to the index and fix the kerneldoc warning ] Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andy Lutomirski <luto@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20201127193238.821364-8-krisman@collabora.com
2020-12-02	selftests: Add benchmark for syscall user dispatch	Gabriel Krisman Bertazi	2	-1/+201
	This is the patch I'm using to evaluate the impact syscall user dispatch has on native syscall (syscalls not redirected to userspace) when enabled for the process and submiting syscalls though the unblocked dispatch selector. It works by running a step to define a baseline of the cost of executing sysinfo, then enabling SUD, and rerunning that step. On my test machine, an AMD Ryzen 5 1500X, I have the following results with the latest version of syscall user dispatch patches. root@olga:~# syscall_user_dispatch/sud_benchmark Calibrating test set to last ~5 seconds... test iterations = 37500000 Avg syscall time 134ns. Caught sys_ff00 trapped_call_count 1, native_call_count 0. Avg syscall time 147ns. Interception overhead: 9.7% (+13ns). Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andy Lutomirski <luto@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20201127193238.821364-7-krisman@collabora.com
2020-12-02	selftests: Add kselftest for syscall user dispatch	Gabriel Krisman Bertazi	5	-0/+324
	Implement functionality tests for syscall user dispatch. In order to make the test portable, refrain from open coding syscall dispatchers and calculating glibc memory ranges. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20201127193238.821364-6-krisman@collabora.com
2020-12-02	entry: Support Syscall User Dispatch on common syscall entry	Gabriel Krisman Bertazi	2	-0/+27
	Syscall User Dispatch (SUD) must take precedence over seccomp and ptrace, since the use case is emulation (it can be invoked with a different ABI) such that seccomp filtering by syscall number doesn't make sense in the first place. In addition, either the syscall is dispatched back to userspace, in which case there is no resource for to trace, or the syscall will be executed, and seccomp/ptrace will execute next. Since SUD runs before tracepoints, it needs to be a SYSCALL_WORK_EXIT as well, just to prevent a trace exit event when dispatch was triggered. For that, the on_syscall_dispatch() examines context to skip the tracepoint, audit and other work. [ tglx: Add a comment on the exit side ] Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20201127193238.821364-5-krisman@collabora.com
2020-12-02	kernel: Implement selective syscall userspace redirection	Gabriel Krisman Bertazi	10	-1/+170
	Introduce a mechanism to quickly disable/enable syscall handling for a specific process and redirect to userspace via SIGSYS. This is useful for processes with parts that require syscall redirection and parts that don't, but who need to perform this boundary crossing really fast, without paying the cost of a system call to reconfigure syscall handling on each boundary transition. This is particularly important for Windows games running over Wine. The proposed interface looks like this: prctl(PR_SET_SYSCALL_USER_DISPATCH, <op>, <off>, <length>, [selector]) The range [<offset>,<offset>+<length>) is a part of the process memory map that is allowed to by-pass the redirection code and dispatch syscalls directly, such that in fast paths a process doesn't need to disable the trap nor the kernel has to check the selector. This is essential to return from SIGSYS to a blocked area without triggering another SIGSYS from rt_sigreturn. selector is an optional pointer to a char-sized userspace memory region that has a key switch for the mechanism. This key switch is set to either PR_SYS_DISPATCH_ON, PR_SYS_DISPATCH_OFF to enable and disable the redirection without calling the kernel. The feature is meant to be set per-thread and it is disabled on fork/clone/execv. Internally, this doesn't add overhead to the syscall hot path, and it requires very little per-architecture support. I avoided using seccomp, even though it duplicates some functionality, due to previous feedback that maybe it shouldn't mix with seccomp since it is not a security mechanism. And obviously, this should never be considered a security mechanism, since any part of the program can by-pass it by using the syscall dispatcher. For the sysinfo benchmark, which measures the overhead added to executing a native syscall that doesn't require interception, the overhead using only the direct dispatcher region to issue syscalls is pretty much irrelevant. The overhead of using the selector goes around 40ns for a native (unredirected) syscall in my system, and it is (as expected) dominated by the supervisor-mode user-address access. In fact, with SMAP off, the overhead is consistently less than 5ns on my test box. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20201127193238.821364-4-krisman@collabora.com
2020-12-02	signal: Expose SYS_USER_DISPATCH si_code type	Gabriel Krisman Bertazi	2	-2/+3
	SYS_USER_DISPATCH will be triggered when a syscall is sent to userspace by the Syscall User Dispatch mechanism. This adjusts eventual BUILD_BUG_ON around the tree. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20201127193238.821364-3-krisman@collabora.com
2020-12-02	x86: vdso: Expose sigreturn address on vdso to the kernel	Gabriel Krisman Bertazi	5	-0/+23
	Syscall user redirection requires the signal trampoline code to not be captured, in order to support returning with a locked selector while avoiding recursion back into the signal handler. For ia-32, which has the trampoline in the vDSO, expose the entry points to the kernel, such that it can avoid dispatching syscalls from that region to userspace. Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andy Lutomirski <luto@kernel.org> Acked-by: Andy Lutomirski <luto@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20201127193238.821364-2-krisman@collabora.com
2020-12-02	MAINTAINERS: Add entry for common entry code	Thomas Gleixner	1	-0/+11
	Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2020-11-25	entry: Fix boot for !CONFIG_GENERIC_ENTRY	Gabriel Krisman Bertazi	1	-3/+5
	A copy-pasta mistake tries to set SYSCALL_WORK flags instead of TIF flags for !CONFIG_GENERIC_ENTRY. Also, add safeguards to catch this at compilation time. Fixes: 3136b93c3fb2 ("entry: Expose helpers to migrate TIF to SYSCALL_WORK flags") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Suggested-by: Jann Horn <jannh@google.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/87a6v8qd9p.fsf_-_@collabora.com
2020-11-19	x86: Support HAVE_CONTEXT_TRACKING_OFFSTACK	Frederic Weisbecker	1	-0/+1
	A lot of ground work has been performed on x86 entry code. Fragile path between user_enter() and user_exit() have IRQs disabled. Uses of RCU and intrumentation in these fragile areas have been explicitly annotated and protected. This architecture doesn't need exception_enter()/exception_exit() anymore and has therefore earned CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201117151637.259084-6-frederic@kernel.org
2020-11-19	context_tracking: Only define schedule_user() on ↵	Frederic Weisbecker	1	-1/+1
	!HAVE_CONTEXT_TRACKING_OFFSTACK archs schedule_user() was traditionally used by the entry code's tail to preempt userspace after the call to user_enter(). Indeed the call to user_enter() used to be performed upon syscall exit slow path which was right before the last opportunity to schedule() while resuming to userspace. The context tracking state had to be saved on the task stack and set back to CONTEXT_KERNEL temporarily in order to safely switch to another task. Only a few archs use it now (namely sparc64 and powerpc64) and those implementing HAVE_CONTEXT_TRACKING_OFFSTACK definetly can't rely on it. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201117151637.259084-5-frederic@kernel.org
2020-11-19	sched: Detect call to schedule from critical entry code	Frederic Weisbecker	1	-0/+1
	Detect calls to schedule() between user_enter() and user_exit(). Those are symptoms of early entry code that either forgot to protect a call to schedule() inside exception_enter()/exception_exit() or, in the case of HAVE_CONTEXT_TRACKING_OFFSTACK, enabled interrupts or preemption in a wrong spot. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201117151637.259084-4-frederic@kernel.org
2020-11-19	context_tracking: Don't implement exception_enter/exit() on ↵	Frederic Weisbecker	1	-2/+4
	CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK The typical steps with context tracking are: 1) Task runs in userspace 2) Task enters the kernel (syscall/exception/IRQ) 3) Task switches from context tracking state CONTEXT_USER to CONTEXT_KERNEL (user_exit()) 4) Task does stuff in kernel 5) Task switches from context tracking state CONTEXT_KERNEL to CONTEXT_USER (user_enter()) 6) Task exits the kernel If an exception fires between 5) and 6), the pt_regs and the context tracking disagree on the context of the faulted/trapped instruction. CONTEXT_KERNEL must be set before the exception handler, that's unconditional for those handlers that want to be able to call into schedule(), but CONTEXT_USER must be restored when the exception exits whereas pt_regs tells that we are resuming to kernel space. This can't be fixed with storing the context tracking state in a per-cpu or per-task variable since another exception may fire onto the current one and overwrite the saved state. Also the task can schedule. So it has to be stored in a per task stack. This is how exception_enter()/exception_exit() paper over the problem: 5) Task switches from context tracking state CONTEXT_KERNEL to CONTEXT_USER (user_enter()) 5.1) Exception fires 5.2) prev_state = exception_enter() // save CONTEXT_USER to prev_state // and set CONTEXT_KERNEL 5.3) Exception handler 5.4) exception_enter(prev_state) // restore CONTEXT_USER 5.5) Exception resumes 6) Task exits the kernel The condition to live without exception_enter()/exception_exit() is to forbid exceptions and IRQs between 2) and 3) and between 5) and 6), or if any is allowed to trigger, it won't call into context tracking, eg: NMIs, and it won't schedule. These requirements are met by architectures supporting CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK and those can therefore afford not to implement this hack. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201117151637.259084-3-frederic@kernel.org
2020-11-19	context_tracking: Introduce HAVE_CONTEXT_TRACKING_OFFSTACK	Frederic Weisbecker	1	-0/+17
	Historically, context tracking had to deal with fragile entry code path, ie: before user_exit() is called and after user_enter() is called, in case some of those spots would call schedule() or use RCU. On such cases, the site had to be protected between exception_enter() and exception_exit() that save the context tracking state in the task stack. Such sleepable fragile code path had many different origins: tracing, exceptions, early or late calls to context tracking on syscalls... Aside of that not being pretty, saving the context tracking state on the task stack forces us to run context tracking on all CPUs, including housekeepers, and prevents us to completely shutdown nohz_full at runtime on a CPU in the future as context tracking and its overhead would still need to run system wide. Now thanks to the extensive efforts to sanitize x86 entry code, those conditions have been removed and we can now get rid of these workarounds in this architecture. Create a Kconfig feature to express this achievement. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201117151637.259084-2-frederic@kernel.org
2020-11-16	x86: Reclaim unused x86 TI flags	Gabriel Krisman Bertazi	1	-10/+0
	Reclaim TI flags that were migrated to syscall_work flags. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20201116174206.2639648-11-krisman@collabora.com
2020-11-16	entry: Drop usage of TIF flags in the generic syscall code	Gabriel Krisman Bertazi	2	-24/+19
	Now that the flags migration in the common syscall entry code is complete and the code relies exclusively on thread_info::syscall_work, clean up the accesses to TI flags in that path. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20201116174206.2639648-10-krisman@collabora.com
2020-11-16	audit: Migrate to use SYSCALL_WORK flag	Gabriel Krisman Bertazi	4	-23/+24
	On architectures using the generic syscall entry code the architecture independent syscall work is moved to flags in thread_info::syscall_work. This removes architecture dependencies and frees up TIF bits. Define SYSCALL_WORK_SYSCALL_AUDIT, use it in the generic entry code and convert the code which uses the TIF specific helper functions to use the new *_syscall_work() helpers which either resolve to the new mode for users of the generic entry code or to the TIF based functions for the other architectures. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20201116174206.2639648-9-krisman@collabora.com
2020-11-16	ptrace: Migrate TIF_SYSCALL_EMU to use SYSCALL_WORK flag	Gabriel Krisman Bertazi	6	-23/+22
	On architectures using the generic syscall entry code the architecture independent syscall work is moved to flags in thread_info::syscall_work. This removes architecture dependencies and frees up TIF bits. Define SYSCALL_WORK_SYSCALL_EMU, use it in the generic entry code and convert the code which uses the TIF specific helper functions to use the new *_syscall_work() helpers which either resolve to the new mode for users of the generic entry code or to the TIF based functions for the other architectures. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20201116174206.2639648-8-krisman@collabora.com
2020-11-16	ptrace: Migrate to use SYSCALL_TRACE flag	Gabriel Krisman Bertazi	7	-25/+31
	On architectures using the generic syscall entry code the architecture independent syscall work is moved to flags in thread_info::syscall_work. This removes architecture dependencies and frees up TIF bits. Define SYSCALL_WORK_SYSCALL_TRACE, use it in the generic entry code and convert the code which uses the TIF specific helper functions to use the new *_syscall_work() helpers which either resolve to the new mode for users of the generic entry code or to the TIF based functions for the other architectures. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20201116174206.2639648-7-krisman@collabora.com
2020-11-16	tracepoints: Migrate to use SYSCALL_WORK flag	Gabriel Krisman Bertazi	6	-19/+18
	On architectures using the generic syscall entry code the architecture independent syscall work is moved to flags in thread_info::syscall_work. This removes architecture dependencies and frees up TIF bits. Define SYSCALL_WORK_SYSCALL_TRACEPOINT, use it in the generic entry code and convert the code which uses the TIF specific helper functions to use the new *_syscall_work() helpers which either resolve to the new mode for users of the generic entry code or to the TIF based functions for the other architectures. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20201116174206.2639648-6-krisman@collabora.com
2020-11-16	seccomp: Migrate to use SYSCALL_WORK flag	Gabriel Krisman Bertazi	7	-13/+15
	On architectures using the generic syscall entry code the architecture independent syscall work is moved to flags in thread_info::syscall_work. This removes architecture dependencies and frees up TIF bits. Define SYSCALL_WORK_SECCOMP, use it in the generic entry code and convert the code which uses the TIF specific helper functions to use the new *_syscall_work() helpers which either resolve to the new mode for users of the generic entry code or to the TIF based functions for the other architectures. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20201116174206.2639648-5-krisman@collabora.com
2020-11-16	entry: Wire up syscall_work in common entry code	Gabriel Krisman Bertazi	2	-6/+12
	Prepare the common entry code to use the SYSCALL_WORK flags. They will be defined in subsequent patches for each type of syscall work. SYSCALL_WORK_ENTRY/EXIT are defined for the transition, as they will replace the TIF_ equivalent defines. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20201116174206.2639648-4-krisman@collabora.com
2020-11-16	entry: Expose helpers to migrate TIF to SYSCALL_WORK flags	Gabriel Krisman Bertazi	1	-0/+32
	With the goal to split the syscall work related flags into a separate field that is architecture independent, expose transitional helpers that resolve to either the TIF flags or to the corresponding SYSCALL_WORK flags. This will allow architectures to migrate only when they port to the generic syscall entry code. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20201116174206.2639648-3-krisman@collabora.com
2020-11-16	x86: Expose syscall_work field in thread_info	Gabriel Krisman Bertazi	1	-0/+1
	This field will be used by SYSCALL_WORK flags, migrated from TI flags. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20201116174206.2639648-2-krisman@collabora.com
2020-11-16	Merge branch 'x86/entry' into core/entry	Thomas Gleixner	16	-50/+73
	Prepare for the merging of the syscall_work series which conflicts with the TIF bits overhaul in X86.
2020-11-16	entry: Fix spelling/typo errors in irq entry code	Ira Weiny	2	-4/+4
	s/reguired/required/ s/Interupts/Interrupts/ s/quiescient/quiescent/ s/assemenbly/assembly/ Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201104230157.3378023-1-ira.weiny@intel.com
2020-11-05	x86/entry: Move nmi entry/exit into common code	Thomas Gleixner	7	-50/+87
	Lockdep state handling on NMI enter and exit is nothing specific to X86. It's not any different on other architectures. Also the extra state type is not necessary, irqentry_state_t can carry the necessary information as well. Move it to common code and extend irqentry_state_t to carry lockdep state. [ Ira: Make exit_rcu and lockdep a union as they are mutually exclusive between the IRQ and NMI exceptions, and add kernel documentation for struct irqentry_state_t ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201102205320.1458656-7-ira.weiny@intel.com
2020-11-04	Merge branch 'core/urgent' into core/entry	Thomas Gleixner	15211	-310189/+581524
	Pick up the entry fix before further modifications.
2020-11-04	entry: Fix the incorrect ordering of lockdep and RCU check	Thomas Gleixner	1	-2/+2
	When an exception/interrupt hits kernel space and the kernel is not currently in the idle task then RCU must be watching. irqentry_enter() validates this via rcu_irq_enter_check_tick(), which in turn invokes lockdep when taking a lock. But at that point lockdep does not yet know about the fact that interrupts have been disabled by the CPU, which triggers a lockdep splat complaining about inconsistent state. Invoking trace_hardirqs_off() before rcu_irq_enter_check_tick() defeats the point of rcu_irq_enter_check_tick() because trace_hardirqs_off() uses RCU. So use the same sequence as for the idle case and tell lockdep about the irq state change first, invoke the RCU check and then do the lockdep and tracer update. Fixes: a5497bab5f72 ("entry: Provide generic interrupt entry/exit code") Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Mark Rutland <mark.rutland@arm.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87y2jhl19s.fsf@nanos.tec.linutronix.de
2020-11-02	Linux 5.10-rc2	Linus Torvalds	1	-1/+1

2020-11-01	Merge tag 'x86-urgent-2020-11-01' of ↵	Linus Torvalds	1	-13/+30
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "Three fixes all related to #DB: - Handle the BTF bit correctly so it doesn't get lost due to a kernel #DB - Only clear and set the virtual DR6 value used by ptrace on user space triggered #DB. A kernel #DB must leave it alone to ensure data consistency for ptrace. - Make the bitmasking of the virtual DR6 storage correct so it does not lose DR_STEP" * tag 'x86-urgent-2020-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/debug: Fix DR_STEP vs ptrace_get_debugreg(6) x86/debug: Only clear/set ->virtual_dr6 for userspace #DB x86/debug: Fix BTF handling
2020-11-01	Merge tag 'timers-urgent-2020-11-01' of ↵	Linus Torvalds	5	-16/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A few fixes for timers/timekeeping: - Prevent undefined behaviour in the timespec64_to_ns() conversion which is used for converting user supplied time input to nanoseconds. It lacked overflow protection. - Mark sched_clock_read_begin/retry() to prevent recursion in the tracer - Remove unused debug functions in the hrtimer and timerlist code" * tag 'timers-urgent-2020-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: time: Prevent undefined behaviour in timespec64_to_ns() timers: Remove unused inline funtion debug_timer_free() hrtimer: Remove unused inline function debug_hrtimer_free() time/sched_clock: Mark sched_clock_read_begin/retry() as notrace
2020-11-01	Merge tag 'smp-urgent-2020-11-01' of ↵	Linus Torvalds	2	-2/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp fix from Thomas Gleixner: "A single fix for stop machine. Mark functions no trace to prevent a crash caused by recursion when enabling or disabling a tracer on RISC-V (probably all architectures which patch through stop machine)" * tag 'smp-urgent-2020-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: stop_machine, rcu: Mark functions as notrace
2020-11-01	Merge tag 'locking-urgent-2020-11-01' of ↵	Linus Torvalds	2	-14/+10
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "A couple of locking fixes: - Fix incorrect failure injection handling in the fuxtex code - Prevent a preemption warning in lockdep when tracking local_irq_enable() and interrupts are already enabled - Remove more raw_cpu_read() usage from lockdep which causes state corruption on !X86 architectures. - Make the nr_unused_locks accounting in lockdep correct again" * tag 'locking-urgent-2020-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: lockdep: Fix nr_unused_locks accounting locking/lockdep: Remove more raw_cpu_read() usage futex: Fix incorrect should_fail_futex() handling lockdep: Fix preemption WARN for spurious IRQ-enable
2020-11-01	Merge tag 'char-misc-5.10-rc2' of ↵	Linus Torvalds	95	-26793/+34
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc fixes/removals from Greg KH: "Here's some small fixes for 5.10-rc2 and a big driver removal. The fixes are for some reported issues in the interconnect and coresight drivers, nothing major. The "big" driver removal is the MIC drivers have been asked to be removed as the hardware never shipped and Intel no longer wants to maintain something that no one can use. This is welcomed by many as the DMA usage of these drivers was "interesting" and the security people were starting to question some issues that were starting to be found in the codebase. Note, one of the subsystems for this driver, the "VOP" code, will probably come back in future kernel versions as it was looking to potentially solve some PCIe virtualization issues that a number of other vendors were wanting to solve. But as-is, this codebase didn't work for anyone else so no actual functionality is being removed. All of these have been in linux-next with no reported issues" * tag 'char-misc-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: coresight: cti: Initialize dynamic sysfs attributes coresight: Fix uninitialised pointer bug in etm_setup_aux() coresight: add module license misc: mic: remove the MIC drivers interconnect: qcom: use icc_sync state for sm8[12]50 interconnect: qcom: Ensure that the floor bandwidth value is enforced interconnect: qcom: sc7180: Init BCMs before creating the nodes interconnect: qcom: sdm845: Init BCMs before creating the nodes interconnect: Aggregate before setting initial bandwidth interconnect: qcom: sdm845: Enable keepalive for the MM1 BCM
2020-11-01	Merge tag 'driver-core-5.10-rc2' of ↵	Linus Torvalds	250	-2493/+4134
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and documentation fixes from Greg KH: "Here is one tiny debugfs change to fix up an API where the last user was successfully fixed up in 5.10-rc1 (so it couldn't be merged earlier), and a much larger Documentation/ABI/ update to the files so they can be automatically parsed by our tools. The Documentation/ABI/ updates are just formatting issues, small ones to bring the files into parsable format, and have been acked by numerous subsystem maintainers and the documentation maintainer. I figured it was good to get this into 5.10-rc2 to help wih the merge issues that would arise if these were to stick in linux-next until 5.11-rc1. The debugfs change has been in linux-next for a long time, and the Documentation updates only for the last linux-next release" * tag 'driver-core-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (40 commits) scripts: get_abi.pl: assume ReST format by default docs: ABI: sysfs-class-led-trigger-pattern: remove hw_pattern duplication docs: ABI: sysfs-class-backlight: unify ABI documentation docs: ABI: sysfs-c2port: remove a duplicated entry docs: ABI: sysfs-class-power: unify duplicated properties docs: ABI: unify /sys/class/leds/<led>/brightness documentation docs: ABI: stable: remove a duplicated documentation docs: ABI: change read/write attributes docs: ABI: cleanup several ABI documents docs: ABI: sysfs-bus-nvdimm: use the right format for ABI docs: ABI: vdso: use the right format for ABI docs: ABI: fix syntax to be parsed using ReST notation docs: ABI: convert testing/configfs-acpi to ReST docs: Kconfig/Makefile: add a check for broken ABI files docs: abi-testing.rst: enable --rst-sources when building docs docs: ABI: don't escape ReST-incompatible chars from obsolete and removed docs: ABI: create a 2-depth index for ABI docs: ABI: make it parse ABI/stable as ReST-compatible files docs: ABI: sysfs-uevent: make it compatible with ReST output docs: ABI: testing: make the files compatible with ReST output ...
2020-11-01	Merge tag 'staging-5.10-rc2' of ↵	Linus Torvalds	8	-30/+49
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver fixes from Greg KH: "Here are some small staging driver fixes for issues that have been reported in 5.10-rc1: - octeon driver fixes - wfx driver fixes - memory leak fix in vchiq driver - fieldbus driver bugfix - comedi driver bugfix All of these have been in linux-next with no reported issues" * tag 'staging-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: fieldbus: anybuss: jump to correct label in an error path staging: wfx: fix test on return value of gpiod_get_value() staging: wfx: fix use of uninitialized pointer staging: mmal-vchiq: Fix memory leak for vchiq_instance staging: comedi: cb_pcidas: Allow 2-channel commands for AO subdevice staging: octeon: Drop on uncorrectable alignment or FCS error staging: octeon: repair "fixed-link" support
2020-11-01	Merge tag 'tty-5.10-rc2' of ↵	Linus Torvalds	4	-38/+37
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are some small TTY and Serial driver fixes for reported issues for 5.10-rc2. They include: - vt ioctl bugfix for reported problems - fsl_lpuart serial driver fix - 21285 serial driver bugfix All have been in linux-next with no reported issues" * tag 'tty-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: vt_ioctl: fix GIO_UNIMAP regression vt: keyboard, extend func_buf_lock to readers vt: keyboard, simplify vt_kdgkbsent tty: serial: fsl_lpuart: LS1021A has a FIFO size of 16 words, like LS1028A tty: serial: 21285: fix lockup on open
2020-11-01	Merge tag 'usb-5.10-rc2' of ↵	Linus Torvalds	22	-138/+195
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB driver fixes from Greg KH: "Here are a number of small bugfixes for reported issues in some USB drivers. They include: - typec bugfixes - xhci bugfixes and lockdep warning fixes - cdc-acm driver regression fix - kernel doc fixes - cdns3 driver bugfixes for a bunch of reported issues - other tiny USB driver fixes All have been in linux-next with no reported issues" * tag 'usb-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: cdns3: gadget: own the lock wrongly at the suspend routine usb: cdns3: Fix on-chip memory overflow issue usb: cdns3: gadget: suspicious implicit sign extension xhci: Don't create stream debugfs files with spinlock held. usb: xhci: Workaround for S3 issue on AMD SNPS 3.0 xHC xhci: Fix sizeof() mismatch usb: typec: stusb160x: fix signedness comparison issue with enum variables usb: typec: add missing MODULE_DEVICE_TABLE() to stusb160x USB: apple-mfi-fastcharge: don't probe unhandled devices usbcore: Check both id_table and match() when both available usb: host: ehci-tegra: Fix error handling in tegra_ehci_probe() usb: typec: stusb160x: fix an IS_ERR() vs NULL check in probe usb: typec: tcpm: reset hard_reset_count for any disconnect usb: cdc-acm: fix cooldown mechanism usb: host: fsl-mph-dr-of: check return of dma_set_mask() usb: fix kernel-doc markups usb: typec: stusb160x: fix some signedness bugs usb: cdns3: Variable 'length' set but not used
2020-11-01	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	26	-76/+306
	Pull kvm fixes from Paolo Bonzini: "ARM: - selftest fix - force PTE mapping on device pages provided via VFIO - fix detection of cacheable mapping at S2 - fallback to PMD/PTE mappings for composite huge pages - fix accounting of Stage-2 PGD allocation - fix AArch32 handling of some of the debug registers - simplify host HYP entry - fix stray pointer conversion on nVHE TLB invalidation - fix initialization of the nVHE code - simplify handling of capabilities exposed to HYP - nuke VCPUs caught using a forbidden AArch32 EL0 x86: - new nested virtualization selftest - miscellaneous fixes - make W=1 fixes - reserve new CPUID bit in the KVM leaves" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: vmx: remove unused variable KVM: selftests: Don't require THP to run tests KVM: VMX: eVMCS: make evmcs_sanitize_exec_ctrls() work again KVM: selftests: test behavior of unmapped L2 APIC-access address KVM: x86: Fix NULL dereference at kvm_msr_ignored_check() KVM: x86: replace static const variables with macros KVM: arm64: Handle Asymmetric AArch32 systems arm64: cpufeature: upgrade hyp caps to final arm64: cpufeature: reorder cpus_have_{const, final}_cap() KVM: arm64: Factor out is_{vhe,nvhe}_hyp_code() KVM: arm64: Force PTE mapping on fault resulting in a device mapping KVM: arm64: Use fallback mapping sizes for contiguous huge page sizes KVM: arm64: Fix masks in stage2_pte_cacheable() KVM: arm64: Fix AArch32 handling of DBGD{CCINT,SCRext} and DBGVCR KVM: arm64: Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNT KVM: arm64: Drop useless PAN setting on host EL1 to EL2 transition KVM: arm64: Remove leftover kern_hyp_va() in nVHE TLB invalidation KVM: arm64: Don't corrupt tpidr_el2 on failed HVC call x86/kvm: Reserve KVM_FEATURE_MSI_EXT_DEST_ID
2020-11-01	Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost	Linus Torvalds	6	-82/+157
	Pull vhost fixes from Michael Tsirkin: "Fixes all over the place. A new UAPI is borderline: can also be considered a new feature but also seems to be the only way we could come up with to fix addressing for userspace - and it seems important to switch to it now before userspace making assumptions about addressing ability of devices is set in stone" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vdpasim: allow to assign a MAC address vdpasim: fix MAC address configuration vdpa: handle irq bypass register failure case vdpa_sim: Fix DMA mask Revert "vhost-vdpa: fix page pinning leakage in error path" vdpa/mlx5: Fix error return in map_direct_mr() vhost_vdpa: Return -EFAULT if copy_from_user() fails vdpa_sim: implement get_iova_range() vhost: vdpa: report iova range vdpa: introduce config op to get valid iova range
2020-11-01	Merge tag 'flexible-array-conversions-5.10-rc2' of ↵	Linus Torvalds	23	-40/+40
	git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull more flexible-array member conversions from Gustavo A. R. Silva: "Replace zero-length arrays with flexible-array members" * tag 'flexible-array-conversions-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: printk: ringbuffer: Replace zero-length array with flexible-array member net/smc: Replace zero-length array with flexible-array member net/mlx5: Replace zero-length array with flexible-array member mei: hw: Replace zero-length array with flexible-array member gve: Replace zero-length array with flexible-array member Bluetooth: btintel: Replace zero-length array with flexible-array member scsi: target: tcmu: Replace zero-length array with flexible-array member ima: Replace zero-length array with flexible-array member enetc: Replace zero-length array with flexible-array member fs: Replace zero-length array with flexible-array member Bluetooth: Replace zero-length array with flexible-array member params: Replace zero-length array with flexible-array member tracepoint: Replace zero-length array with flexible-array member platform/chrome: cros_ec_proto: Replace zero-length array with flexible-array member platform/chrome: cros_ec_commands: Replace zero-length array with flexible-array member mailbox: zynqmp-ipi-message: Replace zero-length array with flexible-array member dmaengine: ti-cppi5: Replace zero-length array with flexible-array member
2020-10-31	Merge tag 'dma-mapping-5.10-2' of git://git.infradead.org/users/hch/dma-mapping	Linus Torvalds	1	-3/+3
	Pull dma-mapping fix from Christoph Hellwig: "Fix an integer overflow on 32-bit platforms in the new DMA range code (Geert Uytterhoeven)" * tag 'dma-mapping-5.10-2' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: fix 32-bit overflow with CONFIG_ARM_LPAE=n
2020-10-31	Merge tag 'scsi-fixes' of ↵	Linus Torvalds	5	-21/+43
	git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four driver fixes and one core fix. The core fix closes a race window where we could kick off a second asynchronous scan because the test and set of the variable preventing it isn't atomic" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: hisi_sas: Stop using queue #0 always for v2 hw scsi: ibmvscsi: Fix potential race after loss of transport scsi: mptfusion: Fix null pointer dereferences in mptscsih_remove() scsi: qla2xxx: Return EBUSY on fcport deletion scsi: core: Don't start concurrent async scan on same host
2020-10-31	KVM: vmx: remove unused variable	Paolo Bonzini	1	-2/+0
	Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>