summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)AuthorFilesLines
2021-01-05rcutorture: Add writer-side tests of polling grace-period APIPaul E. McKenney1-7/+72
This commit adds writer-side testing of the polling grace-period API. One test verifies that the polling API sees a grace period caused by some other mechanism. Another test verifies that using the polling API to wait for a grace period does not result in too-short grace periods. A third test verifies that the polling API does not report completion within a read-side critical section. A fourth and final test verifies that the polling API does report completion given an intervening grace period. Link: https://lore.kernel.org/rcu/20201112201547.GF3365678@moria.home.lan/ Reported-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-05rcutorture: Prepare for ->start_gp_poll and ->poll_gp_statePaul E. McKenney1-5/+5
The new get_state_synchronize_srcu(), start_poll_synchronize_srcu() and poll_state_synchronize_srcu() functions need to be tested, and so this commit prepares by renaming the rcu_torture_ops field ->get_state to ->get_gp_state in order to be consistent with the upcoming ->start_gp_poll and ->poll_gp_state fields. Link: https://lore.kernel.org/rcu/20201112201547.GF3365678@moria.home.lan/ Reported-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-05srcu: Add comment explaining cookie overflow/wrapPaul E. McKenney1-0/+15
This commit adds to the poll_state_synchronize_srcu() header comment describing the issues surrounding SRCU cookie overflow/wrap for the different kernel configurations. Link: https://lore.kernel.org/rcu/20201112201547.GF3365678@moria.home.lan/ Reported-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-05srcu: Provide polling interfaces for Tree SRCU grace periodsPaul E. McKenney1-4/+63
There is a need for a polling interface for SRCU grace periods, so this commit supplies get_state_synchronize_srcu(), start_poll_synchronize_srcu(), and poll_state_synchronize_srcu() for this purpose. The first can be used if future grace periods are inevitable (perhaps due to a later call_srcu() invocation), the second if future grace periods might not otherwise happen, and the third to check if a grace period has elapsed since the corresponding call to either of the first two. As with get_state_synchronize_rcu() and cond_synchronize_rcu(), the return value from either get_state_synchronize_srcu() or start_poll_synchronize_srcu() must be passed in to a later call to poll_state_synchronize_srcu(). Link: https://lore.kernel.org/rcu/20201112201547.GF3365678@moria.home.lan/ Reported-by: Kent Overstreet <kent.overstreet@gmail.com> [ paulmck: Add EXPORT_SYMBOL_GPL() per kernel test robot feedback. ] [ paulmck: Apply feedback from Neeraj Upadhyay. ] Link: https://lore.kernel.org/lkml/20201117004017.GA7444@paulmck-ThinkPad-P72/ Reviewed-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-05srcu: Provide polling interfaces for Tiny SRCU grace periodsPaul E. McKenney1-2/+53
There is a need for a polling interface for SRCU grace periods, so this commit supplies get_state_synchronize_srcu(), start_poll_synchronize_srcu(), and poll_state_synchronize_srcu() for this purpose. The first can be used if future grace periods are inevitable (perhaps due to a later call_srcu() invocation), the second if future grace periods might not otherwise happen, and the third to check if a grace period has elapsed since the corresponding call to either of the first two. As with get_state_synchronize_rcu() and cond_synchronize_rcu(), the return value from either get_state_synchronize_srcu() or start_poll_synchronize_srcu() must be passed in to a later call to poll_state_synchronize_srcu(). Link: https://lore.kernel.org/rcu/20201112201547.GF3365678@moria.home.lan/ Reported-by: Kent Overstreet <kent.overstreet@gmail.com> [ paulmck: Add EXPORT_SYMBOL_GPL() per kernel test robot feedback. ] [ paulmck: Apply feedback from Neeraj Upadhyay. ] Link: https://lore.kernel.org/lkml/20201117004017.GA7444@paulmck-ThinkPad-P72/ Reviewed-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-05srcu: Provide internal interface to start a Tree SRCU grace periodPaul E. McKenney1-29/+37
There is a need for a polling interface for SRCU grace periods. This polling needs to initiate an SRCU grace period without having to queue (and manage) a callback. This commit therefore splits the Tree SRCU __call_srcu() function into callback-initialization and queuing/start-grace-period portions, with the latter in a new function named srcu_gp_start_if_needed(). This function may be passed a NULL callback pointer, in which case it will refrain from queuing anything. Why have the new function mess with queuing? Locking considerations, of course! Link: https://lore.kernel.org/rcu/20201112201547.GF3365678@moria.home.lan/ Reported-by: Kent Overstreet <kent.overstreet@gmail.com> Reviewed-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-05srcu: Provide internal interface to start a Tiny SRCU grace periodPaul E. McKenney1-6/+11
There is a need for a polling interface for SRCU grace periods. This polling needs to initiate an SRCU grace period without having to queue (and manage) a callback. This commit therefore splits the Tiny SRCU call_srcu() function into callback-queuing and start-grace-period portions, with the latter in a new function named srcu_gp_start_if_needed(). Link: https://lore.kernel.org/rcu/20201112201547.GF3365678@moria.home.lan/ Reported-by: Kent Overstreet <kent.overstreet@gmail.com> Reviewed-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-05srcu: Make Tiny SRCU use multi-bit grace-period counterPaul E. McKenney1-2/+3
There is a need for a polling interface for SRCU grace periods. This polling needs to distinguish between an SRCU instance being idle on the one hand or in the middle of a grace period on the other. This commit therefore converts the Tiny SRCU srcu_struct structure's srcu_idx from a defacto boolean to a free-running counter, using the bottom bit to indicate that a grace period is in progress. The second-from-bottom bit is thus used as the index returned by srcu_read_lock(). Link: https://lore.kernel.org/rcu/20201112201547.GF3365678@moria.home.lan/ Reported-by: Kent Overstreet <kent.overstreet@gmail.com> [ paulmck: Fix ->srcu_lock_nesting[] indexing per Neeraj Upadhyay. ] Reviewed-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-05rcu: Enable rcu_normal_after_boot unconditionally for RTJulia Cartwright1-1/+3
Expedited RCU grace periods send IPIs to all non-idle CPUs, and thus can disrupt time-critical code in real-time applications. However, there is a portion of boot-time processing (presumably before any real-time applications have started) where expedited RCU grace periods are the only option. And so it is that experience with the -rt patchset indicates that PREEMPT_RT systems should always set the rcupdate.rcu_normal_after_boot kernel boot parameter. This commit therefore makes the post-boot application environment safe for real-time applications by making PREEMPT_RT systems disable the rcupdate.rcu_normal_after_boot kernel boot parameter and acting as if this parameter had been set. This means that post-boot calls to synchronize_rcu_expedited() will be treated as if they were instead calls to synchronize_rcu(), thus preventing the IPIs, and thus avoiding disrupting real-time applications. Suggested-by: Luiz Capitulino <lcapitulino@redhat.com> Acked-by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Julia Cartwright <julia@ni.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> [ paulmck: Update kernel-parameters.txt accordingly. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-05rcu: Unconditionally use rcuc threads on PREEMPT_RTScott Wood1-1/+3
PREEMPT_RT systems have long used the rcutree.use_softirq kernel boot parameter to avoid use of RCU_SOFTIRQ handlers, which can disrupt real-time applications by invoking callbacks during return from interrupts that arrived while executing time-critical code. This kernel boot parameter instead runs RCU core processing in an 'rcuc' kthread, thus allowing the scheduler to do its job of avoiding disrupting time-critical code. This commit therefore disables the rcutree.use_softirq kernel boot parameter on PREEMPT_RT systems, thus forcing such systems to do RCU core processing in 'rcuc' kthreads. This approach has long been in use by users of the -rt patchset, and there have been no complaints. There is therefore no way for the system administrator to override this choice, at least without modifying and rebuilding the kernel. Signed-off-by: Scott Wood <swood@redhat.com> [bigeasy: Reword commit message] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> [ paulmck: Update kernel-parameters.txt accordingly. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-05rcu: Make RCU_BOOST default on CONFIG_PREEMPT_RTSebastian Andrzej Siewior1-2/+2
On PREEMPT_RT kernels, RCU callbacks are deferred to the `rcuc' kthread. This can stall RCU grace periods due to lengthy preemption not only of RCU readers but also of 'rcuc' kthreads, either of which prevent grace periods from completing, which can in turn result in OOM. Because PREEMPT_RT kernels have more kthreads that can block grace periods, it is more important for such kernels to enable RCU_BOOST. This commit therefore makes RCU_BOOST the default on PREEMPT_RT. RCU_BOOST can still be manually disabled if need be. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-05rcu: Record kvfree_call_rcu() call stack for KASANZqiang1-0/+1
This commit adds a call to kasan_record_aux_stack() in kvfree_call_rcu() in order to record the call stack of the code that caused the object to be freed. Please note that this function does not update the allocated/freed state, which is important because RCU readers might still be referencing this object. Acked-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Zqiang <qiang.zhang@windriver.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-05rcu/tree: Make rcu_do_batch count how many callbacks were executedJoel Fernandes (Google)3-7/+7
The rcu_do_batch() function extracts the ready-to-invoke callbacks from the rcu_segcblist located in the ->cblist field of the current CPU's rcu_data structure. These callbacks are first moved to a local (unsegmented) rcu_cblist. The rcu_do_batch() function then uses this rcu_cblist's ->len field to count how many CBs it has invoked, but it does so by counting that field down from zero. Finally, this function negates the value in this ->len field (resulting in a positive number) and subtracts the result from the ->len field of the current CPU's ->cblist field. Except that it is sometimes necessary for rcu_do_batch() to stop invoking callbacks mid-stream, despite there being more ready to invoke, for example, if a high-priority task wakes up. In this case the remaining not-yet-invoked callbacks are requeued back onto the CPU's ->cblist, but remain in the ready-to-invoke segment of that list. As above, the negative of the local rcu_cblist's ->len field is still subtracted from the ->len field of the current CPU's ->cblist field. The design of counting down from 0 is confusing and error-prone, plus use of a positive count will make it easier to provide a uniform and consistent API to deal with the per-segment counts that are added later in this series. For example, rcu_segcblist_extract_done_cbs() can unconditionally populate the resulting unsegmented list's ->len field during extraction. This commit therefore explicitly counts how many callbacks were executed in rcu_do_batch() itself, counting up from zero, and then uses that to update the per-CPU segcb list's ->len field, without relying on the downcounting of rcl->len from zero. Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-04Merge branch 'rcu/urgent' of ↵Linus Torvalds1-4/+21
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU fix from Paul McKenney: "This is a fix for a regression in the v5.10 merge window, but it was reported quite late in the v5.10 process, plus generating and testing the fix took some time. The regression is due to commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") which on powerpc can use RCU Tasks before initialization, resulting in boot failures. The fix is straightforward, simply moving initialization of RCU Tasks before the early_initcall()s. The fix has been exposed to -next and kbuild test robot testing, and has been tested by the PowerPC guys" * 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: rcu-tasks: Move RCU-tasks initialization to before early_initcall()
2021-01-03bpf: Fix a task_iter bug caused by a merge conflict resolutionYonghong Song1-0/+1
Latest bpf tree has a bug for bpf_iter selftest: $ ./test_progs -n 4/25 test_bpf_sk_storage_get:PASS:bpf_iter_bpf_sk_storage_helpers__open_and_load 0 nsec test_bpf_sk_storage_get:PASS:socket 0 nsec ... do_dummy_read:PASS:read 0 nsec test_bpf_sk_storage_get:FAIL:bpf_map_lookup_elem map value wasn't set correctly (expected 1792, got -1, err=0) #4/25 bpf_sk_storage_get:FAIL #4 bpf_iter:FAIL Summary: 0/0 PASSED, 0 SKIPPED, 2 FAILED When doing merge conflict resolution, Commit 4bfc4714849d missed to save curr_task to seq_file private data. The task pointer in seq_file private data is passed to bpf program. This caused NULL-pointer task passed to bpf program which will immediately return upon checking whether task pointer is NULL. This patch added back the assignment of curr_task to seq_file private data and fixed the issue. Fixes: 4bfc4714849d ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20201231052418.577024-1-yhs@fb.com
2021-01-01Merge tag 'io_uring-5.11-2021-01-01' of git://git.kernel.dk/linux-blockLinus Torvalds1-0/+2
Pull io_uring fixes from Jens Axboe: "A few fixes that should go into 5.11, all marked for stable as well: - Fix issue around identity COW'ing and users that share a ring across processes - Fix a hang associated with unregistering fixed files (Pavel) - Move the 'process is exiting' cancelation a bit earlier, so task_works aren't affected by it (Pavel)" * tag 'io_uring-5.11-2021-01-01' of git://git.kernel.dk/linux-block: kernel/io_uring: cancel io_uring before task works io_uring: fix io_sqe_files_unregister() hangs io_uring: add a helper for setting a ref node io_uring: don't assume mm is constant across submits
2020-12-31kernel/io_uring: cancel io_uring before task worksPavel Begunkov1-0/+2
For cancelling io_uring requests it needs either to be able to run currently enqueued task_works or having it shut down by that moment. Otherwise io_uring_cancel_files() may be waiting for requests that won't ever complete. Go with the first way and do cancellations before setting PF_EXITING and so before putting the task_work infrastructure into a transition state where task_work_run() would better not be called. Cc: stable@vger.kernel.org # 5.5+ Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller3-10/+10
Daniel Borkmann says: ==================== pull-request: bpf 2020-12-28 The following pull-request contains BPF updates for your *net* tree. There is a small merge conflict between bpf tree commit 69ca310f3416 ("bpf: Save correct stopping point in file seq iteration") and net tree commit 66ed594409a1 ("bpf/task_iter: In task_file_seq_get_next use task_lookup_next_fd_rcu"). The get_files_struct() does not exist anymore in net, so take the hunk in HEAD and add the `info->tid = curr_tid` to the error path: [...] curr_task = task_seq_get_next(ns, &curr_tid, true); if (!curr_task) { info->task = NULL; info->tid = curr_tid; return NULL; } /* set info->task and info->tid */ [...] We've added 10 non-merge commits during the last 9 day(s) which contain a total of 11 files changed, 75 insertions(+), 20 deletions(-). The main changes are: 1) Various AF_XDP fixes such as fill/completion ring leak on failed bind and fixing a race in skb mode's backpressure mechanism, from Magnus Karlsson. 2) Fix latency spikes on lockdep enabled kernels by adding a rescheduling point to BPF hashtab initialization, from Eric Dumazet. 3) Fix a splat in task iterator by saving the correct stopping point in the seq file iteration, from Jonathan Lemon. 4) Fix BPF maps selftest by adding retries in case hashtab returns EBUSY errors on update/deletes, from Andrii Nakryiko. 5) Fix BPF selftest error reporting to something more user friendly if the vmlinux BTF cannot be found, from Kamal Mostafa. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-28Merge branch 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds1-3/+10
Pull workqueue update from Tejun Heo: "The same as the cgroup tree - one commit which was scheduled for the 5.11 merge window. All the commit does is avoding spurious worker wakeups from workqueue allocation / config change path to help cpuisol use cases" * 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Kick a worker based on the actual activation of delayed works
2020-12-28Merge branch 'for-5.11' of ↵Linus Torvalds2-15/+17
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "These three patches were scheduled for the merge window but I forgot to send them out. Sorry about that. None of them are significant and they fit well in a fix pull request too - two are cosmetic and one fixes a memory leak in the mount option parsing path" * 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Fix memory leak when parsing multiple source parameters cgroup/cgroup.c: replace 'of->kn->priv' with of_cft() kernel: cgroup: Mundane spelling fixes throughout the file
2020-12-27Merge tag 'locking-urgent-2020-12-27' of ↵Linus Torvalds2-4/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Misc fixes/updates: - Fix static keys usage in module __init sections - Add separate MAINTAINERS entry for static branches/calls - Fix lockdep splat with CONFIG_PREEMPTIRQ_EVENTS=y tracing" * tag 'locking-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: softirq: Avoid bad tracing / lockdep interaction jump_label/static_call: Add MAINTAINERS jump_label: Fix usage in module __init
2020-12-27Merge tag 'timers-urgent-2020-12-27' of ↵Linus Torvalds3-15/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: "Update/fix two CPU sanity checks in the hotplug and the boot code, and fix a typo in the Kconfig help text. [ Context: the first two commits are the result of an ongoing annotation+review work of (intentional) tick_do_timer_cpu() data races reported by KCSAN, but the annotations aren't fully cooked yet ]" * tag 'timers-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timekeeping: Fix spelling mistake in Kconfig "fullfill" -> "fulfill" tick/sched: Remove bogus boot "safety" check tick: Remove pointless cpu valid check in hotplug code
2020-12-27Merge tag 'sched-urgent-2020-12-27' of ↵Linus Torvalds2-33/+20
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Ingo Molnar: "Fix a context switch performance regression" * tag 'sched-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Optimize finish_lock_switch()
2020-12-25genirq: Fix export of irq_to_desc() for powerpc KVMMichael Ellerman1-1/+1
Commit 64a1b95bb9fe ("genirq: Restrict export of irq_to_desc()") removed the export of irq_to_desc() unless powerpc KVM is being built, because there is still a use of irq_to_desc() in modular code there. However it used: #ifdef CONFIG_KVM_BOOK3S_64_HV Which doesn't work when that symbol is =m, leading to a build failure: ERROR: modpost: "irq_to_desc" [arch/powerpc/kvm/kvm-hv.ko] undefined! Fix it by checking for the definedness of the correct symbol which is CONFIG_KVM_BOOK3S_64_HV_MODULE. Fixes: 64a1b95bb9fe ("genirq: Restrict export of irq_to_desc()") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-25Merge tag 'irq-core-2020-12-23' of ↵Linus Torvalds3-20/+67
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "This is the second attempt after the first one failed miserably and got zapped to unblock the rest of the interrupt related patches. A treewide cleanup of interrupt descriptor (ab)use with all sorts of racy accesses, inefficient and disfunctional code. The goal is to remove the export of irq_to_desc() to prevent these things from creeping up again" * tag 'irq-core-2020-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits) genirq: Restrict export of irq_to_desc() xen/events: Implement irq distribution xen/events: Reduce irq_info:: Spurious_cnt storage size xen/events: Only force affinity mask for percpu interrupts xen/events: Use immediate affinity setting xen/events: Remove disfunct affinity spreading xen/events: Remove unused bind_evtchn_to_irq_lateeoi() net/mlx5: Use effective interrupt affinity net/mlx5: Replace irq_to_desc() abuse net/mlx4: Use effective interrupt affinity net/mlx4: Replace irq_to_desc() abuse PCI: mobiveil: Use irq_data_get_irq_chip_data() PCI: xilinx-nwl: Use irq_data_get_irq_chip_data() NTB/msi: Use irq_has_action() mfd: ab8500-debugfs: Remove the racy fiddling with irq_desc pinctrl: nomadik: Use irq_has_action() drm/i915/pmu: Replace open coded kstat_irqs() copy drm/i915/lpe_audio: Remove pointless irq_to_desc() usage s390/irq: Use irq_desc_kstat_cpu() in show_msi_interrupt() parisc/irq: Use irq_desc_kstat_cpu() in show_interrupts() ...
2020-12-24bpf: Use thread_group_leader()Jonathan Lemon1-1/+1
Instead of directly comparing task->tgid and task->pid, use the thread_group_leader() helper. This helps with readability, and there should be no functional change. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201218185032.2464558-3-jonathan.lemon@gmail.com
2020-12-24bpf: Save correct stopping point in file seq iterationJonathan Lemon1-1/+2
On some systems, some variant of the following splat is repeatedly seen. The common factor in all traces seems to be the entry point to task_file_seq_next(). With the patch, all warnings go away. rcu: INFO: rcu_sched self-detected stall on CPU rcu: \x0926-....: (20992 ticks this GP) idle=d7e/1/0x4000000000000002 softirq=81556231/81556231 fqs=4876 \x09(t=21033 jiffies g=159148529 q=223125) NMI backtrace for cpu 26 CPU: 26 PID: 2015853 Comm: bpftool Kdump: loaded Not tainted 5.6.13-0_fbk4_3876_gd8d1f9bf80bb #1 Hardware name: Quanta Twin Lakes MP/Twin Lakes Passive MP, BIOS F09_3A12 10/08/2018 Call Trace: <IRQ> dump_stack+0x50/0x70 nmi_cpu_backtrace.cold.6+0x13/0x50 ? lapic_can_unplug_cpu.cold.30+0x40/0x40 nmi_trigger_cpumask_backtrace+0xba/0xca rcu_dump_cpu_stacks+0x99/0xc7 rcu_sched_clock_irq.cold.90+0x1b4/0x3aa ? tick_sched_do_timer+0x60/0x60 update_process_times+0x24/0x50 tick_sched_timer+0x37/0x70 __hrtimer_run_queues+0xfe/0x270 hrtimer_interrupt+0xf4/0x210 smp_apic_timer_interrupt+0x5e/0x120 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0010:get_pid_task+0x38/0x80 Code: 89 f6 48 8d 44 f7 08 48 8b 00 48 85 c0 74 2b 48 83 c6 55 48 c1 e6 04 48 29 f0 74 19 48 8d 78 20 ba 01 00 00 00 f0 0f c1 50 20 <85> d2 74 27 78 11 83 c2 01 78 0c 48 83 c4 08 c3 31 c0 48 83 c4 08 RSP: 0018:ffffc9000d293dc8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13 RAX: ffff888637c05600 RBX: ffffc9000d293e0c RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000550 RDI: ffff888637c05620 RBP: ffffffff8284eb80 R08: ffff88831341d300 R09: ffff88822ffd8248 R10: ffff88822ffd82d0 R11: 00000000003a93c0 R12: 0000000000000001 R13: 00000000ffffffff R14: ffff88831341d300 R15: 0000000000000000 ? find_ge_pid+0x1b/0x20 task_seq_get_next+0x52/0xc0 task_file_seq_get_next+0x159/0x220 task_file_seq_next+0x4f/0xa0 bpf_seq_read+0x159/0x390 vfs_read+0x8a/0x140 ksys_read+0x59/0xd0 do_syscall_64+0x42/0x110 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f95ae73e76e Code: Bad RIP value. RSP: 002b:00007ffc02c1dbf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 000000000170faa0 RCX: 00007f95ae73e76e RDX: 0000000000001000 RSI: 00007ffc02c1dc30 RDI: 0000000000000007 RBP: 00007ffc02c1ec70 R08: 0000000000000005 R09: 0000000000000006 R10: fffffffffffff20b R11: 0000000000000246 R12: 00000000019112a0 R13: 0000000000000000 R14: 0000000000000007 R15: 00000000004283c0 If unable to obtain the file structure for the current task, proceed to the next task number after the one returned from task_seq_get_next(), instead of the next task number from the original iterator. Also, save the stopping task number from task_seq_get_next() on failure in case of restarts. Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets") Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201218185032.2464558-2-jonathan.lemon@gmail.com
2020-12-23Merge tag 'pm-5.11-rc1-2' of ↵Linus Torvalds1-30/+76
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These update the CPPC cpufreq driver and intel_pstate (which involves updating the cpufreq core and the schedutil governor) and make janitorial changes in the ACPI code handling processor objects. Specifics: - Rework the passive-mode "fast switch" path in the intel_pstate driver to allow it receive the minimum (required) and target (desired) performance information from the schedutil governor so as to avoid running some workloads too fast (Rafael Wysocki). - Make the intel_pstate driver allow the policy max limit to be increased after the guaranteed performance value for the given CPU has increased (Rafael Wysocki). - Clean up the handling of CPU coordination types in the CPPC cpufreq driver and make it export frequency domains information to user space via sysfs (Ionela Voinescu). - Fix the ACPI code handling processor objects to use a correct coordination type when it fails to map frequency domains and drop a redundant CPU map initialization from it (Ionela Voinescu, Punit Agrawal)" * tag 'pm-5.11-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: intel_pstate: Use most recent guaranteed performance values cpufreq: intel_pstate: Implement the ->adjust_perf() callback cpufreq: Add special-purpose fast-switching callback for drivers cpufreq: schedutil: Add util to struct sg_cpu cppc_cpufreq: replace per-cpu data array with a list cppc_cpufreq: expose information on frequency domains cppc_cpufreq: clarify support for coordination types cppc_cpufreq: use policy->cpu as driver of frequency setting ACPI: processor: fix NONE coordination for domain mapping failure
2020-12-23Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-2/+2
Merge KASAN updates from Andrew Morton. This adds a new hardware tag-based mode to KASAN. The new mode is similar to the existing software tag-based KASAN, but relies on arm64 Memory Tagging Extension (MTE) to perform memory and pointer tagging (instead of shadow memory and compiler instrumentation). By Andrey Konovalov and Vincenzo Frascino. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (60 commits) kasan: update documentation kasan, mm: allow cache merging with no metadata kasan: sanitize objects when metadata doesn't fit kasan: clarify comment in __kasan_kfree_large kasan: simplify assign_tag and set_tag calls kasan: don't round_up too much kasan, mm: rename kasan_poison_kfree kasan, mm: check kasan_enabled in annotations kasan: add and integrate kasan boot parameters kasan: inline (un)poison_range and check_invalid_free kasan: open-code kasan_unpoison_slab kasan: inline random_tag for HW_TAGS kasan: inline kasan_reset_tag for tag-based modes kasan: remove __kasan_unpoison_stack kasan: allow VMAP_STACK for HW_TAGS mode kasan, arm64: unpoison stack only with CONFIG_KASAN_STACK kasan: introduce set_alloc_info kasan: rename get_alloc/free_info kasan: simplify quarantine_put call site kselftest/arm64: check GCR_EL1 after context switch ...
2020-12-23Merge tag 'dma-mapping-5.11' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds6-8/+384
Pull dma-mapping updates from Christoph Hellwig: - support for a partial IOMMU bypass (Alexey Kardashevskiy) - add a DMA API benchmark (Barry Song) - misc fixes (Tiezhu Yang, tangjianqiang) * tag 'dma-mapping-5.11' of git://git.infradead.org/users/hch/dma-mapping: selftests/dma: add test application for DMA_MAP_BENCHMARK dma-mapping: add benchmark support for streaming DMA APIs dma-contiguous: fix a typo error in a comment dma-pool: no need to check return value of debugfs_create functions powerpc/dma: Fallback to dma_ops when persistent memory present dma-mapping: Allow mixing bypass and mapped DMA operation
2020-12-22kasan: rename (un)poison_shadow to (un)poison_rangeAndrey Konovalov1-2/+2
This is a preparatory commit for the upcoming addition of a new hardware tag-based (MTE-based) KASAN mode. The new mode won't be using shadow memory. Rename external annotation kasan_unpoison_shadow() to kasan_unpoison_range(), and introduce internal functions (un)poison_range() (without kasan_ prefix). Co-developed-by: Marco Elver <elver@google.com> Link: https://lkml.kernel.org/r/fccdcaa13dc6b2211bf363d6c6d499279a54fe3a.1606161801.git.andreyknvl@google.com Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Alexander Potapenko <glider@google.com> Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-22Merge branch 'pm-cpufreq'Rafael J. Wysocki1-30/+76
* pm-cpufreq: cpufreq: intel_pstate: Use most recent guaranteed performance values cpufreq: intel_pstate: Implement the ->adjust_perf() callback cpufreq: Add special-purpose fast-switching callback for drivers cpufreq: schedutil: Add util to struct sg_cpu cppc_cpufreq: replace per-cpu data array with a list cppc_cpufreq: expose information on frequency domains cppc_cpufreq: clarify support for coordination types cppc_cpufreq: use policy->cpu as driver of frequency setting ACPI: processor: fix NONE coordination for domain mapping failure ACPI: processor: Drop duplicate setting of shared_cpu_map
2020-12-22bpf: Add schedule point in htab_init_buckets()Eric Dumazet1-0/+1
We noticed that with a LOCKDEP enabled kernel, allocating a hash table with 65536 buckets would use more than 60ms. htab_init_buckets() runs from process context, it is safe to schedule to avoid latency spikes. Fixes: c50eb518e262 ("bpf: Use separate lockdep class for each hashtab") Reported-by: John Sperbeck <jsperbeck@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201221192506.707584-1-eric.dumazet@gmail.com
2020-12-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-1/+16
Pull KVM updates from Paolo Bonzini: "Much x86 work was pushed out to 5.12, but ARM more than made up for it. ARM: - PSCI relay at EL2 when "protected KVM" is enabled - New exception injection code - Simplification of AArch32 system register handling - Fix PMU accesses when no PMU is enabled - Expose CSV3 on non-Meltdown hosts - Cache hierarchy discovery fixes - PV steal-time cleanups - Allow function pointers at EL2 - Various host EL2 entry cleanups - Simplification of the EL2 vector allocation s390: - memcg accouting for s390 specific parts of kvm and gmap - selftest for diag318 - new kvm_stat for when async_pf falls back to sync x86: - Tracepoints for the new pagetable code from 5.10 - Catch VFIO and KVM irqfd events before userspace - Reporting dirty pages to userspace with a ring buffer - SEV-ES host support - Nested VMX support for wait-for-SIPI activity state - New feature flag (AVX512 FP16) - New system ioctl to report Hyper-V-compatible paravirtualization features Generic: - Selftest improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits) KVM: SVM: fix 32-bit compilation KVM: SVM: Add AP_JUMP_TABLE support in prep for AP booting KVM: SVM: Provide support to launch and run an SEV-ES guest KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests KVM: SVM: Provide support for SEV-ES vCPU loading KVM: SVM: Provide support for SEV-ES vCPU creation/loading KVM: SVM: Update ASID allocation to support SEV-ES guests KVM: SVM: Set the encryption mask for the SVM host save area KVM: SVM: Add NMI support for an SEV-ES guest KVM: SVM: Guest FPU state save/restore not needed for SEV-ES guest KVM: SVM: Do not report support for SMM for an SEV-ES guest KVM: x86: Update __get_sregs() / __set_sregs() to support SEV-ES KVM: SVM: Add support for CR8 write traps for an SEV-ES guest KVM: SVM: Add support for CR4 write traps for an SEV-ES guest KVM: SVM: Add support for CR0 write traps for an SEV-ES guest KVM: SVM: Add support for EFER write traps for an SEV-ES guest KVM: SVM: Support string IO operations for an SEV-ES guest KVM: SVM: Support MMIO for an SEV-ES guest KVM: SVM: Create trace events for VMGEXIT MSR protocol processing KVM: SVM: Create trace events for VMGEXIT processing ...
2020-12-19epoll: wire up syscall epoll_pwait2Willem de Bruijn1-0/+2
Split off from prev patch in the series that implements the syscall. Link: https://lkml.kernel.org/r/20201121144401.3727659-4-willemdebruijn.kernel@gmail.com Signed-off-by: Willem de Bruijn <willemb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-19timekeeping: Fix spelling mistake in Kconfig "fullfill" -> "fulfill"Colin Ian King1-1/+1
There is a spelling mistake in the Kconfig help text. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20201217171705.57586-1-colin.king@canonical.com
2020-12-18genirq/msi: Initialize msi_alloc_info before calling msi_domain_prepare_irqs()Zenghui Yu1-1/+1
Since commit 5fe71d271df8 ("irqchip/gic-v3-its: Tag ITS device as shared if allocating for a proxy device"), some of the devices are wrongly marked as "shared" by the ITS driver on systems equipped with the ITS(es). The problem is that the @info->flags may not be initialized anywhere and we end up looking at random bits on the stack. That's obviously not good. We can perform the initialization in the IRQ core layer before calling msi_domain_prepare_irqs(), which is neat enough. Fixes: 5fe71d271df8 ("irqchip/gic-v3-its: Tag ITS device as shared if allocating for a proxy device") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201218060039.1770-1-yuzenghui@huawei.com
2020-12-18softirq: Avoid bad tracing / lockdep interactionPeter Zijlstra1-1/+1
Similar to commit: 1a63dcd8765b ("softirq: Reorder trace_softirqs_on to prevent lockdep splat") __local_bh_enable_ip() can also call into tracing with inconsistent state. Unlike that commit we don't need to bother about the tracepoint because 'cnt-1' never matches preempt_count() (by construction). Reported-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lkml.kernel.org/r/20201218154519.GW3092@hirez.programming.kicks-ass.net
2020-12-18jump_label: Fix usage in module __initPeter Zijlstra1-3/+5
When the static_key is part of the module, and the module calls static_key_inc/enable() from it's __init section *AND* has a static_branch_*() user in that very same __init section, things go wobbly. If the static_key lives outside the module, jump_label_add_module() would append this module's sites to the key and jump_label_update() would take the static_key_linked() branch and all would be fine. If all the sites are outside of __init, then everything will be fine too. However, when all is aligned just as described above, jump_label_update() calls __jump_label_update(.init = false) and we'll not update sites in __init text. Fixes: 19483677684b ("jump_label: Annotate entries that operate on __init code earlier") Reported-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Jessica Yu <jeyu@kernel.org> Link: https://lkml.kernel.org/r/20201216135435.GV3092@hirez.programming.kicks-ass.net
2020-12-18bpf: Remove unused including <linux/version.h>Tian Tao1-1/+0
Remove including <linux/version.h> that don't need it. Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1608086835-54523-1-git-send-email-tiantao6@hisilicon.com
2020-12-18Merge tag 'trace-v5.11' of ↵Linus Torvalds36-352/+662
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "The major update to this release is that there's a new arch config option called CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS. Currently, only x86_64 enables it. All the ftrace callbacks now take a struct ftrace_regs instead of a struct pt_regs. If the architecture has HAVE_DYNAMIC_FTRACE_WITH_ARGS enabled, then the ftrace_regs will have enough information to read the arguments of the function being traced, as well as access to the stack pointer. This way, if a user (like live kernel patching) only cares about the arguments, then it can avoid using the heavier weight "regs" callback, that puts in enough information in the struct ftrace_regs to simulate a breakpoint exception (needed for kprobes). A new config option that audits the timestamps of the ftrace ring buffer at most every event recorded. Ftrace recursion protection has been cleaned up to move the protection to the callback itself (this saves on an extra function call for those callbacks). Perf now handles its own RCU protection and does not depend on ftrace to do it for it (saving on that extra function call). New debug option to add "recursed_functions" file to tracefs that lists all the places that triggered the recursion protection of the function tracer. This will show where things need to be fixed as recursion slows down the function tracer. The eval enum mapping updates done at boot up are now offloaded to a work queue, as it caused a noticeable pause on slow embedded boards. Various clean ups and last minute fixes" * tag 'trace-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits) tracing: Offload eval map updates to a work queue Revert: "ring-buffer: Remove HAVE_64BIT_ALIGNED_ACCESS" ring-buffer: Add rb_check_bpage in __rb_allocate_pages ring-buffer: Fix two typos in comments tracing: Drop unneeded assignment in ring_buffer_resize() tracing: Disable ftrace selftests when any tracer is running seq_buf: Avoid type mismatch for seq_buf_init ring-buffer: Fix a typo in function description ring-buffer: Remove obsolete rb_event_is_commit() ring-buffer: Add test to validate the time stamp deltas ftrace/documentation: Fix RST C code blocks tracing: Clean up after filter logic rewriting tracing: Remove the useless value assignment in test_create_synth_event() livepatch: Use the default ftrace_ops instead of REGS when ARGS is available ftrace/x86: Allow for arguments to be passed in to ftrace_regs by default ftrace: Have the callbacks receive a struct ftrace_regs instead of pt_regs MAINTAINERS: assign ./fs/tracefs to TRACING tracing: Fix some typos in comments ftrace: Remove unused varible 'ret' ring-buffer: Add recording of ring buffer recursion into recursed_functions ...
2020-12-18Merge tag 'modules-for-v5.11' of ↵Linus Torvalds2-89/+121
git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull modules updates from Jessica Yu: "Summary of modules changes for the 5.11 merge window: - Fix a race condition between systemd/udev and the module loader. The module loader was sending a uevent before the module was fully initialized (i.e., before its init function has been called). This means udev can start processing the module uevent before the module has finished initializing, and some udev rules expect that the module has initialized already upon receiving the uevent. This resulted in some systemd mount units failing if udev processes the event faster than the module can finish init. This is fixed by delaying the uevent until after the module has called its init routine. - Make the linker array sections for kernel params and module version attributes more robust by switching to use the alignment of the type in question. Namely, linker section arrays will be constructed using the alignment required by the struct (using __alignof__()) as opposed to a specific value such as sizeof(void *) or sizeof(long). This is less likely to cause breakages should the size of the type ever change (Johan Hovold) - Fix module state inconsistency by setting it back to GOING when a module fails to load and is on its way out (Miroslav Benes) - Some comment and code cleanups (Sergey Shtylyov)" * tag 'modules-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: module: delay kobject uevent until after module init call module: drop semicolon from version macro init: use type alignment for kernel parameters params: clean up module-param macros params: use type alignment for kernel parameters params: drop redundant "unused" attributes module: simplify version-attribute handling module: drop version-attribute alignment module: fix comment style module: add more 'kernel-doc' comments module: fix up 'kernel-doc' comments module: only handle errors with the *switch* statement in module_sig_check() module: avoid *goto*s in module_sig_check() module: merge repetitive strings in module_sig_check() module: set MODULE_STATE_GOING state when a module fails to load
2020-12-17Merge tag 'fsnotify_for_v5.11-rc1' of ↵Linus Torvalds3-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify updates from Jan Kara: "A few fsnotify fixes from Amir fixing fallout from big fsnotify overhaul a few months back and an improvement of defaults limiting maximum number of inotify watches from Waiman" * tag 'fsnotify_for_v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fsnotify: fix events reported to watching parent and child inotify: convert to handle_inode_event() interface fsnotify: generalize handle_inode_event() inotify: Increase default inotify.max_user_watches limit to 1048576
2020-12-17Merge tag 'arm-soc-drivers-5.11' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC driver updates from Arnd Bergmann: "There are a couple of subsystems maintained by other people that merge their drivers through the SoC tree, those changes include: - The SCMI firmware framework gains support for sensor notifications and for controlling voltage domains. - A large update for the Tegra memory controller driver, integrating it better with the interconnect framework - The memory controller subsystem gains support for Mediatek MT8192 - The reset controller framework gains support for sharing pulsed resets For Soc specific drivers in drivers/soc, the main changes are - The Allwinner/sunxi MBUS gets a rework for the way it handles dma_map_ops and offsets between physical and dma address spaces. - An errata fix plus some cleanups for Freescale Layerscape SoCs - A cleanup for renesas drivers regarding MMIO accesses. - New SoC specific drivers for Mediatek MT8192 and MT8183 power domains - New SoC specific drivers for Aspeed AST2600 LPC bus control and SoC identification. - Core Power Domain support for Qualcomm MSM8916, MSM8939, SDM660 and SDX55. - A rework of the TI AM33xx 'genpd' power domain support to use information from DT instead of platform data - Support for TI AM64x SoCs - Allow building some Amlogic drivers as modules instead of built-in Finally, there are numerous cleanups and smaller bug fixes for Mediatek, Tegra, Samsung, Qualcomm, TI OMAP, Amlogic, Rockchips, Renesas, and Xilinx SoCs" * tag 'arm-soc-drivers-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (222 commits) soc: mediatek: mmsys: Specify HAS_IOMEM dependency for MTK_MMSYS firmware: xilinx: Properly align function parameter firmware: xilinx: Add a blank line after function declaration firmware: xilinx: Remove additional newline firmware: xilinx: Fix kernel-doc warnings firmware: xlnx-zynqmp: fix compilation warning soc: xilinx: vcu: add missing register NUM_CORE soc: xilinx: vcu: use vcu-settings syscon registers dt-bindings: soc: xlnx: extract xlnx, vcu-settings to separate binding soc: xilinx: vcu: drop useless success message clk: samsung: mark PM functions as __maybe_unused soc: samsung: exynos-chipid: initialize later - with arch_initcall soc: samsung: exynos-chipid: order list of SoCs by name memory: jz4780_nemc: Fix potential NULL dereference in jz4780_nemc_probe() memory: ti-emif-sram: only build for ARMv7 memory: tegra30: Support interconnect framework memory: tegra20: Support hardware versioning and clean up OPP table initialization dt-bindings: memory: tegra20-emc: Document opp-supported-hw property soc: rockchip: io-domain: Fix error return code in rockchip_iodomain_probe() reset-controller: ti: force the write operation when assert or deassert ...
2020-12-17Merge branch 'stable/for-linus-5.11' of ↵Linus Torvalds1-2/+18
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb Pull swiotlb update from Konrad Rzeszutek Wilk: "A generic (but for right now engaged only with AMD SEV) mechanism to adjust a larger size SWIOTLB based on the total memory of the SEV guests which right now require the bounce buffer for interacting with the outside world. Normal knobs (swiotlb=XYZ) still work" * 'stable/for-linus-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: x86,swiotlb: Adjust SWIOTLB bounce buffer size for SEV guests
2020-12-17Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds3-67/+0
Pull rdma updates from Jason Gunthorpe: "A smaller set of patches, nothing stands out as being particularly major this cycle. The biggest item would be the new HIP09 HW support from HNS, otherwise it was pretty quiet for new work here: - Driver bug fixes and updates: bnxt_re, cxgb4, rxe, hns, i40iw, cxgb4, mlx4 and mlx5 - Bug fixes and polishing for the new rts ULP - Cleanup of uverbs checking for allowed driver operations - Use sysfs_emit all over the place - Lots of bug fixes and clarity improvements for hns - hip09 support for hns - NDR and 50/100Gb signaling rates - Remove dma_virt_ops and go back to using the IB DMA wrappers - mlx5 optimizations for contiguous DMA regions" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (147 commits) RDMA/cma: Don't overwrite sgid_attr after device is released RDMA/mlx5: Fix MR cache memory leak RDMA/rxe: Use acquire/release for memory ordering RDMA/hns: Simplify AEQE process for different types of queue RDMA/hns: Fix inaccurate prints RDMA/hns: Fix incorrect symbol types RDMA/hns: Clear redundant variable initialization RDMA/hns: Fix coding style issues RDMA/hns: Remove unnecessary access right set during INIT2INIT RDMA/hns: WARN_ON if get a reserved sl from users RDMA/hns: Avoid filling sl in high 3 bits of vlan_id RDMA/hns: Do shift on traffic class when using RoCEv2 RDMA/hns: Normalization the judgment of some features RDMA/hns: Limit the length of data copied between kernel and userspace RDMA/mlx4: Remove bogus dev_base_lock usage RDMA/uverbs: Fix incorrect variable type RDMA/core: Do not indicate device ready when device enablement fails RDMA/core: Clean up cq pool mechanism RDMA/core: Update kernel documentation for ib_create_named_qp() MAINTAINERS: SOFT-ROCE: Change Zhu Yanjun's email address ...
2020-12-16Merge tag 'for-5.11/block-2020-12-14' of git://git.kernel.dk/linux-blockLinus Torvalds1-136/+45
Pull block updates from Jens Axboe: "Another series of killing more code than what is being added, again thanks to Christoph's relentless cleanups and tech debt tackling. This contains: - blk-iocost improvements (Baolin Wang) - part0 iostat fix (Jeffle Xu) - Disable iopoll for split bios (Jeffle Xu) - block tracepoint cleanups (Christoph Hellwig) - Merging of struct block_device and hd_struct (Christoph Hellwig) - Rework/cleanup of how block device sizes are updated (Christoph Hellwig) - Simplification of gendisk lookup and removal of block device aliasing (Christoph Hellwig) - Block device ioctl cleanups (Christoph Hellwig) - Removal of bdget()/blkdev_get() as exported API (Christoph Hellwig) - Disk change rework, avoid ->revalidate_disk() (Christoph Hellwig) - sbitmap improvements (Pavel Begunkov) - Hybrid polling fix (Pavel Begunkov) - bvec iteration improvements (Pavel Begunkov) - Zone revalidation fixes (Damien Le Moal) - blk-throttle limit fix (Yu Kuai) - Various little fixes" * tag 'for-5.11/block-2020-12-14' of git://git.kernel.dk/linux-block: (126 commits) blk-mq: fix msec comment from micro to milli seconds blk-mq: update arg in comment of blk_mq_map_queue blk-mq: add helper allocating tagset->tags Revert "block: Fix a lockdep complaint triggered by request queue flushing" nvme-loop: use blk_mq_hctx_set_fq_lock_class to set loop's lock class blk-mq: add new API of blk_mq_hctx_set_fq_lock_class block: disable iopoll for split bio block: Improve blk_revalidate_disk_zones() checks sbitmap: simplify wrap check sbitmap: replace CAS with atomic and sbitmap: remove swap_lock sbitmap: optimise sbitmap_deferred_clear() blk-mq: skip hybrid polling if iopoll doesn't spin blk-iocost: Factor out the base vrate change into a separate function blk-iocost: Factor out the active iocgs' state check into a separate function blk-iocost: Move the usage ratio calculation to the correct place blk-iocost: Remove unnecessary advance declaration blk-iocost: Fix some typos in comments blktrace: fix up a kerneldoc comment block: remove the request_queue to argument request based tracepoints ...
2020-12-16Merge tag 'tif-task_work.arch-2020-12-14' of git://git.kernel.dk/linux-blockLinus Torvalds2-51/+1
Pull TIF_NOTIFY_SIGNAL updates from Jens Axboe: "This sits on top of of the core entry/exit and x86 entry branch from the tip tree, which contains the generic and x86 parts of this work. Here we convert the rest of the archs to support TIF_NOTIFY_SIGNAL. With that done, we can get rid of JOBCTL_TASK_WORK from task_work and signal.c, and also remove a deadlock work-around in io_uring around knowing that signal based task_work waking is invoked with the sighand wait queue head lock. The motivation for this work is to decouple signal notify based task_work, of which io_uring is a heavy user of, from sighand. The sighand lock becomes a huge contention point, particularly for threaded workloads where it's shared between threads. Even outside of threaded applications it's slower than it needs to be. Roman Gershman <romger@amazon.com> reported that his networked workload dropped from 1.6M QPS at 80% CPU to 1.0M QPS at 100% CPU after io_uring was changed to use TIF_NOTIFY_SIGNAL. The time was all spent hammering on the sighand lock, showing 57% of the CPU time there [1]. There are further cleanups possible on top of this. One example is TIF_PATCH_PENDING, where a patch already exists to use TIF_NOTIFY_SIGNAL instead. Hopefully this will also lead to more consolidation, but the work stands on its own as well" [1] https://github.com/axboe/liburing/issues/215 * tag 'tif-task_work.arch-2020-12-14' of git://git.kernel.dk/linux-block: (28 commits) io_uring: remove 'twa_signal_ok' deadlock work-around kernel: remove checking for TIF_NOTIFY_SIGNAL signal: kill JOBCTL_TASK_WORK io_uring: JOBCTL_TASK_WORK is no longer used by task_work task_work: remove legacy TWA_SIGNAL path sparc: add support for TIF_NOTIFY_SIGNAL riscv: add support for TIF_NOTIFY_SIGNAL nds32: add support for TIF_NOTIFY_SIGNAL ia64: add support for TIF_NOTIFY_SIGNAL h8300: add support for TIF_NOTIFY_SIGNAL c6x: add support for TIF_NOTIFY_SIGNAL alpha: add support for TIF_NOTIFY_SIGNAL xtensa: add support for TIF_NOTIFY_SIGNAL arm: add support for TIF_NOTIFY_SIGNAL microblaze: add support for TIF_NOTIFY_SIGNAL hexagon: add support for TIF_NOTIFY_SIGNAL csky: add support for TIF_NOTIFY_SIGNAL openrisc: add support for TIF_NOTIFY_SIGNAL sh: add support for TIF_NOTIFY_SIGNAL um: add support for TIF_NOTIFY_SIGNAL ...
2020-12-16Merge tag 'seccomp-v5.11-rc1' of ↵Linus Torvalds1-3/+293
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: "The major change here is finally gaining seccomp constant-action bitmaps, which internally reduces the seccomp overhead for many real-world syscall filters to O(1), as discussed at Plumbers this year. - Improve seccomp performance via constant-action bitmaps (YiFei Zhu & Kees Cook) - Fix bogus __user annotations (Jann Horn) - Add missed CONFIG for improved selftest coverage (Mickaël Salaün)" * tag 'seccomp-v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: selftests/seccomp: Update kernel config seccomp: Remove bogus __user annotations seccomp/cache: Report cache data through /proc/pid/seccomp_cache xtensa: Enable seccomp architecture tracking sh: Enable seccomp architecture tracking s390: Enable seccomp architecture tracking riscv: Enable seccomp architecture tracking powerpc: Enable seccomp architecture tracking parisc: Enable seccomp architecture tracking csky: Enable seccomp architecture tracking arm: Enable seccomp architecture tracking arm64: Enable seccomp architecture tracking selftests/seccomp: Compare bitmap vs filter overhead x86: Enable seccomp architecture tracking seccomp/cache: Add "emulator" to check if filter is constant allow seccomp/cache: Lookup syscall allowlist bitmap for fast path
2020-12-16Merge tag 'audit-pr-20201214' of ↵Linus Torvalds2-29/+18
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "A small set of audit patches for v5.11 with four patches in total and only one of any real significance. Richard's patch to trigger accompanying records causes the kernel to emit additional related records when an audit event occurs; helping provide some much needed context to events in the audit log. It is also worth mentioning that this is a revised patch based on an earlier attempt that had to be reverted in the v5.8 time frame. Everything passes our test suite, and with no problems reported please merge this for v5.11" * tag 'audit-pr-20201214' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: replace atomic_add_return() audit: fix macros warnings audit: trigger accompanying records when no rules present audit: fix a kernel-doc markup