Age | Commit message (Collapse) | Author | Files | Lines |
|
Restructure the code a bit to make it simpler, fix some formatting problems
and add READ_ONCE/WRITE_ONCE to make sure there's no compiler load/store
tearing to the variables that can be accessed across CPUs.
Started with Mathieu Desnoyers's patch:
Link: https://lore.kernel.org/lkml/20210203175741.20665-1-mathieu.desnoyers@efficios.com/
And will keep his signature, but I will take the responsibility of this
being correct, and keep the authorship.
Link: https://lkml.kernel.org/r/20210204143004.61126582@gandalf.local.home
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
With static calls, a tracepoint can call the callback directly if there is
only one callback registered to that tracepoint. When there is more than
one, the static call will call the tracepoint's "iterator" function, which
needs to reload the tracepoint's "funcs" array again, as it could have
changed since the first time it was loaded.
But an arch without static calls is punished by having to load the
tracepoint's "funcs" array twice. Once in the DO_TRACE macro, and once
again in the iterator macro.
For archs without static calls, there's no reason to load the array macro
in the first place, since the iterator function will do it anyway.
Change the __DO_TRACE_CALL() macro to do the load and call of the
tracepoints funcs array only for architectures with static calls, and just
call the iterator function directly for architectures without static calls.
Link: https://lkml.kernel.org/r/20210208201050.909329787@goodmis.org
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
While working on a clean up that would restructure the difference between
architectures that have static calls vs those that do not, I was stumbling
over the "data_args" parameter that includes "__data" in the arguments. The
issue was that one version didn't even need it, while the other one did.
Instead of injecting a "__data = NULL;" into the macro for the unneeded
version, just remove it completely.
The original idea behind data_args is that there may be a case of a
tracepoint with no arguments. But this is considered bad practice, and all
tracepoints should pass something to that location (that's what tracepoints
were created for).
Link: https://lkml.kernel.org/r/20210208201050.768074128@goodmis.org
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
The ftrace event subsystem is only created for showing the format files of
events created by the ftrace tracers, and are not trace events. The ftrace
subsystem currently has both the "enable" and "filter" files that in other
subsystems are used to enable/disable all events within the subsystem or set
a filter for all the subsystem events.
As ftrace subsystem events do not use enable or filter operations, these
files are useless in the ftrace subsystem. Remove them.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
The kernel thread executing test can run on any cpu, which might be
different cpu latency tracer is running on, as a result, the
big latency caused by preemptirq delay test can't be detected.
Therefore, the argument cpu_affinity is added to be passed to test,
ensure it's running on the same cpu with latency tracer.
e.g.
cyclictest -p 90 -m -c 0 -i 1000 -a 3
modprobe preemptirq_delay_test test_mode=preempt delay=500 \
burst_size=3 cpu_affinity=3
Link: https://lkml.kernel.org/r/1611797713-20965-1-git-send-email-chensong_2000@189.cn
Signed-off-by: Song Chen <chensong_2000@189.cn>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
The list of tracepoint callbacks is managed by an array that is protected
by RCU. To update this array, a new array is allocated, the updates are
copied over to the new array, and then the list of functions for the
tracepoint is switched over to the new array. After a completion of an RCU
grace period, the old array is freed.
This process happens for both adding a callback as well as removing one.
But on removing a callback, if the new array fails to be allocated, the
callback is not removed, and may be used after it is freed by the clients
of the tracepoint.
There's really no reason to fail if the allocation for a new array fails
when removing a function. Instead, the function can simply be replaced by a
stub function that could be cleaned up on the next modification of the
array. That is, instead of calling the function registered to the
tracepoint, it would call a stub function in its place.
Link: https://lore.kernel.org/r/20201115055256.65625-1-mmullins@mmlx.us
Link: https://lore.kernel.org/r/20201116175107.02db396d@gandalf.local.home
Link: https://lore.kernel.org/r/20201117211836.54acaef2@oasis.local.home
Link: https://lkml.kernel.org/r/20201118093405.7a6d2290@gandalf.local.home
[ Note, this version does use undefined compiler behavior (assuming that
a stub function with no parameters or return, can be called by a location
that thinks it has parameters but still no return value. Static calls
do the same thing, so this trick is not without precedent.
There's another solution that uses RCU tricks and is more complex, but
can be an alternative if this solution becomes an issue.
Link: https://lore.kernel.org/lkml/20210127170721.58bce7cc@gandalf.local.home/
]
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: netdev <netdev@vger.kernel.org>
Cc: bpf <bpf@vger.kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Florian Weimer <fw@deneb.enyo.de>
Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints")
Reported-by: syzbot+83aa762ef23b6f0d1991@syzkaller.appspotmail.com
Reported-by: syzbot+d29e58bb557324e55e5e@syzkaller.appspotmail.com
Reported-by: Matt Mullins <mmullins@mmlx.us>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Matt Mullins <mmullins@mmlx.us>
|
|
Defining DEBUG should only be done in development.
So remove DEBUG.
Link: https://lkml.kernel.org/r/20210115153348.131791-1-trix@redhat.com
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Add description for trace_array_put() parameter.
kernel/trace/trace.c:464: warning: Function parameter or member 'this_tr' not described in 'trace_array_put'
Link: https://lkml.kernel.org/r/20210112111202.23508-1-huobean@gmail.com
Signed-off-by: Bean Huo <beanhuo@micron.com>
[ Merged as one of the original fixes was already fixed by someone else ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
s/controling/controlling/p
Link: https://lkml.kernel.org/r/20210112045008.29834-1-unixbhaskar@gmail.com
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
There is a spelling mistake in the Kconfig help text. Fix it.
Link: https://lkml.kernel.org/r/20201216114051.12056-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
I can't imagine when or why `current' would return a NULL pointer. This
check was added in commit
72829bc3d63cd ("ftrace: move enums to ftrace.h and make helper function global")
but it doesn't give me hint why it was needed.
Assume `current' never returns a NULL pointer and remove the check.
Link: https://lkml.kernel.org/r/20210125194511.3924915-5-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
PREEMPT_RT does not report "serving softirq" because the tracing core
looks at the preemption counter while PREEMPT_RT does not update it
while processing softirqs in order to remain preemptible. The
information is stored somewhere else.
The in_serving_softirq() macro and the SOFTIRQ_OFFSET define are still
working but not on the preempt-counter.
Use in_serving_softirq() macro which works on PREEMPT_RT. On !PREEMPT_RT
the compiler (gcc-10 / clang-11) is smart enough to optimize the
in_serving_softirq() related read of the preemption counter away.
The only difference I noticed by using in_serving_softirq() on
!PREEMPT_RT is that gcc-10 implemented tracing_gen_ctx_flags() as
reading FLAG, jmp _tracing_gen_ctx_flags(). Without in_serving_softirq()
it inlined _tracing_gen_ctx_flags() into tracing_gen_ctx_flags().
Link: https://lkml.kernel.org/r/20210125194511.3924915-4-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Inline tracing_gen_ctx_flags(). This allows to have one ifdef
CONFIG_TRACE_IRQFLAGS_SUPPORT.
This requires to move `trace_flag_type' so tracing_gen_ctx_flags() can
use it.
Link: https://lkml.kernel.org/r/20210125194511.3924915-3-bigeasy@linutronix.de
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Link: https://lkml.kernel.org/r/20210125140323.6b1ff20c@gandalf.local.home
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
The state of the interrupts (irqflags) and the preemption counter are
both passed down to tracing_generic_entry_update(). Only one bit of
irqflags is actually required: The on/off state. The complete 32bit
of the preemption counter isn't needed. Just whether of the upper bits
(softirq, hardirq and NMI) are set and the preemption depth is needed.
The irqflags and the preemption counter could be evaluated early and the
information stored in an integer `trace_ctx'.
tracing_generic_entry_update() would use the upper bits as the
TRACE_FLAG_* and the lower 8bit as the disabled-preemption depth
(considering that one must be substracted from the counter in one
special cases).
The actual preemption value is not used except for the tracing record.
The `irqflags' variable is mostly used only for the tracing record. An
exception here is for instance wakeup_tracer_call() or
probe_wakeup_sched_switch() which explicilty disable interrupts and use
that `irqflags' to save (and restore) the IRQ state and to record the
state.
Struct trace_event_buffer has also the `pc' and flags' members which can
be replaced with `trace_ctx' since their actual value is not used
outside of trace recording.
This will reduce tracing_generic_entry_update() to simply assign values
to struct trace_entry. The evaluation of the TRACE_FLAG_* bits is moved
to _tracing_gen_ctx_flags() which replaces preempt_count() and
local_save_flags() invocations.
As an example, ftrace_syscall_enter() may invoke:
- trace_buffer_lock_reserve() -> … -> tracing_generic_entry_update()
- event_trigger_unlock_commit()
-> ftrace_trace_stack() -> … -> tracing_generic_entry_update()
-> ftrace_trace_userstack() -> … -> tracing_generic_entry_update()
In this case the TRACE_FLAG_* bits were evaluated three times. By using
the `trace_ctx' they are evaluated once and assigned three times.
A build with all tracers enabled on x86-64 with and without the patch:
text data bss dec hex filename
21970669 17084168 7639260 46694097 2c87ed1 vmlinux.old
21970293 17084168 7639260 46693721 2c87d59 vmlinux.new
text shrank by 379 bytes, data remained constant.
Link: https://lkml.kernel.org/r/20210125194511.3924915-2-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Remove the cpumask check, as we has done it at the beginning of
the function.
Also fix a typo. s/also the on the/also on the/
Link: https://lkml.kernel.org/r/20201224144634.3210-1-hqjagain@gmail.com
Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
The cpu_buffer argument is not used inside the rb_inc_page() after
commit 3adc54fa82a6 ("ring-buffer: make the buffer a true circular link
list").
And cpu_buffer argument is not used inside the two functions too,
rb_is_head_page/rb_set_list_to_head.
Link: https://lkml.kernel.org/r/20201225140356.23008-1-hqjagain@gmail.com
Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Since commit b6f11df26fdc ("trace: Call tracing_reset_online_cpus before
tracer->init()"), get/put_cpu() are not needed anymore.
We can use raw_smp_processor_id() instead.
Link: https://lkml.kernel.org/r/20201230140521.31920-1-hqjagain@gmail.com
Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Update kernel-doc parameter after
commit b3b1e6ededa4 ("ftrace: Create set_ftrace_notrace_pid to not trace tasks")
added @filtered_no_pids.
Link: https://lkml.kernel.org/r/20201231153558.4804-1-hqjagain@gmail.com
Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Attributing the function allows the compiler to more thoroughly
check the use of the function with -Wformat and similar flags.
Link: https://lkml.kernel.org/r/20201221162715.3757291-1-trix@redhat.com
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
When executing a tracepoint, the tracepoint's func is dereferenced twice -
in __DO_TRACE() (where the returned pointer is checked) and later on in
__traceiter_##_name where the returned pointer is dereferenced without
checking which leads to races against tracepoint_removal_sync() and
crashes.
This adds a check before referencing the pointer in tracepoint_ptr_deref.
Link: https://lkml.kernel.org/r/20210202072326.120557-1-aik@ozlabs.ru
Cc: stable@vger.kernel.org
Fixes: d25e37d89dd2f ("tracepoint: Optimize using static_call()")
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Our system encountered a re-init error when re-registering same kretprobe,
where the kretprobe_instance in rp->free_instances is illegally accessed
after re-init.
Implementation to avoid re-registration has been introduced for kprobe
before, but lags for register_kretprobe(). We must check if kprobe has
been re-registered before re-initializing kretprobe, otherwise it will
destroy the data struct of kretprobe registered, which can lead to memory
leak, system crash, also some unexpected behaviors.
We use check_kprobe_rereg() to check if kprobe has been re-registered
before running register_kretprobe()'s body, for giving a warning message
and terminate registration process.
Link: https://lkml.kernel.org/r/20210128124427.2031088-1-bobo.shaobowang@huawei.com
Cc: stable@vger.kernel.org
Fixes: 1f0ab40976460 ("kprobes: Prevent re-registration of the same kprobe")
[ The above commit should have been done for kretprobes too ]
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@linux.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Fix kprobe_on_func_entry() returns error code instead of false so that
register_kretprobe() can return an appropriate error code.
append_trace_kprobe() expects the kprobe registration returns -ENOENT
when the target symbol is not found, and it checks whether the target
module is unloaded or not. If the target module doesn't exist, it
defers to probe the target symbol until the module is loaded.
However, since register_kretprobe() returns -EINVAL instead of -ENOENT
in that case, it always fail on putting the kretprobe event on unloaded
modules. e.g.
Kprobe event:
/sys/kernel/debug/tracing # echo p xfs:xfs_end_io >> kprobe_events
[ 16.515574] trace_kprobe: This probe might be able to register after target module is loaded. Continue.
Kretprobe event: (p -> r)
/sys/kernel/debug/tracing # echo r xfs:xfs_end_io >> kprobe_events
sh: write error: Invalid argument
/sys/kernel/debug/tracing # cat error_log
[ 41.122514] trace_kprobe: error: Failed to register probe event
Command: r xfs:xfs_end_io
^
To fix this bug, change kprobe_on_func_entry() to detect symbol lookup
failure and return -ENOENT in that case. Otherwise it returns -EINVAL
or 0 (succeeded, given address is on the entry).
Link: https://lkml.kernel.org/r/161176187132.1067016.8118042342894378981.stgit@devnote2
Cc: stable@vger.kernel.org
Fixes: 59158ec4aef7 ("tracing/kprobes: Check the probe on unloaded module correctly")
Reported-by: Jianlin Lv <Jianlin.Lv@arm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Eaerlier, tracing was disabled when reading the trace file. This behavior
was changed with:
commit 06e0a548bad0 ("tracing: Do not disable tracing when reading the
trace file").
This doesn't seem to work with the latency tracers.
The above mentioned commit dit not only change the behavior but also added
an option to emulate the old behavior. The idea with this patch is to
enable this pause-on-trace option when the latency tracers are used.
Link: https://lkml.kernel.org/r/20210119164344.37500-2-Viktor.Rosendahl@bmw.de
Cc: stable@vger.kernel.org
Fixes: 06e0a548bad0 ("tracing: Do not disable tracing when reading the trace file")
Signed-off-by: Viktor Rosendahl <Viktor.Rosendahl@bmw.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
On some archs, the idle task can call into cpu_suspend(). The cpu_suspend()
will disable or pause function graph tracing, as there's some paths in
bringing down the CPU that can have issues with its return address being
modified. The task_struct structure has a "tracing_graph_pause" atomic
counter, that when set to something other than zero, the function graph
tracer will not modify the return address.
The problem is that the tracing_graph_pause counter is initialized when the
function graph tracer is enabled. This can corrupt the counter for the idle
task if it is suspended in these architectures.
CPU 1 CPU 2
----- -----
do_idle()
cpu_suspend()
pause_graph_tracing()
task_struct->tracing_graph_pause++ (0 -> 1)
start_graph_tracing()
for_each_online_cpu(cpu) {
ftrace_graph_init_idle_task(cpu)
task-struct->tracing_graph_pause = 0 (1 -> 0)
unpause_graph_tracing()
task_struct->tracing_graph_pause-- (0 -> -1)
The above should have gone from 1 to zero, and enabled function graph
tracing again. But instead, it is set to -1, which keeps it disabled.
There's no reason that the field tracing_graph_pause on the task_struct can
not be initialized at boot up.
Cc: stable@vger.kernel.org
Fixes: 380c4b1411ccd ("tracing/function-graph-tracer: append the tracing_graph_flag")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211339
Reported-by: pierre.gondois@arm.com
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
|
|
Pull arch/sh updates from Rich Felker:
"Cleanup and warning fixes"
* tag 'sh-for-5.11' of git://git.libc.org/linux-sh:
sh/intc: Restore devm_ioremap() alignment
sh: mach-sh03: remove duplicate include
arch: sh: remove duplicate include
sh: Drop ARCH_NR_GPIOS definition
sh: Remove unused HAVE_COPY_THREAD_TLS macro
sh: remove CONFIG_IDE from most defconfig
sh: mm: Convert to DEFINE_SHOW_ATTRIBUTE
sh: intc: Convert to DEFINE_SHOW_ATTRIBUTE
arch/sh: hyphenate Non-Uniform in Kconfig prompt
sh: dma: fix kconfig dependency for G2_DMA
|
|
Pull io_uring fixes from Jens Axboe:
"Still need a final cancelation fix that isn't quite done done,
expected in the next day or two. That said, this contains:
- Wakeup fix for IOPOLL requests
- SQPOLL split close op handling fix
- Ensure that any use of io_uring fd itself is marked as inflight
- Short non-regular file read fix (Pavel)
- Fix up bad false positive warning (Pavel)
- SQPOLL fixes (Pavel)
- In-flight removal fix (Pavel)"
* tag 'io_uring-5.11-2021-01-24' of git://git.kernel.dk/linux-block:
io_uring: account io_uring internal files as REQ_F_INFLIGHT
io_uring: fix sleeping under spin in __io_clean_op
io_uring: fix short read retries for non-reg files
io_uring: fix SQPOLL IORING_OP_CLOSE cancelation state
io_uring: fix skipping disabling sqo on exec
io_uring: fix uring_flush in exit_files() warning
io_uring: fix false positive sqo warning on flush
io_uring: iopoll requests should also wake task ->in_idle state
|
|
Pull block fixes from Jens Axboe:
- NVMe pull request from Christoph:
- fix a status code in nvmet (Chaitanya Kulkarni)
- avoid double completions in nvme-rdma/nvme-tcp (Chao Leng)
- fix the CMB support to cope with NVMe 1.4 controllers (Klaus Jensen)
- fix PRINFO handling in the passthrough ioctl (Revanth Rajashekar)
- fix a double DMA unmap in nvme-pci
- lightnvm error path leak fix (Pan)
- MD pull request from Song:
- Flush request fix (Xiao)
* tag 'block-5.11-2021-01-24' of git://git.kernel.dk/linux-block:
lightnvm: fix memory leak when submit fails
nvme-pci: fix error unwind in nvme_map_data
nvme-pci: refactor nvme_unmap_data
md: Set prev_flush_start and flush_bio in an atomic way
nvmet: set right status on error in id-ns handler
nvme-pci: allow use of cmb on v1.4 controllers
nvme-tcp: avoid request double completion for concurrent nvme_tcp_timeout
nvme-rdma: avoid request double completion for concurrent nvme_rdma_timeout
nvme: check the PRINFO bit before deciding the host buffer length
|
|
Merge misc fixes from Andrew Morton:
"18 patches.
Subsystems affected by this patch series: mm (pagealloc, memcg, kasan,
memory-failure, and highmem), ubsan, proc, and MAINTAINERS"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
MAINTAINERS: add a couple more files to the Clang/LLVM section
proc_sysctl: fix oops caused by incorrect command parameters
powerpc/mm/highmem: use __set_pte_at() for kmap_local()
mips/mm/highmem: use set_pte() for kmap_local()
mm/highmem: prepare for overriding set_pte_at()
sparc/mm/highmem: flush cache and TLB
mm: fix page reference leak in soft_offline_page()
ubsan: disable unsigned-overflow check for i386
kasan, mm: fix resetting page_alloc tags for HW_TAGS
kasan, mm: fix conflicts with init_on_alloc/free
kasan: fix HW_TAGS boot parameters
kasan: fix incorrect arguments passing in kasan_add_zero_shadow
kasan: fix unaligned address is unhandled in kasan_remove_zero_shadow
mm: fix numa stats for thp migration
mm: memcg: fix memcg file_dirty numa stat
mm: memcg/slab: optimize objcg stock draining
mm: fix initialization of struct page for holes in memory layout
x86/setup: don't remove E820_TYPE_RAM for pfn 0
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small char/misc driver fixes for 5.11-rc5:
- habanalabs driver fixes
- phy driver fixes
- hwtracing driver fixes
- rtsx cardreader driver fix
All of these have been in linux-next with no reported issues"
* tag 'char-misc-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
misc: rtsx: init value of aspm_enabled
habanalabs: disable FW events on device removal
habanalabs: fix backward compatibility of idle check
habanalabs: zero pci counters packet before submit to FW
intel_th: pci: Add Alder Lake-P support
stm class: Fix module init return on allocation failure
habanalabs: prevent soft lockup during unmap
habanalabs: fix reset process in case of failures
habanalabs: fix dma_addr passed to dma_mmap_coherent
phy: mediatek: allow compile-testing the dsi phy
phy: cpcap-usb: Fix warning for missing regulator_disable
PHY: Ingenic: fix unconditional build of phy-ingenic-usb
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are some small driver core fixes for 5.11-rc5 that resolve some
reported problems:
- revert of a -rc1 patch that was causing problems with some machines
- device link device name collision problem fix (busses only have to
name devices unique to their bus, not unique to all busses)
- kernfs splice bugfixes to resolve firmware loading problems for
Qualcomm systems.
- other tiny driver core fixes for minor issues reported.
All of these have been in linux-next with no reported problems"
* tag 'driver-core-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: Fix device link device name collision
driver core: Extend device_is_dependent()
kernfs: wire up ->splice_read and ->splice_write
kernfs: implement ->write_iter
kernfs: implement ->read_iter
Revert "driver core: Reorder devices on successful probe"
Driver core: platform: Add extra error check in devm_platform_get_irqs_affinity()
drivers core: Free dma_range_map when driver probe failed
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging/IIO driver fixes from Greg KH:
"Here are some IIO driver fixes for 5.11-rc5 to resolve some reported
problems.
Nothing major, just a few small fixes, all of these have been in
linux-next for a while and full details are in the shortlog"
* tag 'staging-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: sx9310: Fix semtech,avg-pos-strength setting when > 16
iio: common: st_sensors: fix possible infinite loop in st_sensors_irq_thread
iio: ad5504: Fix setting power-down state
counter:ti-eqep: remove floor
drivers: iio: temperature: Add delay after the addressed reset command in mlx90632.c
iio: adc: ti_am335x_adc: remove omitted iio_kfifo_free()
dt-bindings: iio: accel: bma255: Fix bmc150/bmi055 compatible
iio: sx9310: Off by one in sx9310_read_thresh()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are three small tty/serial fixes for 5.11-rc5 to resolve reported
problems:
- two patches to fix up writing to ttys with splice
- mvebu-uart driver fix for reported problem
All of these have been in linux-next with no reported problems"
* tag 'tty-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: fix up hung_up_tty_write() conversion
tty: implement write_iter
serial: mvebu-uart: fix tx lost characters at power off
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB driver fixes for 5.11-rc5. They resolve:
- xhci issues for some reported problems
- ehci driver issue for one specific device
- USB gadget fixes for some reported problems
- cdns3 driver fixes for issues reported
- MAINTAINERS file update
- thunderbolt minor fix
All of these have been in linux-next with no reported issues"
* tag 'usb-5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: bdc: Make bdc pci driver depend on BROKEN
xhci: tegra: Delay for disabling LFPS detector
xhci: make sure TRB is fully written before giving it to the controller
usb: udc: core: Use lock when write to soft_connect
USB: gadget: dummy-hcd: Fix errors in port-reset handling
usb: gadget: aspeed: fix stop dma register setting.
USB: ehci: fix an interrupt calltrace error
ehci: fix EHCI host controller initialization sequence
MAINTAINERS: update Peter Chen's email address
thunderbolt: Drop duplicated 0x prefix from format string
MAINTAINERS: Update address for Cadence USB3 driver
usb: cdns3: imx: improve driver .remove API
usb: cdns3: imx: fix can't create core device the second time issue
usb: cdns3: imx: fix writing read-only memory issue
|
|
The K: entry should ensure that Nick and I always get CC'd on patches that
touch these files but it is better to be explicit rather than implicit.
Link: https://lkml.kernel.org/r/20210114004059.2129921-1-natechancellor@gmail.com
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The process_sysctl_arg() does not check whether val is empty before
invoking strlen(val). If the command line parameter () is incorrectly
configured and val is empty, oops is triggered.
For example:
"hung_task_panic=1" is incorrectly written as "hung_task_panic", oops is
triggered. The call stack is as follows:
Kernel command line: .... hung_task_panic
......
Call trace:
__pi_strlen+0x10/0x98
parse_args+0x278/0x344
do_sysctl_args+0x8c/0xfc
kernel_init+0x5c/0xf4
ret_from_fork+0x10/0x30
To fix it, check whether "val" is empty when "phram" is a sysctl field.
Error codes are returned in the failure branch, and error logs are
generated by parse_args().
Link: https://lkml.kernel.org/r/20210118133029.28580-1-nixiaoming@huawei.com
Fixes: 3db978d480e2843 ("kernel/sysctl: support setting sysctl parameters from kernel command line")
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: <stable@vger.kernel.org> [5.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The original PowerPC highmem mapping function used __set_pte_at() to
denote that the mapping is per CPU. This got lost with the conversion
to the generic implementation.
Override the default map function.
Link: https://lkml.kernel.org/r/20210112170411.281464308@linutronix.de
Fixes: 47da42b27a56 ("powerpc/mm/highmem: Switch to generic kmap atomic")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Paul Cercueil <paul@crapouillou.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
set_pte_at() on MIPS invokes update_cache() which might recurse into
kmap_local().
Use set_pte() like the original MIPS highmem implementation did.
Link: https://lkml.kernel.org/r/20210112170411.187513575@linutronix.de
Fixes: a4c33e83bca1 ("mips/mm/highmem: Switch to generic kmap atomic")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Paul Cercueil <paul@crapouillou.net>
Reported-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The generic kmap_local() map function uses set_pte_at(), but MIPS requires
set_pte() and PowerPC wants __set_pte_at().
Provide arch_kmap_local_set_pte() and default it to set_pte_at().
Link: https://lkml.kernel.org/r/20210112170411.056306194@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Cercueil <paul@crapouillou.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Patch series "mm/highmem: Fix fallout from generic kmap_local
conversions".
The kmap_local conversion wreckaged sparc, mips and powerpc as it missed
some of the details in the original implementation.
This patch (of 4):
The recent conversion to the generic kmap_local infrastructure failed to
assign the proper pre/post map/unmap flush operations for sparc.
Sparc requires cache flush before map/unmap and tlb flush afterwards.
Link: https://lkml.kernel.org/r/20210112170136.078559026@linutronix.de
Link: https://lkml.kernel.org/r/20210112170410.905976187@linutronix.de
Fixes: 3293efa97807 ("sparc/mm/highmem: Switch to generic kmap atomic")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Andreas Larsson <andreas@gaisler.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Cercueil <paul@crapouillou.net>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The conversion to move pfn_to_online_page() internal to
soft_offline_page() missed that the get_user_pages() reference taken by
the madvise() path needs to be dropped when pfn_to_online_page() fails.
Note the direct sysfs-path to soft_offline_page() does not perform a
get_user_pages() lookup.
When soft_offline_page() is handed a pfn_valid() && !pfn_to_online_page()
pfn the kernel hangs at dax-device shutdown due to a leaked reference.
Link: https://lkml.kernel.org/r/161058501210.1840162.8108917599181157327.stgit@dwillia2-desk3.amr.corp.intel.com
Fixes: feec24a6139d ("mm, soft-offline: convert parameter to pfn")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Building ubsan kernels even for compile-testing introduced these
warnings in my randconfig environment:
crypto/blake2b_generic.c:98:13: error: stack frame size of 9636 bytes in function 'blake2b_compress' [-Werror,-Wframe-larger-than=]
static void blake2b_compress(struct blake2b_state *S,
crypto/sha512_generic.c:151:13: error: stack frame size of 1292 bytes in function 'sha512_generic_block_fn' [-Werror,-Wframe-larger-than=]
static void sha512_generic_block_fn(struct sha512_state *sst, u8 const *src,
lib/crypto/curve25519-fiat32.c:312:22: error: stack frame size of 2180 bytes in function 'fe_mul_impl' [-Werror,-Wframe-larger-than=]
static noinline void fe_mul_impl(u32 out[10], const u32 in1[10], const u32 in2[10])
lib/crypto/curve25519-fiat32.c:444:22: error: stack frame size of 1588 bytes in function 'fe_sqr_impl' [-Werror,-Wframe-larger-than=]
static noinline void fe_sqr_impl(u32 out[10], const u32 in1[10])
Further testing showed that this is caused by
-fsanitize=unsigned-integer-overflow, but is isolated to the 32-bit x86
architecture.
The one in blake2b immediately overflows the 8KB stack area
architectures, so better ensure this never happens by disabling the
option for 32-bit x86.
Link: https://lkml.kernel.org/r/20210112202922.2454435-1-arnd@kernel.org
Link: https://lore.kernel.org/lkml/20201230154749.746641-1-arnd@kernel.org/
Fixes: d0a3ac549f38 ("ubsan: enable for all*config builds")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Marco Elver <elver@google.com>
Cc: George Popescu <georgepope@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A previous commit added resetting KASAN page tags to
kernel_init_free_pages() to avoid false-positives due to accesses to
metadata with the hardware tag-based mode.
That commit did reset page tags before the metadata access, but didn't
restore them after. As the result, KASAN fails to detect bad accesses
to page_alloc allocations on some configurations.
Fix this by recovering the tag after the metadata access.
Link: https://lkml.kernel.org/r/02b5bcd692e912c27d484030f666b350ad7e4ae4.1611074450.git.andreyknvl@google.com
Fixes: aa1ef4d7b3f6 ("kasan, mm: reset tags when accessing metadata")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Branislav Rankov <Branislav.Rankov@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A few places where SLUB accesses object's data or metadata were missed
in a previous patch. This leads to false positives with hardware
tag-based KASAN when bulk allocations are used with init_on_alloc/free.
Fix the false-positives by resetting pointer tags during these accesses.
(The kasan_reset_tag call is removed from slab_alloc_node, as it's added
into maybe_wipe_obj_freeptr.)
Link: https://linux-review.googlesource.com/id/I50dd32838a666e173fe06c3c5c766f2c36aae901
Link: https://lkml.kernel.org/r/093428b5d2ca8b507f4a79f92f9929b35f7fada7.1610731872.git.andreyknvl@google.com
Fixes: aa1ef4d7b3f67 ("kasan, mm: reset tags when accessing metadata")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Branislav Rankov <Branislav.Rankov@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The initially proposed KASAN command line parameters are redundant.
This change drops the complex "kasan.mode=off/prod/full" parameter and
adds a simpler kill switch "kasan=off/on" instead. The new parameter
together with the already existing ones provides a cleaner way to
express the same set of features.
The full set of parameters with this change:
kasan=off/on - whether KASAN is enabled
kasan.fault=report/panic - whether to only print a report or also panic
kasan.stacktrace=off/on - whether to collect alloc/free stack traces
Default values:
kasan=on
kasan.fault=report
kasan.stacktrace=on (if CONFIG_DEBUG_KERNEL=y)
kasan.stacktrace=off (otherwise)
Link: https://linux-review.googlesource.com/id/Ib3694ed90b1e8ccac6cf77dfd301847af4aba7b8
Link: https://lkml.kernel.org/r/4e9c4a4bdcadc168317deb2419144582a9be6e61.1610736745.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Branislav Rankov <Branislav.Rankov@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
kasan_remove_zero_shadow() shall use original virtual address, start and
size, instead of shadow address.
Link: https://lkml.kernel.org/r/20210103063847.5963-1-lecopzer@gmail.com
Fixes: 0207df4fa1a86 ("kernel/memremap, kasan: make ZONE_DEVICE with work with KASAN")
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
During testing kasan_populate_early_shadow and kasan_remove_zero_shadow,
if the shadow start and end address in kasan_remove_zero_shadow() is not
aligned to PMD_SIZE, the remain unaligned PTE won't be removed.
In the test case for kasan_remove_zero_shadow():
shadow_start: 0xffffffb802000000, shadow end: 0xffffffbfbe000000
3-level page table:
PUD_SIZE: 0x40000000 PMD_SIZE: 0x200000 PAGE_SIZE: 4K
0xffffffbf80000000 ~ 0xffffffbfbdf80000 will not be removed because in
kasan_remove_pud_table(), kasan_pmd_table(*pud) is true but the next
address is 0xffffffbfbdf80000 which is not aligned to PUD_SIZE.
In the correct condition, this should fallback to the next level
kasan_remove_pmd_table() but the condition flow always continue to skip
the unaligned part.
Fix by correcting the condition when next and addr are neither aligned.
Link: https://lkml.kernel.org/r/20210103135621.83129-1-lecopzer@gmail.com
Fixes: 0207df4fa1a86 ("kernel/memremap, kasan: make ZONE_DEVICE with work with KASAN")
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: YJ Chiang <yj.chiang@mediatek.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Borislav Petkov:
- Fix a kernel panic in mips-cpu due to invalid irq domain hierarchy.
- Fix to not lose IPIs on bcm2836.
- Fix for a bogus marking of ITS devices as shared due to unitialized
stack variable.
- Clear a phantom interrupt on qcom-pdc to unblock suspend.
- Small cleanups, warning and build fixes.
* tag 'irq_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Export irq_check_status_bit()
irqchip/mips-cpu: Set IPI domain parent chip
irqchip/pruss: Simplify the TI_PRUSS_INTC Kconfig
irqchip/loongson-liointc: Fix build warnings
driver core: platform: Add extra error check in devm_platform_get_irqs_affinity()
irqchip/bcm2836: Fix IPI acknowledgement after conversion to handle_percpu_devid_irq
irqchip/irq-sl28cpld: Convert comma to semicolon
genirq/msi: Initialize msi_alloc_info before calling msi_domain_prepare_irqs()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fixes from Borislav Petkov:
- Adjust objtool to handle a recent binutils change to not generate
unused symbols anymore.
- Revert the fail-the-build-on-fatal-errors objtool strategy for now
due to the ever-increasing matrix of supported toolchains/plugins and
them causing too many such fatal errors currently.
- Do not add empty symbols to objdump's rbtree to accommodate clang
removing section symbols.
* tag 'objtool_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Don't fail on missing symbol table
objtool: Don't fail the kernel build on fatal errors
objtool: Don't add empty symbols to the rbtree
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:
- Correct the marking of kthreads which are supposed to run on a
specific, single CPU vs such which are affine to only one CPU, mark
per-cpu workqueue threads as such and make sure that marking
"survives" CPU hotplug. Fix CPU hotplug issues with such kthreads.
- A fix to not push away tasks on CPUs coming online.
- Have workqueue CPU hotplug code use cpu_possible_mask when breaking
affinity on CPU offlining so that pending workers can finish on newly
arrived onlined CPUs too.
- Dump tasks which haven't vacated a CPU which is currently being
unplugged.
- Register a special scale invariance callback which gets called on
resume from RAM to read out APERF/MPERF after resume and thus make
the schedutil scaling governor more precise.
* tag 'sched_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Relax the set_cpus_allowed_ptr() semantics
sched: Fix CPU hotplug / tighten is_per_cpu_kthread()
sched: Prepare to use balance_push in ttwu()
workqueue: Restrict affinity change to rescuer
workqueue: Tag bound workers with KTHREAD_IS_PER_CPU
kthread: Extract KTHREAD_IS_PER_CPU
sched: Don't run cpu-online with balance_push() enabled
workqueue: Use cpu_possible_mask instead of cpu_active_mask to break affinity
sched/core: Print out straggler tasks in sched_cpu_dying()
x86: PM: Register syscore_ops for scale invariance
|