summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)AuthorFilesLines
2020-12-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-1/+16
Pull KVM updates from Paolo Bonzini: "Much x86 work was pushed out to 5.12, but ARM more than made up for it. ARM: - PSCI relay at EL2 when "protected KVM" is enabled - New exception injection code - Simplification of AArch32 system register handling - Fix PMU accesses when no PMU is enabled - Expose CSV3 on non-Meltdown hosts - Cache hierarchy discovery fixes - PV steal-time cleanups - Allow function pointers at EL2 - Various host EL2 entry cleanups - Simplification of the EL2 vector allocation s390: - memcg accouting for s390 specific parts of kvm and gmap - selftest for diag318 - new kvm_stat for when async_pf falls back to sync x86: - Tracepoints for the new pagetable code from 5.10 - Catch VFIO and KVM irqfd events before userspace - Reporting dirty pages to userspace with a ring buffer - SEV-ES host support - Nested VMX support for wait-for-SIPI activity state - New feature flag (AVX512 FP16) - New system ioctl to report Hyper-V-compatible paravirtualization features Generic: - Selftest improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits) KVM: SVM: fix 32-bit compilation KVM: SVM: Add AP_JUMP_TABLE support in prep for AP booting KVM: SVM: Provide support to launch and run an SEV-ES guest KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests KVM: SVM: Provide support for SEV-ES vCPU loading KVM: SVM: Provide support for SEV-ES vCPU creation/loading KVM: SVM: Update ASID allocation to support SEV-ES guests KVM: SVM: Set the encryption mask for the SVM host save area KVM: SVM: Add NMI support for an SEV-ES guest KVM: SVM: Guest FPU state save/restore not needed for SEV-ES guest KVM: SVM: Do not report support for SMM for an SEV-ES guest KVM: x86: Update __get_sregs() / __set_sregs() to support SEV-ES KVM: SVM: Add support for CR8 write traps for an SEV-ES guest KVM: SVM: Add support for CR4 write traps for an SEV-ES guest KVM: SVM: Add support for CR0 write traps for an SEV-ES guest KVM: SVM: Add support for EFER write traps for an SEV-ES guest KVM: SVM: Support string IO operations for an SEV-ES guest KVM: SVM: Support MMIO for an SEV-ES guest KVM: SVM: Create trace events for VMGEXIT MSR protocol processing KVM: SVM: Create trace events for VMGEXIT processing ...
2020-12-19epoll: wire up syscall epoll_pwait2Willem de Bruijn1-0/+2
Split off from prev patch in the series that implements the syscall. Link: https://lkml.kernel.org/r/20201121144401.3727659-4-willemdebruijn.kernel@gmail.com Signed-off-by: Willem de Bruijn <willemb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-18Merge tag 'trace-v5.11' of ↵Linus Torvalds36-352/+662
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "The major update to this release is that there's a new arch config option called CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS. Currently, only x86_64 enables it. All the ftrace callbacks now take a struct ftrace_regs instead of a struct pt_regs. If the architecture has HAVE_DYNAMIC_FTRACE_WITH_ARGS enabled, then the ftrace_regs will have enough information to read the arguments of the function being traced, as well as access to the stack pointer. This way, if a user (like live kernel patching) only cares about the arguments, then it can avoid using the heavier weight "regs" callback, that puts in enough information in the struct ftrace_regs to simulate a breakpoint exception (needed for kprobes). A new config option that audits the timestamps of the ftrace ring buffer at most every event recorded. Ftrace recursion protection has been cleaned up to move the protection to the callback itself (this saves on an extra function call for those callbacks). Perf now handles its own RCU protection and does not depend on ftrace to do it for it (saving on that extra function call). New debug option to add "recursed_functions" file to tracefs that lists all the places that triggered the recursion protection of the function tracer. This will show where things need to be fixed as recursion slows down the function tracer. The eval enum mapping updates done at boot up are now offloaded to a work queue, as it caused a noticeable pause on slow embedded boards. Various clean ups and last minute fixes" * tag 'trace-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits) tracing: Offload eval map updates to a work queue Revert: "ring-buffer: Remove HAVE_64BIT_ALIGNED_ACCESS" ring-buffer: Add rb_check_bpage in __rb_allocate_pages ring-buffer: Fix two typos in comments tracing: Drop unneeded assignment in ring_buffer_resize() tracing: Disable ftrace selftests when any tracer is running seq_buf: Avoid type mismatch for seq_buf_init ring-buffer: Fix a typo in function description ring-buffer: Remove obsolete rb_event_is_commit() ring-buffer: Add test to validate the time stamp deltas ftrace/documentation: Fix RST C code blocks tracing: Clean up after filter logic rewriting tracing: Remove the useless value assignment in test_create_synth_event() livepatch: Use the default ftrace_ops instead of REGS when ARGS is available ftrace/x86: Allow for arguments to be passed in to ftrace_regs by default ftrace: Have the callbacks receive a struct ftrace_regs instead of pt_regs MAINTAINERS: assign ./fs/tracefs to TRACING tracing: Fix some typos in comments ftrace: Remove unused varible 'ret' ring-buffer: Add recording of ring buffer recursion into recursed_functions ...
2020-12-18Merge tag 'modules-for-v5.11' of ↵Linus Torvalds2-89/+121
git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull modules updates from Jessica Yu: "Summary of modules changes for the 5.11 merge window: - Fix a race condition between systemd/udev and the module loader. The module loader was sending a uevent before the module was fully initialized (i.e., before its init function has been called). This means udev can start processing the module uevent before the module has finished initializing, and some udev rules expect that the module has initialized already upon receiving the uevent. This resulted in some systemd mount units failing if udev processes the event faster than the module can finish init. This is fixed by delaying the uevent until after the module has called its init routine. - Make the linker array sections for kernel params and module version attributes more robust by switching to use the alignment of the type in question. Namely, linker section arrays will be constructed using the alignment required by the struct (using __alignof__()) as opposed to a specific value such as sizeof(void *) or sizeof(long). This is less likely to cause breakages should the size of the type ever change (Johan Hovold) - Fix module state inconsistency by setting it back to GOING when a module fails to load and is on its way out (Miroslav Benes) - Some comment and code cleanups (Sergey Shtylyov)" * tag 'modules-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: module: delay kobject uevent until after module init call module: drop semicolon from version macro init: use type alignment for kernel parameters params: clean up module-param macros params: use type alignment for kernel parameters params: drop redundant "unused" attributes module: simplify version-attribute handling module: drop version-attribute alignment module: fix comment style module: add more 'kernel-doc' comments module: fix up 'kernel-doc' comments module: only handle errors with the *switch* statement in module_sig_check() module: avoid *goto*s in module_sig_check() module: merge repetitive strings in module_sig_check() module: set MODULE_STATE_GOING state when a module fails to load
2020-12-17Merge tag 'fsnotify_for_v5.11-rc1' of ↵Linus Torvalds3-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify updates from Jan Kara: "A few fsnotify fixes from Amir fixing fallout from big fsnotify overhaul a few months back and an improvement of defaults limiting maximum number of inotify watches from Waiman" * tag 'fsnotify_for_v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fsnotify: fix events reported to watching parent and child inotify: convert to handle_inode_event() interface fsnotify: generalize handle_inode_event() inotify: Increase default inotify.max_user_watches limit to 1048576
2020-12-17Merge tag 'arm-soc-drivers-5.11' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC driver updates from Arnd Bergmann: "There are a couple of subsystems maintained by other people that merge their drivers through the SoC tree, those changes include: - The SCMI firmware framework gains support for sensor notifications and for controlling voltage domains. - A large update for the Tegra memory controller driver, integrating it better with the interconnect framework - The memory controller subsystem gains support for Mediatek MT8192 - The reset controller framework gains support for sharing pulsed resets For Soc specific drivers in drivers/soc, the main changes are - The Allwinner/sunxi MBUS gets a rework for the way it handles dma_map_ops and offsets between physical and dma address spaces. - An errata fix plus some cleanups for Freescale Layerscape SoCs - A cleanup for renesas drivers regarding MMIO accesses. - New SoC specific drivers for Mediatek MT8192 and MT8183 power domains - New SoC specific drivers for Aspeed AST2600 LPC bus control and SoC identification. - Core Power Domain support for Qualcomm MSM8916, MSM8939, SDM660 and SDX55. - A rework of the TI AM33xx 'genpd' power domain support to use information from DT instead of platform data - Support for TI AM64x SoCs - Allow building some Amlogic drivers as modules instead of built-in Finally, there are numerous cleanups and smaller bug fixes for Mediatek, Tegra, Samsung, Qualcomm, TI OMAP, Amlogic, Rockchips, Renesas, and Xilinx SoCs" * tag 'arm-soc-drivers-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (222 commits) soc: mediatek: mmsys: Specify HAS_IOMEM dependency for MTK_MMSYS firmware: xilinx: Properly align function parameter firmware: xilinx: Add a blank line after function declaration firmware: xilinx: Remove additional newline firmware: xilinx: Fix kernel-doc warnings firmware: xlnx-zynqmp: fix compilation warning soc: xilinx: vcu: add missing register NUM_CORE soc: xilinx: vcu: use vcu-settings syscon registers dt-bindings: soc: xlnx: extract xlnx, vcu-settings to separate binding soc: xilinx: vcu: drop useless success message clk: samsung: mark PM functions as __maybe_unused soc: samsung: exynos-chipid: initialize later - with arch_initcall soc: samsung: exynos-chipid: order list of SoCs by name memory: jz4780_nemc: Fix potential NULL dereference in jz4780_nemc_probe() memory: ti-emif-sram: only build for ARMv7 memory: tegra30: Support interconnect framework memory: tegra20: Support hardware versioning and clean up OPP table initialization dt-bindings: memory: tegra20-emc: Document opp-supported-hw property soc: rockchip: io-domain: Fix error return code in rockchip_iodomain_probe() reset-controller: ti: force the write operation when assert or deassert ...
2020-12-17Merge branch 'stable/for-linus-5.11' of ↵Linus Torvalds1-2/+18
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb Pull swiotlb update from Konrad Rzeszutek Wilk: "A generic (but for right now engaged only with AMD SEV) mechanism to adjust a larger size SWIOTLB based on the total memory of the SEV guests which right now require the bounce buffer for interacting with the outside world. Normal knobs (swiotlb=XYZ) still work" * 'stable/for-linus-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: x86,swiotlb: Adjust SWIOTLB bounce buffer size for SEV guests
2020-12-17Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds3-67/+0
Pull rdma updates from Jason Gunthorpe: "A smaller set of patches, nothing stands out as being particularly major this cycle. The biggest item would be the new HIP09 HW support from HNS, otherwise it was pretty quiet for new work here: - Driver bug fixes and updates: bnxt_re, cxgb4, rxe, hns, i40iw, cxgb4, mlx4 and mlx5 - Bug fixes and polishing for the new rts ULP - Cleanup of uverbs checking for allowed driver operations - Use sysfs_emit all over the place - Lots of bug fixes and clarity improvements for hns - hip09 support for hns - NDR and 50/100Gb signaling rates - Remove dma_virt_ops and go back to using the IB DMA wrappers - mlx5 optimizations for contiguous DMA regions" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (147 commits) RDMA/cma: Don't overwrite sgid_attr after device is released RDMA/mlx5: Fix MR cache memory leak RDMA/rxe: Use acquire/release for memory ordering RDMA/hns: Simplify AEQE process for different types of queue RDMA/hns: Fix inaccurate prints RDMA/hns: Fix incorrect symbol types RDMA/hns: Clear redundant variable initialization RDMA/hns: Fix coding style issues RDMA/hns: Remove unnecessary access right set during INIT2INIT RDMA/hns: WARN_ON if get a reserved sl from users RDMA/hns: Avoid filling sl in high 3 bits of vlan_id RDMA/hns: Do shift on traffic class when using RoCEv2 RDMA/hns: Normalization the judgment of some features RDMA/hns: Limit the length of data copied between kernel and userspace RDMA/mlx4: Remove bogus dev_base_lock usage RDMA/uverbs: Fix incorrect variable type RDMA/core: Do not indicate device ready when device enablement fails RDMA/core: Clean up cq pool mechanism RDMA/core: Update kernel documentation for ib_create_named_qp() MAINTAINERS: SOFT-ROCE: Change Zhu Yanjun's email address ...
2020-12-16Merge tag 'for-5.11/block-2020-12-14' of git://git.kernel.dk/linux-blockLinus Torvalds1-136/+45
Pull block updates from Jens Axboe: "Another series of killing more code than what is being added, again thanks to Christoph's relentless cleanups and tech debt tackling. This contains: - blk-iocost improvements (Baolin Wang) - part0 iostat fix (Jeffle Xu) - Disable iopoll for split bios (Jeffle Xu) - block tracepoint cleanups (Christoph Hellwig) - Merging of struct block_device and hd_struct (Christoph Hellwig) - Rework/cleanup of how block device sizes are updated (Christoph Hellwig) - Simplification of gendisk lookup and removal of block device aliasing (Christoph Hellwig) - Block device ioctl cleanups (Christoph Hellwig) - Removal of bdget()/blkdev_get() as exported API (Christoph Hellwig) - Disk change rework, avoid ->revalidate_disk() (Christoph Hellwig) - sbitmap improvements (Pavel Begunkov) - Hybrid polling fix (Pavel Begunkov) - bvec iteration improvements (Pavel Begunkov) - Zone revalidation fixes (Damien Le Moal) - blk-throttle limit fix (Yu Kuai) - Various little fixes" * tag 'for-5.11/block-2020-12-14' of git://git.kernel.dk/linux-block: (126 commits) blk-mq: fix msec comment from micro to milli seconds blk-mq: update arg in comment of blk_mq_map_queue blk-mq: add helper allocating tagset->tags Revert "block: Fix a lockdep complaint triggered by request queue flushing" nvme-loop: use blk_mq_hctx_set_fq_lock_class to set loop's lock class blk-mq: add new API of blk_mq_hctx_set_fq_lock_class block: disable iopoll for split bio block: Improve blk_revalidate_disk_zones() checks sbitmap: simplify wrap check sbitmap: replace CAS with atomic and sbitmap: remove swap_lock sbitmap: optimise sbitmap_deferred_clear() blk-mq: skip hybrid polling if iopoll doesn't spin blk-iocost: Factor out the base vrate change into a separate function blk-iocost: Factor out the active iocgs' state check into a separate function blk-iocost: Move the usage ratio calculation to the correct place blk-iocost: Remove unnecessary advance declaration blk-iocost: Fix some typos in comments blktrace: fix up a kerneldoc comment block: remove the request_queue to argument request based tracepoints ...
2020-12-16Merge tag 'tif-task_work.arch-2020-12-14' of git://git.kernel.dk/linux-blockLinus Torvalds2-51/+1
Pull TIF_NOTIFY_SIGNAL updates from Jens Axboe: "This sits on top of of the core entry/exit and x86 entry branch from the tip tree, which contains the generic and x86 parts of this work. Here we convert the rest of the archs to support TIF_NOTIFY_SIGNAL. With that done, we can get rid of JOBCTL_TASK_WORK from task_work and signal.c, and also remove a deadlock work-around in io_uring around knowing that signal based task_work waking is invoked with the sighand wait queue head lock. The motivation for this work is to decouple signal notify based task_work, of which io_uring is a heavy user of, from sighand. The sighand lock becomes a huge contention point, particularly for threaded workloads where it's shared between threads. Even outside of threaded applications it's slower than it needs to be. Roman Gershman <romger@amazon.com> reported that his networked workload dropped from 1.6M QPS at 80% CPU to 1.0M QPS at 100% CPU after io_uring was changed to use TIF_NOTIFY_SIGNAL. The time was all spent hammering on the sighand lock, showing 57% of the CPU time there [1]. There are further cleanups possible on top of this. One example is TIF_PATCH_PENDING, where a patch already exists to use TIF_NOTIFY_SIGNAL instead. Hopefully this will also lead to more consolidation, but the work stands on its own as well" [1] https://github.com/axboe/liburing/issues/215 * tag 'tif-task_work.arch-2020-12-14' of git://git.kernel.dk/linux-block: (28 commits) io_uring: remove 'twa_signal_ok' deadlock work-around kernel: remove checking for TIF_NOTIFY_SIGNAL signal: kill JOBCTL_TASK_WORK io_uring: JOBCTL_TASK_WORK is no longer used by task_work task_work: remove legacy TWA_SIGNAL path sparc: add support for TIF_NOTIFY_SIGNAL riscv: add support for TIF_NOTIFY_SIGNAL nds32: add support for TIF_NOTIFY_SIGNAL ia64: add support for TIF_NOTIFY_SIGNAL h8300: add support for TIF_NOTIFY_SIGNAL c6x: add support for TIF_NOTIFY_SIGNAL alpha: add support for TIF_NOTIFY_SIGNAL xtensa: add support for TIF_NOTIFY_SIGNAL arm: add support for TIF_NOTIFY_SIGNAL microblaze: add support for TIF_NOTIFY_SIGNAL hexagon: add support for TIF_NOTIFY_SIGNAL csky: add support for TIF_NOTIFY_SIGNAL openrisc: add support for TIF_NOTIFY_SIGNAL sh: add support for TIF_NOTIFY_SIGNAL um: add support for TIF_NOTIFY_SIGNAL ...
2020-12-16Merge tag 'seccomp-v5.11-rc1' of ↵Linus Torvalds1-3/+293
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: "The major change here is finally gaining seccomp constant-action bitmaps, which internally reduces the seccomp overhead for many real-world syscall filters to O(1), as discussed at Plumbers this year. - Improve seccomp performance via constant-action bitmaps (YiFei Zhu & Kees Cook) - Fix bogus __user annotations (Jann Horn) - Add missed CONFIG for improved selftest coverage (Mickaël Salaün)" * tag 'seccomp-v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: selftests/seccomp: Update kernel config seccomp: Remove bogus __user annotations seccomp/cache: Report cache data through /proc/pid/seccomp_cache xtensa: Enable seccomp architecture tracking sh: Enable seccomp architecture tracking s390: Enable seccomp architecture tracking riscv: Enable seccomp architecture tracking powerpc: Enable seccomp architecture tracking parisc: Enable seccomp architecture tracking csky: Enable seccomp architecture tracking arm: Enable seccomp architecture tracking arm64: Enable seccomp architecture tracking selftests/seccomp: Compare bitmap vs filter overhead x86: Enable seccomp architecture tracking seccomp/cache: Add "emulator" to check if filter is constant allow seccomp/cache: Lookup syscall allowlist bitmap for fast path
2020-12-16Merge tag 'audit-pr-20201214' of ↵Linus Torvalds2-29/+18
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "A small set of audit patches for v5.11 with four patches in total and only one of any real significance. Richard's patch to trigger accompanying records causes the kernel to emit additional related records when an audit event occurs; helping provide some much needed context to events in the audit log. It is also worth mentioning that this is a revised patch based on an earlier attempt that had to be reverted in the v5.8 time frame. Everything passes our test suite, and with no problems reported please merge this for v5.11" * tag 'audit-pr-20201214' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: replace atomic_add_return() audit: fix macros warnings audit: trigger accompanying records when no rules present audit: fix a kernel-doc markup
2020-12-16Merge tag 'printk-for-5.11' of ↵Linus Torvalds2-119/+170
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Finally allow parallel writes and reads into/from the lockless ringbuffer. But it is not a complete solution. Readers are still serialized against each other. And nested writes are still prevented by printk_safe per-CPU buffers. - Use ttynull as the ultimate fallback for /dev/console. - Officially allow disabling console output by using console="" or console=null - A few code cleanups * tag 'printk-for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: remove logbuf_lock writer-protection of ringbuffer printk: inline log_output(),log_store() in vprintk_store() printk: remove obsolete dead assignment printk/console: Allow to disable console output by using console="" or console=null init/console: Use ttynull as a fallback when there is no console printk: ringbuffer: Reference text_data_ring directly in callees.
2020-12-16Merge tag 'asm-generic-timers-5.11' of ↵Linus Torvalds7-60/+48
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic cross-architecture timer cleanup from Arnd Bergmann: "This cleans up two ancient timer features that were never completed in the past, CONFIG_GENERIC_CLOCKEVENTS and CONFIG_ARCH_USES_GETTIMEOFFSET. There was only one user left for the ARCH_USES_GETTIMEOFFSET variant of clocksource implementations, the ARM EBSA110 platform. Rather than changing to use modern timekeeping, we remove the platform entirely as Russell no longer uses his machine and nobody else seems to have one any more. The conditional code for using arch_gettimeoffset() is removed as a result. For CONFIG_GENERIC_CLOCKEVENTS, there are still a couple of platforms not using clockevent drivers: parisc, ia64, most of m68k, and one Arm platform. These all do timer ticks slighly differently, and this gets cleaned up to the point they at least all call the same helper function. Instead of most platforms using 'select GENERIC_CLOCKEVENTS' in Kconfig, the polarity is now reversed, with the few remaining ones selecting LEGACY_TIMER_TICK instead" * tag 'asm-generic-timers-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: timekeeping: default GENERIC_CLOCKEVENTS to enabled timekeeping: remove xtime_update m68k: remove timer_interrupt() function m68k: change remaining timers to legacy_timer_tick m68k: m68328: use legacy_timer_tick() m68k: sun3/sun3c: use legacy_timer_tick m68k: split heartbeat out of timer function m68k: coldfire: use legacy_timer_tick() parisc: use legacy_timer_tick ARM: rpc: use legacy_timer_tick ia64: convert to legacy_timer_tick timekeeping: add CONFIG_LEGACY_TIMER_TICK timekeeping: remove arch_gettimeoffset net: remove am79c961a driver ARM: remove ebsa110 platform
2020-12-16Merge branch 'akpm' (patches from Andrew)Linus Torvalds9-135/+271
Merge yet more updates from Andrew Morton: - lots of little subsystems - a few post-linux-next MM material. Most of the rest awaits more merging of other trees. Subsystems affected by this series: alpha, procfs, misc, core-kernel, bitmap, lib, lz4, checkpatch, nilfs, kdump, rapidio, gcov, bfs, relay, resource, ubsan, reboot, fault-injection, lzo, apparmor, and mm (swap, memory-hotplug, pagemap, cleanups, and gup). * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (86 commits) mm: fix some spelling mistakes in comments mm: simplify follow_pte{,pmd} mm: unexport follow_pte_pmd apparmor: remove duplicate macro list_entry_is_head() lib/lzo/lzo1x_compress.c: make lzogeneric1x_1_compress() static fault-injection: handle EI_ETYPE_TRUE reboot: hide from sysfs not applicable settings reboot: allow to override reboot type if quirks are found reboot: remove cf9_safe from allowed types and rename cf9_force reboot: allow to specify reboot mode via sysfs reboot: refactor and comment the cpu selection code lib/ubsan.c: mark type_check_kinds with static keyword kcov: don't instrument with UBSAN ubsan: expand tests and reporting ubsan: remove UBSAN_MISC in favor of individual options ubsan: enable for all*config builds ubsan: disable UBSAN_TRAP for all*config ubsan: disable object-size sanitizer under GCC ubsan: move cc-option tests into Kconfig ubsan: remove redundant -Wno-maybe-uninitialized ...
2020-12-16fault-injection: handle EI_ETYPE_TRUEBarnabás Pőcze1-3/+3
Commit af3b854492f351d1 ("mm/page_alloc.c: allow error injection") introduced EI_ETYPE_TRUE, but did not extend * lib/error-inject.c:error_type_string(), and * kernel/fail_function.c:adjust_error_retval() to accommodate for this change. Handle EI_ETYPE_TRUE in both functions appropriately by * returning "TRUE" in error_type_string(), * adjusting the return value to true (1) in adjust_error_retval(). Furthermore, simplify the logic of handling EI_ETYPE_NULL in adjust_error_retval(). Link: https://lkml.kernel.org/r/njB1czX0ZgWPR9h61euHIBb5bEyePw9D4D2m3i5lc9Cl96P8Q1308dTcmsEZW7Vtz3Ifz4do-rOtSfuFTyGoEDYokkK2aUqBePVptzZEWfU=@protonmail.com Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16reboot: hide from sysfs not applicable settingsMatteo Croce1-23/+31
Not all the reboot settings from both the kernel command line or sysfs interface are available to all platforms. Filter out reboot_type and reboot_force which are x86 only, and also remove reboot_cpu on kernels without SMP support. This saves some space, and avoid confusing the user with settings which will have no effect. Link: https://lkml.kernel.org/r/20201130173717.198952-3-mcroce@linux.microsoft.com Signed-off-by: Matteo Croce <mcroce@microsoft.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16reboot: allow to override reboot type if quirks are foundMatteo Croce1-0/+6
Patch series "reboot: sysfs improvements". Some improvements to the sysfs reboot interface: hide not working settings and support machines with known reboot quirks. This patch (of 2): On some machines a quirk can force a specific reboot type. Quirks are found during a DMI scan, the list of machines which need special reboot handling is defined in reboot_dmi_table. The kernel command line reboot= option overrides this via a global variable `reboot_default`, so that the reboot type requested in the command line is really performed. This was not true when setting the reboot type via the new sysfs interface. Fix this by setting reboot_default upon the first change, like reboot_setup() does for the command line. Link: https://lkml.kernel.org/r/20201130173717.198952-1-mcroce@linux.microsoft.com Link: https://lkml.kernel.org/r/20201130173717.198952-2-mcroce@linux.microsoft.com Signed-off-by: Matteo Croce <mcroce@microsoft.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16reboot: remove cf9_safe from allowed types and rename cf9_forceMatteo Croce1-9/+3
BOOT_CF9_SAFE_STR is an internal value used only by the x86 code and it's not possible to set it from userspace. Remove it, and rename 'cf9_force' to 'pci', so to make it coherent with the kernel command line reboot= option. Tested with this script: cd /sys/kernel/reboot/ for i in cold warm hard soft gpio; do echo $i >mode read j <mode [ $i = $j ] || echo "mode $i != $j" done for i in bios acpi kbd triple efi pci; do echo $i >type read j <type [ $i = $j ] || echo "type $i != $j" done for i in $(seq 0 $(nproc --ignore=1)); do echo $i >cpu read j <cpu [ $i = $j ] || echo "cpu $i != $j" done for i in 0 1; do echo $i >force read j <force [ $i = $j ] || echo "force $i != $j" done Link: https://lkml.kernel.org/r/20201113015900.543923-1-mcroce@linux.microsoft.com Fixes: eab8da48579d ("reboot: allow to specify reboot mode via sysfs") Signed-off-by: Matteo Croce <mcroce@microsoft.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Kees Cook <keescook@chromium.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16reboot: allow to specify reboot mode via sysfsMatteo Croce1-0/+206
The kernel cmdline reboot= option offers some sort of control on how the reboot is issued. We don't always know in advance what type of reboot to perform. Sometimes a warm reboot is preferred to persist certain memory regions across the reboot. Others a cold one is needed to apply a future system update that makes a memory memory model change, like changing the base page size or resizing a persistent memory region. Or simply we want to enable reboot_force because we noticed that something bad happened. Add handles in sysfs to allow setting these reboot options, so they can be changed when the system is booted, other than at boot time. The handlers are under <sysfs>/kernel/reboot, can be read to get the current configuration and written to alter it. # cd /sys/kernel/reboot/ # grep . * cpu:0 force:0 mode:cold type:acpi # echo 2 >cpu # echo yes >force # echo soft >mode # echo bios >type # grep . * cpu:2 force:1 mode:soft type:bios Before setting anything, check for CAP_SYS_BOOT capability, so it's possible to allow an unpriviledged process to change these settings simply by relaxing the handles permissions, without opening them to the world. [natechancellor@gmail.com: fix variable assignments in type_store] Link: https://lkml.kernel.org/r/20201112035023.974748-1-natechancellor@gmail.com Link: https://github.com/ClangBuiltLinux/linux/issues/1197 Link: https://lkml.kernel.org/r/20201110202746.9690-1-mcroce@linux.microsoft.com Signed-off-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Kees Cook <keescook@chromium.org> Cc: Tyler Hicks <tyhicks@linux.microsoft.com> Cc: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16reboot: refactor and comment the cpu selection codeMatteo Croce1-13/+17
Small improvements to the code, without changing the way it works: - use a local variable, to avoid a small time lapse where reboot_cpu can have an invalid value - comment the code which is not easy to understand at a glance - merge two identical code blocks into one - replace pointer arithmetics with equivalent array syntax Link: https://lkml.kernel.org/r/20201103214025.116799-4-mcroce@linux.microsoft.com Signed-off-by: Matteo Croce <mcroce@microsoft.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Fabian Frederick <fabf@skynet.be> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Kees Cook <keescook@chromium.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Robin Holt <robinmholt@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16kcov: don't instrument with UBSANDmitry Vyukov1-0/+3
Both KCOV and UBSAN use compiler instrumentation. If UBSAN detects a bug in KCOV, it may cause infinite recursion via printk and other common functions. We already don't instrument KCOV with KASAN/KCSAN for this reason, don't instrument it with UBSAN as well. As a side effect this also resolves the following gcc warning: conflicting types for built-in function '__sanitizer_cov_trace_switch'; expected 'void(long unsigned int, void *)' [-Wbuiltin-declaration-mismatch] It's only reported when kcov.c is compiled with any of the sanitizers enabled. Size of the arguments is correct, it's just that gcc uses 'long' on 64-bit arches and 'long long' on 32-bit arches, while kernel type is always 'long long'. Link: https://lkml.kernel.org/r/20201209100152.2492072-1-dvyukov@google.com Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Suggested-by: Marco Elver <elver@google.com> Acked-by: Marco Elver <elver@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16kernel/resource.c: fix kernel-doc markupsMauro Carvalho Chehab1-10/+14
Kernel-doc markups should use this format: identifier - description While here, fix a kernel-doc tag that was using, instead, a normal comment block. [akpm@linux-foundation.org: coding style fixes] Link: https://lkml.kernel.org/r/c5e38e1070f8dbe2f9607a10b44afe2875bd966c.1605521731.git.mchehab+huawei@kernel.org Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: "Jonathan Corbet" <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16blktrace: make relay callbacks constJani Nikula1-1/+1
Now that relay_open() accepts const callbacks, make relay callbacks const. Link: https://lkml.kernel.org/r/7ff5ce0b735901eb4f10e13da2704f1d8c4a2507.1606153547.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16relay: allow the use of const callback structsJani Nikula1-27/+10
None of the relay users require the use of mutable structs for callbacks, however the relay code does. Instead of assigning the default callback for subbuf_start, add a wrapper to conditionally call the client callback if available, and fall back to default behaviour otherwise. This lets all relay users make their struct rchan_callbacks const data. [jani.nikula@intel.com: cleanups, per Christoph] Link: https://lkml.kernel.org/r/20201124115412.32402-1-jani.nikula@intel.com Link: https://lkml.kernel.org/r/cc3ff292e4eb4fdc56bee3d690c7b8e39209cd37.1606153547.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16relay: make create_buf_file and remove_buf_file callbacks mandatoryJani Nikula1-25/+1
All clients provide create_buf_file and remove_buf_file callbacks, and they're required for relay to make sense. There is no point in them being optional. Also document whether each callback is mandatory/optional. Link: https://lkml.kernel.org/r/88003c1527386b93036e286e7917f1e33aec84ac.1606153547.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16relay: require non-NULL callbacks in relay_open()Jani Nikula1-12/+2
There are no clients passing NULL callbacks, which makes sense as it wouldn't even create a file. Require non-NULL callbacks, and throw away the handling for NULL callbacks. Link: https://lkml.kernel.org/r/e40642f3b027d2bb6bc851ddb60e0a61ea51f5f8.1606153547.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16relay: remove unused buf_mapped and buf_unmapped callbacksJani Nikula1-34/+0
Patch series "relay: cleanup and const callbacks", v2. None of the relay users require the use of mutable structs for callbacks, however the relay code does. Instead of assigning default callbacks when there is none, add callback wrappers to conditionally call the client callbacks if available, and fall back to default behaviour (typically no-op) otherwise. This lets all relay users make their struct rchan_callbacks const data. This series starts with a number of cleanups first based on Christoph's feedback. This patch (of 9): No relay client uses the buf_mapped or buf_unmapped callbacks. Remove them. This makes relay's vm_operations_struct close callback a dummy, remove it as well. Link: https://lkml.kernel.org/r/cover.1606153547.git.jani.nikula@intel.com Link: https://lkml.kernel.org/r/c69fff6e0cd485563604240bbfcc028434983bec.1606153547.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16gcov: fix kernel-doc markup issueAlex Shi1-3/+3
Fix the following kernel-doc issue in gcov: kernel/gcov/gcc_4_7.c:238: warning: Function parameter or member 'dst' not described in 'gcov_info_add' kernel/gcov/gcc_4_7.c:238: warning: Function parameter or member 'src' not described in 'gcov_info_add' kernel/gcov/gcc_4_7.c:238: warning: Excess function parameter 'dest' description in 'gcov_info_add' kernel/gcov/gcc_4_7.c:238: warning: Excess function parameter 'source' description in 'gcov_info_add' Link: https://lkml.kernel.org/r/1605252352-63983-1-git-send-email-alex.shi@linux.alibaba.com Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16gcov: remove support for GCC < 4.9Nick Desaulniers1-3/+1
Since commit 0bddd227f3dc ("Documentation: update for gcc 4.9 requirement") the minimum supported version of GCC is gcc-4.9. It's now safe to remove this code. Similar to commit 10415533a906 ("gcov: Remove old GCC 3.4 support") but that was for GCC 4.8 and this is for GCC 4.9. Link: https://github.com/ClangBuiltLinux/linux/issues/427 Link: https://lkml.kernel.org/r/20201111030557.2015680-1-ndesaulniers@google.com Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16kdump: append uts_namespace.name offset to VMCOREINFOAlexander Egorenkov1-0/+1
The offset of the field 'init_uts_ns.name' has changed since commit 9a56493f6942 ("uts: Use generic ns_common::count"). Make the offset of the field 'uts_namespace.name' available in VMCOREINFO because tools like 'crash-utility' and 'makedumpfile' must be able to read it from crash dumps. Link: https://lore.kernel.org/r/159644978167.604812.1773586504374412107.stgit@localhost.localdomain Link: https://lkml.kernel.org/r/20200930102328.396488-1-egorenar@linux.ibm.com Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com> Acked-by: lijiang <lijiang@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Kees Cook <keescook@chromium.org> Cc: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16kernel/acct.c: use #elif instead of #end and #elifHui Su1-5/+2
Cleanup: use #elif instead of #end and #elif. Link: https://lkml.kernel.org/r/20201015150736.GA91603@rlk Signed-off-by: Hui Su <sh_def@163.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-16Merge branch 'exec-update-lock-for-v5.11' of ↵Linus Torvalds4-26/+26
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull exec-update-lock update from Eric Biederman: "The key point of this is to transform exec_update_mutex into a rw_semaphore so readers can be separated from writers. This makes it easier to understand what the holders of the lock are doing, and makes it harder to contend or deadlock on the lock. The real deadlock fix wound up in perf_event_open" * 'exec-update-lock-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: exec: Transform exec_update_mutex into a rw_semaphore
2020-12-16Merge branch 'exec-for-v5.11' of ↵Linus Torvalds4-72/+22
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull execve updates from Eric Biederman: "This set of changes ultimately fixes the interaction of posix file lock and exec. Fundamentally most of the change is just moving where unshare_files is called during exec, and tweaking the users of files_struct so that the count of files_struct is not unnecessarily played with. Along the way fcheck and related helpers were renamed to more accurately reflect what they do. There were also many other small changes that fell out, as this is the first time in a long time much of this code has been touched. Benchmarks haven't turned up any practical issues but Al Viro has observed a possibility for a lot of pounding on task_lock. So I have some changes in progress to convert put_files_struct to always rcu free files_struct. That wasn't ready for the merge window so that will have to wait until next time" * 'exec-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits) exec: Move io_uring_task_cancel after the point of no return coredump: Document coredump code exclusively used by cell spufs file: Remove get_files_struct file: Rename __close_fd_get_file close_fd_get_file file: Replace ksys_close with close_fd file: Rename __close_fd to close_fd and remove the files parameter file: Merge __alloc_fd into alloc_fd file: In f_dupfd read RLIMIT_NOFILE once. file: Merge __fd_install into fd_install proc/fd: In fdinfo seq_show don't use get_files_struct bpf/task_iter: In task_file_seq_get_next use task_lookup_next_fd_rcu proc/fd: In proc_readfd_common use task_lookup_next_fd_rcu file: Implement task_lookup_next_fd_rcu kcmp: In get_file_raw_ptr use task_lookup_fd_rcu proc/fd: In tid_fd_mode use task_lookup_fd_rcu file: Implement task_lookup_fd_rcu file: Rename fcheck lookup_fd_rcu file: Replace fcheck_files with files_lookup_fd_rcu file: Factor files_lookup_fd_locked out of fcheck_files file: Rename __fcheck_files to files_lookup_fd_raw ...
2020-12-16Merge tag 'acpi-5.11-rc1' of ↵Linus Torvalds3-5/+158
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These update the ACPICA code in the kernel to upstream revision 20201113, fix and clean up some resources manipulation code, extend the enumeration and gpio-line-names property documentation, clean up the handling of _DEP during device enumeration, add a new backlight DMI quirk, clean up transaction handling in the EC driver and make some assorted janitorial changes. Specifics: - Update ACPICA code in the kernel to upstream revision 20201113 with changes as follows: * Add 5 new UUIDs to the known UUID table (Bob Moore) * Remove extreaneous "the" in comments (Colin Ian King) * Add function trace macros to improve debugging (Erik Kaneda) * Fix interpreter memory leak (Erik Kaneda) * Handle "orphan" _REG for GPIO OpRegions (Hans de Goede) - Introduce resource_union() and resource_intersection() helpers and clean up some resource-manipulation code with the help of them (Andy Shevchenko) - Revert problematic commit related to the handling of resources in the ACPI core (Daniel Scally) - Extend the ACPI device enumeration documentation and the gpio-line-names _DSD property documentation, clean up the latter (Flavio Suligoi) - Clean up _DEP handling during device enumeration, modify the list of _DEP exceptions and the handling of it and fix up terminology related to _DEP (Hans de Goede, Rafael Wysocki) - Eliminate in_interrupt() usage from the ACPI EC driver (Sebastian Andrzej Siewior) - Clean up the advance_transaction() routine and related code in the ACPI EC driver (Rafael Wysocki) - Add new backlight quirk for GIGABYTE GB-BXBT-2807 (Jasper St Pierre) - Make assorted janitorial changes in several ACPI-related pieces of code (Hanjun Guo, Jason Yan, Punit Agrawal)" * tag 'acpi-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (40 commits) ACPI: scan: Fix up _DEP-related terminology with supplier/consumer ACPI: scan: Drop INT3396 from acpi_ignore_dep_ids[] ACPI: video: Add DMI quirk for GIGABYTE GB-BXBT-2807 Revert "ACPI / resources: Use AE_CTRL_TERMINATE to terminate resources walks" ACPI: scan: Add PNP0D80 to the _DEP exceptions list ACPI: scan: Call acpi_get_object_info() from acpi_add_single_object() ACPI: scan: Add acpi_info_matches_hids() helper ACPICA: Update version to 20201113 ACPICA: Interpreter: fix memory leak by using existing buffer ACPICA: Add function trace macros to improve debugging ACPICA: Also handle "orphan" _REG methods for GPIO OpRegions ACPICA: Remove extreaneous "the" in comments ACPICA: Add 5 new UUIDs to the known UUID table resource: provide meaningful MODULE_LICENSE() in test suite ASoC: Intel: catpt: Replace open coded variant of resource_intersection() ACPI: processor: Drop duplicate setting of shared_cpu_map ACPI: EC: Clean up status flags checks in advance_transaction() ACPI: EC: Untangle error handling in advance_transaction() ACPI: EC: Simplify error handling in advance_transaction() ACPI: EC: Rename acpi_ec_is_gpe_raised() ...
2020-12-16Merge tag 'pm-5.11-rc1' of ↵Linus Torvalds4-7/+31
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These update cpufreq (core and drivers), cpuidle (polling state implementation and the PSCI driver), the OPP (operating performance points) framework, devfreq (core and drivers), the power capping RAPL (Running Average Power Limit) driver, the Energy Model support, the generic power domains (genpd) framework, the ACPI device power management, the core system-wide suspend code and power management utilities. Specifics: - Use local_clock() instead of jiffies in the cpufreq statistics to improve accuracy (Viresh Kumar). - Fix up OPP usage in the cpufreq-dt and qcom-cpufreq-nvmem cpufreq drivers (Viresh Kumar). - Clean up the cpufreq core, the intel_pstate driver and the schedutil cpufreq governor (Rafael Wysocki). - Fix up error code paths in the sti-cpufreq and mediatek cpufreq drivers (Yangtao Li, Qinglang Miao). - Fix cpufreq_online() to return error codes instead of success (0) in all cases when it fails (Wang ShaoBo). - Add mt8167 support to the mediatek cpufreq driver and blacklist mt8516 in the cpufreq-dt-platdev driver (Fabien Parent). - Modify the tegra194 cpufreq driver to always return values from the frequency table as the current frequency and clean up that driver (Sumit Gupta, Jon Hunter). - Modify the arm_scmi cpufreq driver to allow it to discover the power scale present in the performance protocol and provide this information to the Energy Model (Lukasz Luba). - Add missing MODULE_DEVICE_TABLE to several cpufreq drivers (Pali Rohár). - Clean up the CPPC cpufreq driver (Ionela Voinescu). - Fix NVMEM_IMX_OCOTP dependency in the imx cpufreq driver (Arnd Bergmann). - Rework the poling interval selection for the polling state in cpuidle (Mel Gorman). - Enable suspend-to-idle for PSCI OSI mode in the PSCI cpuidle driver (Ulf Hansson). - Modify the OPP framework to support empty (node-less) OPP tables in DT for passing dependency information (Nicola Mazzucato). - Fix potential lockdep issue in the OPP core and clean up the OPP core (Viresh Kumar). - Modify dev_pm_opp_put_regulators() to accept a NULL argument and update its users accordingly (Viresh Kumar). - Add frequency changes tracepoint to devfreq (Matthias Kaehlcke). - Add support for governor feature flags to devfreq, make devfreq sysfs file permissions depend on the governor and clean up the devfreq core (Chanwoo Choi). - Clean up the tegra20 devfreq driver and deprecate it to allow another driver based on EMC_STAT to be used instead of it (Dmitry Osipenko). - Add interconnect support to the tegra30 devfreq driver, allow it to take the interconnect and OPP information from DT and clean it up (Dmitry Osipenko). - Add interconnect support to the exynos-bus devfreq driver along with interconnect properties documentation (Sylwester Nawrocki). - Add suport for AMD Fam17h and Fam19h processors to the RAPL power capping driver (Victor Ding, Kim Phillips). - Fix handling of overly long constraint names in the powercap framework (Lukasz Luba). - Fix the wakeup configuration handling for bridges in the ACPI device power management core (Rafael Wysocki). - Add support for using an abstract scale for power units in the Energy Model (EM) and document it (Lukasz Luba). - Add em_cpu_energy() micro-optimization to the EM (Pavankumar Kondeti). - Modify the generic power domains (genpd) framwework to support suspend-to-idle (Ulf Hansson). - Fix creation of debugfs nodes in genpd (Thierry Strudel). - Clean up genpd (Lina Iyer). - Clean up the core system-wide suspend code and make it print driver flags for devices with debug enabled (Alex Shi, Patrice Chotard, Chen Yu). - Modify the ACPI system reboot code to make it prepare for system power off to avoid confusing the platform firmware (Kai-Heng Feng). - Update the pm-graph (multiple changes, mostly usability-related) and cpupower (online and offline CPU information support) PM utilities (Todd Brandt, Brahadambal Srinivasan)" * tag 'pm-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (86 commits) cpufreq: Fix cpufreq_online() return value on errors cpufreq: Fix up several kerneldoc comments cpufreq: stats: Use local_clock() instead of jiffies cpufreq: schedutil: Simplify sugov_update_next_freq() cpufreq: intel_pstate: Simplify intel_cpufreq_update_pstate() PM: domains: create debugfs nodes when adding power domains opp: of: Allow empty opp-table with opp-shared dt-bindings: opp: Allow empty OPP tables media: venus: dev_pm_opp_put_*() accepts NULL argument drm/panfrost: dev_pm_opp_put_*() accepts NULL argument drm/lima: dev_pm_opp_put_*() accepts NULL argument PM / devfreq: exynos: dev_pm_opp_put_*() accepts NULL argument cpufreq: qcom-cpufreq-nvmem: dev_pm_opp_put_*() accepts NULL argument cpufreq: dt: dev_pm_opp_put_regulators() accepts NULL argument opp: Allow dev_pm_opp_put_*() APIs to accept NULL opp_table opp: Don't create an OPP table from dev_pm_opp_get_opp_table() cpufreq: dt: Don't (ab)use dev_pm_opp_get_opp_table() to create OPP table opp: Reduce the size of critical section in _opp_kref_release() PM / EM: Micro optimization in em_cpu_energy cpufreq: arm_scmi: Discover the power scale in performance protocol ...
2020-12-16Merge branch 'for-linus' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: - support for inhibiting input devices at request from userspace. If a device implements open/close methods, it can also put device into low power state. This is needed, for example, to disable keyboard and touchpad on convertibles when they are transitioned into tablet mode - now that ordinary input devices can be configured for polling mode, dedicated input polling device implementation has been removed - GTCO tablet driver has been removed, as it used problematic custom HID parser, devices are EOL, and there is no interest from the manufacturer - a new driver for Dialog DA7280 haptic chips has been introduced - a new driver for power button on Dell Wyse 3020 - support for eKTF2132 in ektf2127 driver - support for SC2721 and SC2730 in sc27xx-vibra driver - enhancements for Atmel touchscreens, AD7846 touchscreens, Elan touchpads, ADP5589, ST1232 touchscreen, TM2 touchkey drivers - fixes and cleanups to allow clean builds with W=1 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (86 commits) Input: da7280 - fix spelling mistake "sequemce" -> "sequence" Input: cyapa_gen6 - fix out-of-bounds stack access Input: sc27xx - add support for sc2730 and sc2721 dt-bindings: input: Add compatible string for SC2721 and SC2730 dt-bindings: input: Convert sc27xx-vibra.txt to json-schema Input: stmpe - add axis inversion and swapping capability Input: adp5589-keys - do not explicitly control IRQ for wakeup Input: adp5589-keys - do not unconditionally configure as wakeup source Input: ipx4xx-beeper - convert comma to semicolon Input: parkbd - convert comma to semicolon Input: new da7280 haptic driver dt-bindings: input: Add document bindings for DA7280 MAINTAINERS: da7280 updates to the Dialog Semiconductor search terms Input: elantech - fix protocol errors for some trackpoints in SMBus mode Input: elan_i2c - add new trackpoint report type 0x5F Input: elants - document some registers and values Input: atmel_mxt_ts - simplify the return expression of mxt_send_bootloader_cmd() Input: imx_keypad - add COMPILE_TEST support Input: applespi - use new structure for SPI transfer delays Input: synaptics-rmi4 - use new structure for SPI transfer delays ...
2020-12-16Merge tag 'irq-core-2020-12-15' of ↵Linus Torvalds8-176/+176
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Generic interrupt and irqchips subsystem updates. Unusually, there is not a single completely new irq chip driver, just new DT bindings and extensions of existing drivers to accomodate new variants! Core: - Consolidation and robustness changes for irq time accounting - Cleanup and consolidation of irq stats - Remove the fasteoi IPI flow which has been proved useless - Provide an interface for converting legacy interrupt mechanism into irqdomains Drivers: - Preliminary support for managed interrupts on platform devices - Correctly identify allocation of MSIs proxyied by another device - Generalise the Ocelot support to new SoCs - Improve GICv4.1 vcpu entry, matching the corresponding KVM optimisation - Work around spurious interrupts on Qualcomm PDC - Random fixes and cleanups" * tag 'irq-core-2020-12-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits) irqchip/qcom-pdc: Fix phantom irq when changing between rising/falling driver core: platform: Add devm_platform_get_irqs_affinity() ACPI: Drop acpi_dev_irqresource_disabled() resource: Add irqresource_disabled() genirq/affinity: Add irq_update_affinity_desc() irqchip/gic-v3-its: Flag device allocation as proxied if behind a PCI bridge irqchip/gic-v3-its: Tag ITS device as shared if allocating for a proxy device platform-msi: Track shared domain allocation irqchip/ti-sci-intr: Fix freeing of irqs irqchip/ti-sci-inta: Fix printing of inta id on probe success drivers/irqchip: Remove EZChip NPS interrupt controller Revert "genirq: Add fasteoi IPI flow" irqchip/hip04: Make IPIs use handle_percpu_devid_irq() irqchip/bcm2836: Make IPIs use handle_percpu_devid_irq() irqchip/armada-370-xp: Make IPIs use handle_percpu_devid_irq() irqchip/gic, gic-v3: Make SGIs use handle_percpu_devid_irq() irqchip/ocelot: Add support for Jaguar2 platforms irqchip/ocelot: Add support for Serval platforms irqchip/ocelot: Add support for Luton platforms irqchip/ocelot: prepare to support more SoC ...
2020-12-16Merge tag 'driver-core-5.11-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big driver core updates for 5.11-rc1 This time there was a lot of different work happening here for some reason: - redo of the fwnode link logic, speeding it up greatly - auxiliary bus added (this was a tag that will be pulled in from other trees/maintainers this merge window as well, as driver subsystems started to rely on it) - platform driver core cleanups on the way to fixing some long-time api updates in future releases - minor fixes and tweaks. All have been in linux-next with no (finally) reported issues. Testing there did helped in shaking issues out a lot :)" * tag 'driver-core-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (39 commits) driver core: platform: don't oops in platform_shutdown() on unbound devices ACPI: Use fwnode_init() to set up fwnode misc: pvpanic: Replace OF headers by mod_devicetable.h misc: pvpanic: Combine ACPI and platform drivers usb: host: sl811: Switch to use platform_get_mem_or_io() vfio: platform: Switch to use platform_get_mem_or_io() driver core: platform: Introduce platform_get_mem_or_io() dyndbg: fix use before null check soc: fix comment for freeing soc_dev_attr driver core: platform: use bus_type functions driver core: platform: change logic implementing platform_driver_probe driver core: platform: reorder functions driver core: make driver_probe_device() static driver core: Fix a couple of typos driver core: Reorder devices on successful probe driver core: Delete pointless parameter in fwnode_operations.add_links driver core: Refactor fw_devlink feature efi: Update implementation of add_links() to create fwnode links of: property: Update implementation of add_links() to create fwnode links driver core: Use device's fwnode to check if it is waiting for suppliers ...
2020-12-16Merge tag 'net-next-5.11' of ↵Linus Torvalds28-736/+1616
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - support "prefer busy polling" NAPI operation mode, where we defer softirq for some time expecting applications to periodically busy poll - AF_XDP: improve efficiency by more batching and hindering the adjacency cache prefetcher - af_packet: make packet_fanout.arr size configurable up to 64K - tcp: optimize TCP zero copy receive in presence of partial or unaligned reads making zero copy a performance win for much smaller messages - XDP: add bulk APIs for returning / freeing frames - sched: support fragmenting IP packets as they come out of conntrack - net: allow virtual netdevs to forward UDP L4 and fraglist GSO skbs BPF: - BPF switch from crude rlimit-based to memcg-based memory accounting - BPF type format information for kernel modules and related tracing enhancements - BPF implement task local storage for BPF LSM - allow the FENTRY/FEXIT/RAW_TP tracing programs to use bpf_sk_storage Protocols: - mptcp: improve multiple xmit streams support, memory accounting and many smaller improvements - TLS: support CHACHA20-POLY1305 cipher - seg6: add support for SRv6 End.DT4/DT6 behavior - sctp: Implement RFC 6951: UDP Encapsulation of SCTP - ppp_generic: add ability to bridge channels directly - bridge: Connectivity Fault Management (CFM) support as is defined in IEEE 802.1Q section 12.14. Drivers: - mlx5: make use of the new auxiliary bus to organize the driver internals - mlx5: more accurate port TX timestamping support - mlxsw: - improve the efficiency of offloaded next hop updates by using the new nexthop object API - support blackhole nexthops - support IEEE 802.1ad (Q-in-Q) bridging - rtw88: major bluetooth co-existance improvements - iwlwifi: support new 6 GHz frequency band - ath11k: Fast Initial Link Setup (FILS) - mt7915: dual band concurrent (DBDC) support - net: ipa: add basic support for IPA v4.5 Refactor: - a few pieces of in_interrupt() cleanup work from Sebastian Andrzej Siewior - phy: add support for shared interrupts; get rid of multiple driver APIs and have the drivers write a full IRQ handler, slight growth of driver code should be compensated by the simpler API which also allows shared IRQs - add common code for handling netdev per-cpu counters - move TX packet re-allocation from Ethernet switch tag drivers to a central place - improve efficiency and rename nla_strlcpy - number of W=1 warning cleanups as we now catch those in a patchwork build bot Old code removal: - wan: delete the DLCI / SDLA drivers - wimax: move to staging - wifi: remove old WDS wifi bridging support" * tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1922 commits) net: hns3: fix expression that is currently always true net: fix proc_fs init handling in af_packet and tls nfc: pn533: convert comma to semicolon af_vsock: Assign the vsock transport considering the vsock address flags af_vsock: Set VMADDR_FLAG_TO_HOST flag on the receive path vsock_addr: Check for supported flag values vm_sockets: Add VMADDR_FLAG_TO_HOST vsock flag vm_sockets: Add flags field in the vsock address data structure net: Disable NETIF_F_HW_TLS_TX when HW_CSUM is disabled tcp: Add logic to check for SYN w/ data in tcp_simple_retransmit net: mscc: ocelot: install MAC addresses in .ndo_set_rx_mode from process context nfc: s3fwrn5: Release the nfc firmware net: vxget: clean up sparse warnings mlxsw: spectrum_router: Use eXtended mezzanine to offload IPv4 router mlxsw: spectrum: Set KVH XLT cache mode for Spectrum2/3 mlxsw: spectrum_router_xm: Introduce basic XM cache flushing mlxsw: reg: Add Router LPM Cache Enable Register mlxsw: reg: Add Router LPM Cache ML Delete Register mlxsw: spectrum_router_xm: Implement L-value tracking for M-index mlxsw: reg: Add XM Router M Table Register ...
2020-12-15Merge branch 'akpm' (patches from Andrew)Linus Torvalds8-22/+83
Merge misc updates from Andrew Morton: - a few random little subsystems - almost all of the MM patches which are staged ahead of linux-next material. I'll trickle to post-linux-next work in as the dependents get merged up. Subsystems affected by this patch series: kthread, kbuild, ide, ntfs, ocfs2, arch, and mm (slab-generic, slab, slub, dax, debug, pagecache, gup, swap, shmem, memcg, pagemap, mremap, hmm, vmalloc, documentation, kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction, oom-kill, migration, cma, page-poison, userfaultfd, zswap, zsmalloc, uaccess, zram, and cleanups). * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (200 commits) mm: cleanup kstrto*() usage mm: fix fall-through warnings for Clang mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at mm: shmem: convert shmem_enabled_show to use sysfs_emit_at mm:backing-dev: use sysfs_emit in macro defining functions mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening mm: use sysfs_emit for struct kobject * uses mm: fix kernel-doc markups zram: break the strict dependency from lzo zram: add stat to gather incompressible pages since zram set up zram: support page writeback mm/process_vm_access: remove redundant initialization of iov_r mm/zsmalloc.c: rework the list_add code in insert_zspage() mm/zswap: move to use crypto_acomp API for hardware acceleration mm/zswap: fix passing zero to 'PTR_ERR' warning mm/zswap: make struct kernel_param_ops definitions const userfaultfd/selftests: hint the test runner on required privilege userfaultfd/selftests: fix retval check for userfaultfd_open() userfaultfd/selftests: always dump something in modes userfaultfd: selftests: make __{s,u}64 format specifiers portable ...
2020-12-15kernel/power: allow hibernation with page_poison sanity checkingVlastimil Babka3-5/+13
Page poisoning used to be incompatible with hibernation, as the state of poisoned pages was lost after resume, thus enabling CONFIG_HIBERNATION forces CONFIG_PAGE_POISONING_NO_SANITY. For the same reason, the poisoning with zeroes variant CONFIG_PAGE_POISONING_ZERO used to disable hibernation. The latter restriction was removed by commit 1ad1410f632d ("PM / Hibernate: allow hibernation with PAGE_POISONING_ZERO") and similarly for init_on_free by commit 18451f9f9e58 ("PM: hibernate: fix crashes with init_on_free=1") by making sure free pages are cleared after resume. We can use the same mechanism to instead poison free pages with PAGE_POISON after resume. This covers both zero and 0xAA patterns. Thus we can remove the Kconfig restriction that disables page poison sanity checking when hibernation is enabled. Link: https://lkml.kernel.org/r/20201113104033.22907-4-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [hibernation] Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Laura Abbott <labbott@kernel.org> Cc: Mateusz Nosek <mateusznosek0@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15PM: hibernate: make direct map manipulations more explicitMike Rapoport1-2/+36
When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page may be not present in the direct map and has to be explicitly mapped before it could be copied. Introduce hibernate_map_page() and hibernation_unmap_page() that will explicitly use set_direct_map_{default,invalid}_noflush() for ARCH_HAS_SET_DIRECT_MAP case and debug_pagealloc_{map,unmap}_pages() for DEBUG_PAGEALLOC case. The remapping of the pages in safe_copy_page() presumes that it only changes protection bits in an existing PTE and so it is safe to ignore return value of set_direct_map_{default,invalid}_noflush(). Still, add a pr_warn() so that future changes in set_memory APIs will not silently break hibernation. Link: https://lkml.kernel.org/r/20201109192128.960-3-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Len Brown <len.brown@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15workqueue: kasan: record workqueue stackWalter Wu1-0/+3
Patch series "kasan: add workqueue stack for generic KASAN", v5. Syzbot reports many UAF issues for workqueue, see [1]. In some of these access/allocation happened in process_one_work(), we see the free stack is useless in KASAN report, it doesn't help programmers to solve UAF for workqueue issue. This patchset improves KASAN reports by making them to have workqueue queueing stack. It is useful for programmers to solve use-after-free or double-free memory issue. Generic KASAN also records the last two workqueue stacks and prints them in KASAN report. It is only suitable for generic KASAN. [1] https://groups.google.com/g/syzkaller-bugs/search?q=%22use-after-free%22+process_one_work [2] https://bugzilla.kernel.org/show_bug.cgi?id=198437 This patch (of 4): When analyzing use-after-free or double-free issue, recording the enqueuing work stacks is helpful to preserve usage history which potentially gives a hint about the affected code. For workqueue it has turned out to be useful to record the enqueuing work call stacks. Because user can see KASAN report to determine whether it is root cause. They don't need to enable debugobjects, but they have a chance to find out the root cause. Link: https://lkml.kernel.org/r/20201203022148.29754-1-walter-zh.wu@mediatek.com Link: https://lkml.kernel.org/r/20201203022442.30006-1-walter-zh.wu@mediatek.com Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com> Suggested-by: Marco Elver <elver@google.com> Acked-by: Marco Elver <elver@google.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Marco Elver <elver@google.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15mm: cleanup: remove unused tsk arg from __access_remote_vmJohn Hubbard1-1/+1
Despite a comment that said that page fault accounting would be charged to whatever task_struct* was passed into __access_remote_vm(), the tsk argument was actually unused. Making page fault accounting actually use this task struct is quite a project, so there is no point in keeping the tsk argument. Delete both the comment, and the argument. [rppt@linux.ibm.com: changelog addition] Link: https://lkml.kernel.org/r/20201026074137.4147787-1-jhubbard@nvidia.com Signed-off-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15mm: memcg/slab: rename *_lruvec_slab_state to *_lruvec_kmem_stateMuchun Song1-1/+1
The *_lruvec_slab_state is also suitable for pages allocated from buddy, not just for the slab objects. But the function name seems to tell us that only slab object is applicable. So we can rename the keyword of slab to kmem. Link: https://lkml.kernel.org/r/20201117085249.24319-1-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Roman Gushchin <guro@fb.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15cgroup: remove obsoleted broken_hierarchy and warned_broken_hierarchyRoman Gushchin1-7/+0
With the deprecation of the non-hierarchical mode of the memory controller there are no more examples of broken hierarchies left. Let's remove the cgroup core code which was supposed to print warnings about creating of broken hierarchies. Link: https://lkml.kernel.org/r/20201110220800.929549-4-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15mm: memcg: deprecate the non-hierarchical modeRoman Gushchin1-5/+0
Patch series "mm: memcg: deprecate cgroup v1 non-hierarchical mode", v1. The non-hierarchical cgroup v1 mode is a legacy of early days of the memory controller and doesn't bring any value today. However, it complicates the code and creates many edge cases all over the memory controller code. It's a good time to deprecate it completely. This patchset removes the internal logic, adjusts the user interface and updates the documentation. The alt patch removes some bits of the cgroup core code, which become obsolete. Michal Hocko said: "All that we know today is that we have a warning in place to complain loudly when somebody relies on use_hierarchy=0 with a deeper hierarchy. For all those years we have seen _zero_ reports that would describe a sensible usecase. Moreover we (SUSE) have backported this warning into old distribution kernels (since 3.0 based kernels) to extend the coverage and didn't hear even for users who adopt new kernels only very slowly. The only report we have seen so far was a LTP test suite which doesn't really reflect any real life usecase" This patch (of 3): The non-hierarchical cgroup v1 mode is a legacy of early days of the memory controller and doesn't bring any value today. However, it complicates the code and creates many edge cases all over the memory controller code. It's a good time to deprecate it completely. Functionally this patch enabled is by default for all cgroups and forbids switching it off. Nothing changes if cgroup v2 is used: hierarchical mode was enforced from scratch. To protect the ABI memory.use_hierarchy interface is preserved with a limited functionality: reading always returns "1", writing of "1" passes silently, writing of any other value fails with -EINVAL and a warning to dmesg (on the first occasion). Link: https://lkml.kernel.org/r/20201110220800.929549-1-guro@fb.com Link: https://lkml.kernel.org/r/20201110220800.929549-2-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15mm/gup: prevent gup_fast from racing with COW during forkJason Gunthorpe1-0/+1
Since commit 70e806e4e645 ("mm: Do early cow for pinned pages during fork() for ptes") pages under a FOLL_PIN will not be write protected during COW for fork. This means that pages returned from pin_user_pages(FOLL_WRITE) should not become write protected while the pin is active. However, there is a small race where get_user_pages_fast(FOLL_PIN) can establish a FOLL_PIN at the same time copy_present_page() is write protecting it: CPU 0 CPU 1 get_user_pages_fast() internal_get_user_pages_fast() copy_page_range() pte_alloc_map_lock() copy_present_page() atomic_read(has_pinned) == 0 page_maybe_dma_pinned() == false atomic_set(has_pinned, 1); gup_pgd_range() gup_pte_range() pte_t pte = gup_get_pte(ptep) pte_access_permitted(pte) try_grab_compound_head() pte = pte_wrprotect(pte) set_pte_at(); pte_unmap_unlock() // GUP now returns with a write protected page The first attempt to resolve this by using the write protect caused problems (and was missing a barrrier), see commit f3c64eda3e50 ("mm: avoid early COW write protect games during fork()") Instead wrap copy_p4d_range() with the write side of a seqcount and check the read side around gup_pgd_range(). If there is a collision then get_user_pages_fast() fails and falls back to slow GUP. Slow GUP is safe against this race because copy_page_range() is only called while holding the exclusive side of the mmap_lock on the src mm_struct. [akpm@linux-foundation.org: coding style fixes] Link: https://lore.kernel.org/r/CAHk-=wi=iCnYCARbPGjkVJu9eyYeZ13N64tZYLdOB8CP5Q_PLw@mail.gmail.com Link: https://lkml.kernel.org/r/2-v4-908497cf359a+4782-gup_fork_jgg@nvidia.com Fixes: f3c64eda3e50 ("mm: avoid early COW write protect games during fork()") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: "Ahmed S. Darwish" <a.darwish@linutronix.de> [seqcount_t parts] Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kirill Shutemov <kirill@shutemov.name> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Leon Romanovsky <leonro@nvidia.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15kthread_worker: document CPU hotplug handlingPetr Mladek1-1/+19
The kthread worker API is simple. In short, it allows to create, use, and destroy workers. kthread_create_worker_on_cpu() just allows to bind a newly created worker to a given CPU. It is up to the API user how to handle CPU hotplug. They have to decide how to handle pending work items, prevent queuing new ones, and restore the functionality when the CPU goes off and on. There are few catches: + The CPU affinity gets lost when it is scheduled on an offline CPU. + The worker might not exist when the CPU was off when the user created the workers. A good practice is to implement two CPU hotplug callbacks and destroy/create the worker when CPU goes down/up. Mention this in the function description. [akpm@linux-foundation.org: grammar tweaks] Link: https://lore.kernel.org/r/20201028073031.4536-1-qiang.zhang@windriver.com Link: https://lkml.kernel.org/r/20201102101039.19227-1-pmladek@suse.com Reported-by: Zhang Qiang <Qiang.Zhang@windriver.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>