summaryrefslogtreecommitdiff
path: root/Documentation/admin-guide
AgeCommit message (Collapse)AuthorFilesLines
2022-10-14Merge tag 'sched-psi-2022-10-14' of ↵Linus Torvalds1-0/+23
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull PSI updates from Ingo Molnar: - Various performance optimizations, resulting in a 4%-9% speedup in the mmtests/config-scheduler-perfpipe micro-benchmark. - New interface to turn PSI on/off on a per cgroup level. * tag 'sched-psi-2022-10-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/psi: Per-cgroup PSI accounting disable/re-enable interface sched/psi: Cache parent psi_group to speed up group iteration sched/psi: Consolidate cgroup_psi() sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ pressure sched/psi: Remove NR_ONCPU task accounting sched/psi: Optimize task switch inside shared cgroups again sched/psi: Move private helpers to sched/stats.h sched/psi: Save percpu memory when !psi_cgroups_enabled sched/psi: Don't create cgroup PSI files when psi_disabled sched/psi: Fix periodic aggregation shut off
2022-10-13Merge tag 'for-linus-6.1-rc1-tag' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - Some minor typo fixes - A fix of the Xen pcifront driver for supporting the device model to run in a Linux stub domain - A cleanup of the pcifront driver - A series to enable grant-based virtio with Xen on x86 - A cleanup of Xen PV guests to distinguish between safe and faulting MSR accesses - Two fixes of the Xen gntdev driver - Two fixes of the new xen grant DMA driver * tag 'for-linus-6.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Kconfig: Fix spelling mistake "Maxmium" -> "Maximum" xen/pv: support selecting safe/unsafe msr accesses xen/pv: refactor msr access functions to support safe and unsafe accesses xen/pv: fix vendor checks for pmu emulation xen/pv: add fault recovery control to pmu msr accesses xen/virtio: enable grant based virtio on x86 xen/virtio: use dom0 as default backend for CONFIG_XEN_VIRTIO_FORCE_GRANT xen/virtio: restructure xen grant dma setup xen/pcifront: move xenstore config scanning into sub-function xen/gntdev: Accommodate VMA splitting xen/gntdev: Prevent leaking grants xen/virtio: Fix potential deadlock when accessing xen_grant_dma_devices xen/virtio: Fix n_pages calculation in xen_grant_dma_map(unmap)_page() xen/xenbus: Fix spelling mistake "hardward" -> "hardware" xen-pcifront: Handle missed Connected state
2022-10-12Merge tag 'mm-nonmm-stable-2022-10-11' of ↵Linus Torvalds1-0/+5
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - hfs and hfsplus kmap API modernization (Fabio Francesco) - make crash-kexec work properly when invoked from an NMI-time panic (Valentin Schneider) - ntfs bugfixes (Hawkins Jiawei) - improve IPC msg scalability by replacing atomic_t's with percpu counters (Jiebin Sun) - nilfs2 cleanups (Minghao Chi) - lots of other single patches all over the tree! * tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits) include/linux/entry-common.h: remove has_signal comment of arch_do_signal_or_restart() prototype proc: test how it holds up with mapping'less process mailmap: update Frank Rowand email address ia64: mca: use strscpy() is more robust and safer init/Kconfig: fix unmet direct dependencies ia64: update config files nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure fork: remove duplicate included header files init/main.c: remove unnecessary (void*) conversions proc: mark more files as permanent nilfs2: remove the unneeded result variable nilfs2: delete unnecessary checks before brelse() checkpatch: warn for non-standard fixes tag style usr/gen_init_cpio.c: remove unnecessary -1 values from int file ipc/msg: mitigate the lock contention with percpu counter percpu: add percpu_counter_add_local and percpu_counter_sub_local fs/ocfs2: fix repeated words in comments relay: use kvcalloc to alloc page array in relay_alloc_page_array proc: make config PROC_CHILDREN depend on PROC_FS fs: uninline inode_maybe_inc_iversion() ...
2022-10-11xen/pv: support selecting safe/unsafe msr accessesJuergen Gross1-0/+6
Instead of always doing the safe variants for reading and writing MSRs in Xen PV guests, make the behavior controllable via Kconfig option and a boot parameter. The default will be the current behavior, which is to always use the safe variant. Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-11Merge tag 'mm-stable-2022-10-08' of ↵Linus Torvalds13-35/+287
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in linux-next for a couple of months without, to my knowledge, any negative reports (or any positive ones, come to that). - Also the Maple Tree from Liam Howlett. An overlapping range-based tree for vmas. It it apparently slightly more efficient in its own right, but is mainly targeted at enabling work to reduce mmap_lock contention. Liam has identified a number of other tree users in the kernel which could be beneficially onverted to mapletrees. Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat at [1]. This has yet to be addressed due to Liam's unfortunately timed vacation. He is now back and we'll get this fixed up. - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses clang-generated instrumentation to detect used-unintialized bugs down to the single bit level. KMSAN keeps finding bugs. New ones, as well as the legacy ones. - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of memory into THPs. - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support file/shmem-backed pages. - userfaultfd updates from Axel Rasmussen - zsmalloc cleanups from Alexey Romanov - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure - Huang Ying adds enhancements to NUMA balancing memory tiering mode's page promotion, with a new way of detecting hot pages. - memcg updates from Shakeel Butt: charging optimizations and reduced memory consumption. - memcg cleanups from Kairui Song. - memcg fixes and cleanups from Johannes Weiner. - Vishal Moola provides more folio conversions - Zhang Yi removed ll_rw_block() :( - migration enhancements from Peter Xu - migration error-path bugfixes from Huang Ying - Aneesh Kumar added ability for a device driver to alter the memory tiering promotion paths. For optimizations by PMEM drivers, DRM drivers, etc. - vma merging improvements from Jakub Matěn. - NUMA hinting cleanups from David Hildenbrand. - xu xin added aditional userspace visibility into KSM merging activity. - THP & KSM code consolidation from Qi Zheng. - more folio work from Matthew Wilcox. - KASAN updates from Andrey Konovalov. - DAMON cleanups from Kaixu Xia. - DAMON work from SeongJae Park: fixes, cleanups. - hugetlb sysfs cleanups from Muchun Song. - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core. Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1] * tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits) hugetlb: allocate vma lock for all sharable vmas hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer hugetlb: fix vma lock handling during split vma and range unmapping mglru: mm/vmscan.c: fix imprecise comments mm/mglru: don't sync disk for each aging cycle mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol mm: memcontrol: use do_memsw_account() in a few more places mm: memcontrol: deprecate swapaccounting=0 mode mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled mm/secretmem: remove reduntant return value mm/hugetlb: add available_huge_pages() func mm: remove unused inline functions from include/linux/mm_inline.h selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd selftests/vm: add thp collapse shmem testing selftests/vm: add thp collapse file and tmpfs testing selftests/vm: modularize thp collapse memory operations selftests/vm: dedup THP helpers mm/khugepaged: add tracepoint to hpage_collapse_scan_file() mm/madvise: add file and shmem support to MADV_COLLAPSE ...
2022-10-10Merge tag 'iommu-updates-v6.1' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - remove the bus_set_iommu() interface which became unnecesary because of IOMMU per-device probing - make the dma-iommu.h header private - Intel VT-d changes from Lu Baolu: - Decouple PASID and PRI from SVA - Add ESRTPS & ESIRTPS capability check - Cleanups - Apple DART support for the M1 Pro/MAX SOCs - support for AMD IOMMUv2 page-tables for the DMA-API layer. The v2 page-tables are compatible with the x86 CPU page-tables. Using them for DMA-API prepares support for hardware-assisted IOMMU virtualization - support for MT6795 Helio X10 M4Us in the Mediatek IOMMU driver - some smaller fixes and cleanups * tag 'iommu-updates-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (59 commits) iommu/vt-d: Avoid unnecessary global DMA cache invalidation iommu/vt-d: Avoid unnecessary global IRTE cache invalidation iommu/vt-d: Rename cap_5lp_support to cap_fl5lp_support iommu/vt-d: Remove pasid_set_eafe() iommu/vt-d: Decouple PASID & PRI enabling from SVA iommu/vt-d: Remove unnecessary SVA data accesses in page fault path dt-bindings: iommu: arm,smmu-v3: Relax order of interrupt names iommu: dart: Support t6000 variant iommu/io-pgtable-dart: Add DART PTE support for t6000 iommu/io-pgtable: Add DART subpage protection support iommu/io-pgtable: Move Apple DART support to its own file iommu/mediatek: Add support for MT6795 Helio X10 M4Us iommu/mediatek: Introduce new flag TF_PORT_TO_ADDR_MT8173 dt-bindings: mediatek: Add bindings for MT6795 M4U iommu/iova: Fix module config properly iommu/amd: Fix sparse warning iommu/amd: Remove outdated comment iommu/amd: Free domain ID after domain_flush_pages iommu/amd: Free domain id in error path iommu/virtio: Fix compile error with viommu_capable() ...
2022-10-10Merge tag 'cgroup-for-6.1' of ↵Linus Torvalds1-69/+87
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - cpuset now support isolated cpus.partition type, which will enable dynamic CPU isolation - pids.peak added to remember the max number of pids used - holes in cgroup namespace plugged - internal cleanups * tag 'cgroup-for-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (25 commits) cgroup: use strscpy() is more robust and safer iocost_monitor: reorder BlkgIterator cgroup: simplify code in cgroup_apply_control cgroup: Make cgroup_get_from_id() prettier cgroup/cpuset: remove unreachable code cgroup: Remove CFTYPE_PRESSURE cgroup: Improve cftype add/rm error handling kselftest/cgroup: Add cpuset v2 partition root state test cgroup/cpuset: Update description of cpuset.cpus.partition in cgroup-v2.rst cgroup/cpuset: Make partition invalid if cpumask change violates exclusivity rule cgroup/cpuset: Relocate a code block in validate_change() cgroup/cpuset: Show invalid partition reason string cgroup/cpuset: Add a new isolated cpus.partition type cgroup/cpuset: Relax constraints to partition & cpus changes cgroup/cpuset: Allow no-task partition to have empty cpuset.cpus.effective cgroup/cpuset: Miscellaneous cleanups & add helper functions cgroup/cpuset: Enable update_tasks_cpumask() on top_cpuset cgroup: add pids.peak interface for pids controller cgroup: Remove data-race around cgrp_dfl_visible cgroup: Fix build failure when CONFIG_SHRINKER_DEBUG ...
2022-10-10Merge tag 'powerpc-6.1-1' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Remove our now never-true definitions for pgd_huge() and p4d_leaf(). - Add pte_needs_flush() and huge_pmd_needs_flush() for 64-bit. - Add support for syscall wrappers. - Add support for KFENCE on 64-bit. - Update 64-bit HV KVM to use the new guest state entry/exit accounting API. - Support execute-only memory when using the Radix MMU (P9 or later). - Implement CONFIG_PARAVIRT_TIME_ACCOUNTING for pseries guests. - Updates to our linker script to move more data into read-only sections. - Allow the VDSO to be randomised on 32-bit. - Many other small features and fixes. Thanks to Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Christophe Leroy, David Hildenbrand, Disha Goel, Fabiano Rosas, Gaosheng Cui, Gustavo A. R. Silva, Haren Myneni, Hari Bathini, Jilin Yuan, Joel Stanley, Kajol Jain, Kees Cook, Krzysztof Kozlowski, Laurent Dufour, Liang He, Li Huafei, Lukas Bulwahn, Madhavan Srinivasan, Nathan Chancellor, Nathan Lynch, Nicholas Miehlbradt, Nicholas Piggin, Pali Rohár, Rohan McLure, Russell Currey, Sachin Sant, Segher Boessenkool, Shrikanth Hegde, Tyrel Datwyler, Wolfram Sang, ye xingchen, and Zheng Yongjun. * tag 'powerpc-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (214 commits) KVM: PPC: Book3S HV: Fix stack frame regs marker powerpc: Don't add __powerpc_ prefix to syscall entry points powerpc/64s/interrupt: Fix stack frame regs marker powerpc/64: Fix msr_check_and_set/clear MSR[EE] race powerpc/64s/interrupt: Change must-hard-mask interrupt check from BUG to WARN powerpc/pseries: Add firmware details to the hardware description powerpc/powernv: Add opal details to the hardware description powerpc: Add device-tree model to the hardware description powerpc/64: Add logical PVR to the hardware description powerpc: Add PVR & CPU name to hardware description powerpc: Add hardware description string powerpc/configs: Enable PPC_UV in powernv_defconfig powerpc/configs: Update config files for removed/renamed symbols powerpc/mm: Fix UBSAN warning reported on hugetlb powerpc/mm: Always update max/min_low_pfn in mem_topology_setup() powerpc/mm/book3s/hash: Rename flush_tlb_pmd_range powerpc: Drops STABS_DEBUG from linker scripts powerpc/64s: Remove lost/old comment powerpc/64s: Remove old STAB comment powerpc: remove orphan systbl_chk.sh ...
2022-10-09Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+5
Pull kvm updates from Paolo Bonzini: "The first batch of KVM patches, mostly covering x86. ARM: - Account stage2 page table allocations in memory stats x86: - Account EPT/NPT arm64 page table allocations in memory stats - Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR accesses - Drop eVMCS controls filtering for KVM on Hyper-V, all known versions of Hyper-V now support eVMCS fields associated with features that are enumerated to the guest - Use KVM's sanitized VMCS config as the basis for the values of nested VMX capabilities MSRs - A myriad event/exception fixes and cleanups. Most notably, pending exceptions morph into VM-Exits earlier, as soon as the exception is queued, instead of waiting until the next vmentry. This fixed a longstanding issue where the exceptions would incorrecly become double-faults instead of triggering a vmexit; the common case of page-fault vmexits had a special workaround, but now it's fixed for good - A handful of fixes for memory leaks in error paths - Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow - Never write to memory from non-sleepable kvm_vcpu_check_block() - Selftests refinements and cleanups - Misc typo cleanups Generic: - remove KVM_REQ_UNHALT" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits) KVM: remove KVM_REQ_UNHALT KVM: mips, x86: do not rely on KVM_REQ_UNHALT KVM: x86: never write to memory from kvm_vcpu_check_block() KVM: x86: Don't snapshot pending INIT/SIPI prior to checking nested events KVM: nVMX: Make event request on VMXOFF iff INIT/SIPI is pending KVM: nVMX: Make an event request if INIT or SIPI is pending on VM-Enter KVM: SVM: Make an event request if INIT or SIPI is pending when GIF is set KVM: x86: lapic does not have to process INIT if it is blocked KVM: x86: Rename kvm_apic_has_events() to make it INIT/SIPI specific KVM: x86: Rename and expose helper to detect if INIT/SIPI are allowed KVM: nVMX: Make an event request when pending an MTF nested VM-Exit KVM: x86: make vendor code check for all nested events mailmap: Update Oliver's email address KVM: x86: Allow force_emulation_prefix to be written without a reload KVM: selftests: Add an x86-only test to verify nested exception queueing KVM: selftests: Use uapi header to get VMX and SVM exit reasons/codes KVM: x86: Rename inject_pending_events() to kvm_check_and_inject_events() KVM: VMX: Update MTF and ICEBP comments to document KVM's subtle behavior KVM: x86: Treat pending TRIPLE_FAULT requests as pending exceptions KVM: x86: Morph pending exceptions to pending VM-Exits at queue time ...
2022-10-08Merge tag 'driver-core-6.1-rc1' of ↵Linus Torvalds1-118/+128
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big set of driver core and debug printk changes for 6.1-rc1. Included in here is: - dynamic debug updates for the core and the drm subsystem. The drm changes have all been acked by the relevant maintainers - kernfs fixes for syzbot reported problems - kernfs refactors and updates for cgroup requirements - magic number cleanups and removals from the kernel tree (they were not being used and they really did not actually do anything) - other tiny cleanups All of these have been in linux-next for a while with no reported issues" * tag 'driver-core-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (74 commits) docs: filesystems: sysfs: Make text and code for ->show() consistent Documentation: NBD_REQUEST_MAGIC isn't a magic number a.out: restore CMAGIC device property: Add const qualifier to device_get_match_data() parameter drm_print: add _ddebug descriptor to drm_*dbg prototypes drm_print: prefer bare printk KERN_DEBUG on generic fn drm_print: optimize drm_debug_enabled for jump-label drm-print: add drm_dbg_driver to improve namespace symmetry drm-print.h: include dyndbg header drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro drm_print: interpose drm_*dbg with forwarding macros drm: POC drm on dyndbg - use in core, 2 helpers, 3 drivers. drm_print: condense enum drm_debug_category debugfs: use DEFINE_SHOW_ATTRIBUTE to define debugfs_regset32_fops driver core: use IS_ERR_OR_NULL() helper in device_create_groups_vargs() Documentation: ENI155_MAGIC isn't a magic number Documentation: NBD_REPLY_MAGIC isn't a magic number nbd: remove define-only NBD_MAGIC, previously magic number Documentation: FW_HEADER_MAGIC isn't a magic number Documentation: EEPROM_MAGIC_VALUE isn't a magic number ...
2022-10-06Merge tag 'linux-kselftest-kunit-6.1-rc1' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull KUnit updates from Shuah Khan: "Several documentation fixes, UML related cleanups, and a feature to enable/disable KUnit tests This includes the change to rename all_test_uml.config, and use it for '--alltests'. Note: if anyone was using all_tests_uml.config, this change breaks them. This change simplifies the usage and eliminates the need to type: --kunitconfig=tools/testing/kunit/configs/all_tests_uml.config A simple workaround to create a symlink to the new name can solve the problem for anyone using all_tests_uml.config. all_tests_uml.config should work across ~all architectures" * tag 'linux-kselftest-kunit-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: Documentation: Kunit: Use full path to .kunitconfig kunit: tool: rename all_test_uml.config, use it for --alltests kunit: tool: remove UML specific options from all_tests_uml.config lib: stackinit: update reference to kunit-tool lib: overflow: update reference to kunit-tool Documentation: KUnit: update links in the index page Documentation: KUnit: add intro to the getting-started page Documentation: KUnit: Reword start guide for selecting tests Documentation: KUnit: add note about mrproper in start.rst Documentation: KUnit: avoid repeating "kunit.py run" in start.rst Documentation: KUnit: remove duplicated docs for kunit_tool Documentation: Kunit: Add ref for other kinds of tests Documentation: KUnit: Fix non-uml anchor Documentation: Kunit: Fix inconsistent titles Documentation: kunit: fix trivial typo kunit: no longer call module_info(test, "Y") for kunit modules kunit: add kunit.enable to enable/disable KUnit test kunit: tool: make --raw_output=kunit (aka --raw_output) preserve leading spaces
2022-10-06Merge tag 'linux-kselftest-next-6.1-rc1' of ↵Linus Torvalds1-0/+76
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest updates from Shuah Khan: "Fixes and new tests: - Add an amd-pstate-ut test module, used by kselftest to unit test amd-pstate functionality - Fixes and cleanups to to cpu-hotplug to delete the fault injection test code - Improvements to vm test to use top_srcdir for builds" * tag 'linux-kselftest-next-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: docs:kselftest: fix kselftest_module.h path of example module cpufreq: amd-pstate: Add explanation for X86_AMD_PSTATE_UT selftests/cpu-hotplug: Add log info when test success selftests/cpu-hotplug: Reserve one cpu online at least selftests/cpu-hotplug: Delete fault injection related code selftests/cpu-hotplug: Use return instead of exit selftests/cpu-hotplug: Correct log info cpufreq: amd-pstate: modify type in argument 2 for filp_open Documentation: amd-pstate: Add unit test introduction selftests: amd-pstate: Add test trigger for amd-pstate driver cpufreq: amd-pstate: Add test module for amd-pstate driver cpufreq: amd-pstate: Expose struct amd_cpudata selftests/vm: use top_srcdir instead of recomputing relative paths
2022-10-06Merge tag 'arm64-upstream' of ↵Linus Torvalds3-1/+107
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE vector granule register added to the user regs together with SVE perf extensions documentation. - SVE updates: add HWCAP for SVE EBF16, update the SVE ABI documentation to match the actual kernel behaviour (zeroing the registers on syscall rather than "zeroed or preserved" previously). - More conversions to automatic system registers generation. - vDSO: use self-synchronising virtual counter access in gettimeofday() if the architecture supports it. - arm64 stacktrace cleanups and improvements. - arm64 atomics improvements: always inline assembly, remove LL/SC trampolines. - Improve the reporting of EL1 exceptions: rework BTI and FPAC exception handling, better EL1 undefs reporting. - Cortex-A510 erratum 2658417: remove BF16 support due to incorrect result. - arm64 defconfig updates: build CoreSight as a module, enable options necessary for docker, memory hotplug/hotremove, enable all PMUs provided by Arm. - arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME extensions). - arm64 ftraces updates/fixes: fix module PLTs with mcount, remove unused function. - kselftest updates for arm64: simple HWCAP validation, FP stress test improvements, validation of ZA regs in signal handlers, include larger SVE and SME vector lengths in signal tests, various cleanups. - arm64 alternatives (code patching) improvements to robustness and consistency: replace cpucap static branches with equivalent alternatives, associate callback alternatives with a cpucap. - Miscellaneous updates: optimise kprobe performance of patching single-step slots, simplify uaccess_mask_ptr(), move MTE registers initialisation to C, support huge vmalloc() mappings, run softirqs on the per-CPU IRQ stack, compat (arm32) misalignment fixups for multiword accesses. * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (126 commits) arm64: alternatives: Use vdso/bits.h instead of linux/bits.h arm64/kprobe: Optimize the performance of patching single-step slot arm64: defconfig: Add Coresight as module kselftest/arm64: Handle EINTR while reading data from children kselftest/arm64: Flag fp-stress as exiting when we begin finishing up kselftest/arm64: Don't repeat termination handler for fp-stress ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs arm64/mm: fold check for KFENCE into can_set_direct_map() arm64: ftrace: fix module PLTs with mcount arm64: module: Remove unused plt_entry_is_initialized() arm64: module: Make plt_equals_entry() static arm64: fix the build with binutils 2.27 kselftest/arm64: Don't enable v8.5 for MTE selftest builds arm64: uaccess: simplify uaccess_mask_ptr() arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header kselftest/arm64: Fix typo in hwcap check arm64: mte: move register initialization to C arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate() arm64: dma: Drop cache invalidation from arch_dma_prep_coherent() arm64/sve: Add Perf extensions documentation ...
2022-10-05Documentation: amd-pstate: Add unit test introductionMeng Li1-0/+76
Introduce the AMD P-State unit test module design and implementation. It also talks about kselftest and how to use. Signed-off-by: Meng Li <li.meng@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-10-04Merge tag 'net-next-6.1' of ↵Linus Torvalds2-13/+13
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - Introduce and use a single page frag cache for allocating small skb heads, clawing back the 10-20% performance regression in UDP flood test from previous fixes. - Run packets which already went thru HW coalescing thru SW GRO. This significantly improves TCP segment coalescing and simplifies deployments as different workloads benefit from HW or SW GRO. - Shrink the size of the base zero-copy send structure. - Move TCP init under a new slow / sleepable version of DO_ONCE(). BPF: - Add BPF-specific, any-context-safe memory allocator. - Add helpers/kfuncs for PKCS#7 signature verification from BPF programs. - Define a new map type and related helpers for user space -> kernel communication over a ring buffer (BPF_MAP_TYPE_USER_RINGBUF). - Allow targeting BPF iterators to loop through resources of one task/thread. - Add ability to call selected destructive functions. Expose crash_kexec() to allow BPF to trigger a kernel dump. Use CAP_SYS_BOOT check on the loading process to judge permissions. - Enable BPF to collect custom hierarchical cgroup stats efficiently by integrating with the rstat framework. - Support struct arguments for trampoline based programs. Only structs with size <= 16B and x86 are supported. - Invoke cgroup/connect{4,6} programs for unprivileged ICMP ping sockets (instead of just TCP and UDP sockets). - Add a helper for accessing CLOCK_TAI for time sensitive network related programs. - Support accessing network tunnel metadata's flags. - Make TCP SYN ACK RTO tunable by BPF programs with TCP Fast Open. - Add support for writing to Netfilter's nf_conn:mark. Protocols: - WiFi: more Extremely High Throughput (EHT) and Multi-Link Operation (MLO) work (802.11be, WiFi 7). - vsock: improve support for SO_RCVLOWAT. - SMC: support SO_REUSEPORT. - Netlink: define and document how to use netlink in a "modern" way. Support reporting missing attributes via extended ACK. - IPSec: support collect metadata mode for xfrm interfaces. - TCPv6: send consistent autoflowlabel in SYN_RECV state and RST packets. - TCP: introduce optional per-netns connection hash table to allow better isolation between namespaces (opt-in, at the cost of memory and cache pressure). - MPTCP: support TCP_FASTOPEN_CONNECT. - Add NEXT-C-SID support in Segment Routing (SRv6) End behavior. - Adjust IP_UNICAST_IF sockopt behavior for connected UDP sockets. - Open vSwitch: - Allow specifying ifindex of new interfaces. - Allow conntrack and metering in non-initial user namespace. - TLS: support the Korean ARIA-GCM crypto algorithm. - Remove DECnet support. Driver API: - Allow selecting the conduit interface used by each port in DSA switches, at runtime. - Ethernet Power Sourcing Equipment and Power Device support. - Add tc-taprio support for queueMaxSDU parameter, i.e. setting per traffic class max frame size for time-based packet schedules. - Support PHY rate matching - adapting between differing host-side and link-side speeds. - Introduce QUSGMII PHY mode and 1000BASE-KX interface mode. - Validate OF (device tree) nodes for DSA shared ports; make phylink-related properties mandatory on DSA and CPU ports. Enforcing more uniformity should allow transitioning to phylink. - Require that flash component name used during update matches one of the components for which version is reported by info_get(). - Remove "weight" argument from driver-facing NAPI API as much as possible. It's one of those magic knobs which seemed like a good idea at the time but is too indirect to use in practice. - Support offload of TLS connections with 256 bit keys. New hardware / drivers: - Ethernet: - Microchip KSZ9896 6-port Gigabit Ethernet Switch - Renesas Ethernet AVB (EtherAVB-IF) Gen4 SoCs - Analog Devices ADIN1110 and ADIN2111 industrial single pair Ethernet (10BASE-T1L) MAC+PHY. - Rockchip RV1126 Gigabit Ethernet (a version of stmmac IP). - Ethernet SFPs / modules: - RollBall / Hilink / Turris 10G copper SFPs - HALNy GPON module - WiFi: - CYW43439 SDIO chipset (brcmfmac) - CYW89459 PCIe chipset (brcmfmac) - BCM4378 on Apple platforms (brcmfmac) Drivers: - CAN: - gs_usb: HW timestamp support - Ethernet PHYs: - lan8814: cable diagnostics - Ethernet NICs: - Intel (100G): - implement control of FCS/CRC stripping - port splitting via devlink - L2TPv3 filtering offload - nVidia/Mellanox: - tunnel offload for sub-functions - MACSec offload, w/ Extended packet number and replay window offload - significantly restructure, and optimize the AF_XDP support, align the behavior with other vendors - Huawei: - configuring DSCP map for traffic class selection - querying standard FEC statistics - querying SerDes lane number via ethtool - Marvell/Cavium: - egress priority flow control - MACSec offload - AMD/SolarFlare: - PTP over IPv6 and raw Ethernet - small / embedded: - ax88772: convert to phylink (to support SFP cages) - altera: tse: convert to phylink - ftgmac100: support fixed link - enetc: standard Ethtool counters - macb: ZynqMP SGMII dynamic configuration support - tsnep: support multi-queue and use page pool - lan743x: Rx IP & TCP checksum offload - igc: add xdp frags support to ndo_xdp_xmit - Ethernet high-speed switches: - Marvell (prestera): - support SPAN port features (traffic mirroring) - nexthop object offloading - Microchip (sparx5): - multicast forwarding offload - QoS queuing offload (tc-mqprio, tc-tbf, tc-ets) - Ethernet embedded switches: - Marvell (mv88e6xxx): - support RGMII cmode - NXP (felix): - standardized ethtool counters - Microchip (lan966x): - QoS queuing offload (tc-mqprio, tc-tbf, tc-cbs, tc-ets) - traffic policing and mirroring - link aggregation / bonding offload - QUSGMII PHY mode support - Qualcomm 802.11ax WiFi (ath11k): - cold boot calibration support on WCN6750 - support to connect to a non-transmit MBSSID AP profile - enable remain-on-channel support on WCN6750 - Wake-on-WLAN support for WCN6750 - support to provide transmit power from firmware via nl80211 - support to get power save duration for each client - spectral scan support for 160 MHz - MediaTek WiFi (mt76): - WiFi-to-Ethernet bridging offload for MT7986 chips - RealTek WiFi (rtw89): - P2P support" * tag 'net-next-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1864 commits) eth: pse: add missing static inlines once: rename _SLOW to _SLEEPABLE net: pse-pd: add regulator based PSE driver dt-bindings: net: pse-dt: add bindings for regulator based PoDL PSE controller ethtool: add interface to interact with Ethernet Power Equipment net: mdiobus: search for PSE nodes by parsing PHY nodes. net: mdiobus: fwnode_mdiobus_register_phy() rework error handling net: add framework to support Ethernet PSE and PDs devices dt-bindings: net: phy: add PoDL PSE property net: marvell: prestera: Propagate nh state from hw to kernel net: marvell: prestera: Add neighbour cache accounting net: marvell: prestera: add stub handler neighbour events net: marvell: prestera: Add heplers to interact with fib_notifier_info net: marvell: prestera: Add length macros for prestera_ip_addr net: marvell: prestera: add delayed wq and flush wq on deinit net: marvell: prestera: Add strict cleanup of fib arbiter net: marvell: prestera: Add cleanup of allocated fib_nodes net: marvell: prestera: Add router nexthops ABI eth: octeon: fix build after netif_napi_add() changes net/mlx5: E-Switch, Return EBUSY if can't get mode lock ...
2022-10-04Merge tag 'x86_microcode_for_v6.1_rc1' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x75 microcode loader updates from Borislav Petkov: - Get rid of a single ksize() usage - By popular demand, print the previous microcode revision an update was done over - Remove more code related to the now gone MICROCODE_OLD_INTERFACE - Document the problems stemming from microcode late loading * tag 'x86_microcode_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode/AMD: Track patch allocation size explicitly x86/microcode: Print previous version of microcode after reload x86/microcode: Remove ->request_microcode_user() x86/microcode: Document the whole late loading problem
2022-10-04Merge tag 'x86_apic_for_v6.1_rc1' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 APIC update from Borislav Petkov: - Add support for locking the APIC in X2APIC mode to prevent SGX enclave leaks * tag 'x86_apic_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Don't disable x2APIC if locked
2022-10-04mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbolJohannes Weiner1-3/+1
Since 2d1c498072de ("mm: memcontrol: make swap tracking an integral part of memory control"), CONFIG_MEMCG_SWAP hasn't been a user-visible config option anymore, it just means CONFIG_MEMCG && CONFIG_SWAP. Update the sites accordingly and drop the symbol. [ While touching the docs, remove two references to CONFIG_MEMCG_KMEM, which hasn't been a user-visible symbol for over half a decade. ] Link: https://lkml.kernel.org/r/20220926135704.400818-5-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Shakeel Butt <shakeelb@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm: memcontrol: deprecate swapaccounting=0 modeJohannes Weiner1-6/+0
The swapaccounting= commandline option already does very little today. To close a trivial containment failure case, the swap ownership tracking part of the swap controller has recently become mandatory (see commit 2d1c498072de ("mm: memcontrol: make swap tracking an integral part of memory control") for details), which makes up the majority of the work during swapout, swapin, and the swap slot map. The only thing left under this flag is the page_counter operations and the visibility of the swap control files in the first place, which are rather meager savings. There also aren't many scenarios, if any, where controlling the memory of a cgroup while allowing it unlimited access to a global swap space is a workable resource isolation strategy. On the other hand, there have been several bugs and confusion around the many possible swap controller states (cgroup1 vs cgroup2 behavior, memory accounting without swap accounting, memcg runtime disabled). This puts the maintenance overhead of retaining the toggle above its practical benefits. Deprecate it. Link: https://lkml.kernel.org/r/20220926135704.400818-3-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Suggested-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/khugepaged: attempt to map file/shmem-backed pte-mapped THPs by pmdsZach O'Keefe1-1/+8
The main benefit of THPs are that they can be mapped at the pmd level, increasing the likelihood of TLB hit and spending less cycles in page table walks. pte-mapped hugepages - that is - hugepage-aligned compound pages of order HPAGE_PMD_ORDER mapped by ptes - although being contiguous in physical memory, don't have this advantage. In fact, one could argue they are detrimental to system performance overall since they occupy a precious hugepage-aligned/sized region of physical memory that could otherwise be used more effectively. Additionally, pte-mapped hugepages can be the cheapest memory to collapse for khugepaged since no new hugepage allocation or copying of memory contents is necessary - we only need to update the mapping page tables. In the anonymous collapse path, we are able to collapse pte-mapped hugepages (albeit, perhaps suboptimally), but the file/shmem path makes no effort when compound pages (of any order) are encountered. Identify pte-mapped hugepages in the file/shmem collapse path. The final step of which makes a racy check of the value of the pmd to ensure it maps a pte table. This should be fine, since races that result in false-positive (i.e. attempt collapse even though we shouldn't) will fail later in collapse_pte_mapped_thp() once we actually lock mmap_lock and reinspect the pmd value. Races that result in false-negatives (i.e. where we decide to not attempt collapse, but should have) shouldn't be an issue, since in the worst case, we do nothing - which is what we've done up to this point. We make a similar check in retract_page_tables(). If we do think we've found a pte-mapped hugepgae in khugepaged context, attempt to update page tables mapping this hugepage. Note that these collapses still count towards the /sys/kernel/mm/transparent_hugepage/khugepaged/pages_collapsed counter, and if the pte-mapped hugepage was also mapped into multiple process' address spaces, could be incremented for each page table update. Since we increment the counter when a pte-mapped hugepage is successfully added to the list of to-collapse pte-mapped THPs, it's possible that we never actually update the page table either. This is different from how file/shmem pages_collapsed accounting works today where only a successful page cache update is counted (it's also possible here that no page tables are actually changed). Though it incurs some slop, this is preferred to either not accounting for the event at all, or plumbing through data in struct mm_slot on whether to account for the collapse or not. Also note that work still needs to be done to support arbitrary compound pages, and that this should all be converted to using folios. [shy828301@gmail.com: Spelling mistake, update comment, and add Documentation] Link: https://lore.kernel.org/linux-mm/CAHbLzkpHwZxFzjfX9nxVoRhzup8WMjMfyL6Xiq8mZ9M-N3ombw@mail.gmail.com/ Link: https://lkml.kernel.org/r/20220907144521.3115321-3-zokeefe@google.com Link: https://lkml.kernel.org/r/20220922224046.1143204-3-zokeefe@google.com Signed-off-by: Zach O'Keefe <zokeefe@google.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Chris Kennelly <ckennelly@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: James Houghton <jthoughton@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Xu <peterx@redhat.com> Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com> Cc: SeongJae Park <sj@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/huge_memory: prevent THP_ZERO_PAGE_ALLOC increased twiceLiu Shixin1-4/+3
A user who reads THP_ZERO_PAGE_ALLOC may be more concerned about the huge zero pages that are really allocated for thp. It is misleading to increase THP_ZERO_PAGE_ALLOC twice if two threads call get_huge_zero_page concurrently. Don't increase the value if the huge page is not really used. Update Documentation/admin-guide/mm/transhuge.rst to suit. Link: https://lkml.kernel.org/r/20220909021653.3371879-1-liushixin2@huawei.com Signed-off-by: Liu Shixin <liushixin2@huawei.com> Cc: Alexander Potapenko <glider@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04Docs/admin-guide/mm/damon/usage: note DAMON debugfs interface deprecation planSeongJae Park1-0/+5
Commit b18402726bd1 ("Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface") announced the DAMON debugfs interface deprecation plan, but it is not so aggressively announced. As the deprecation time is coming, this commit makes the announce more easy to be found by adding the note at the beginning of the DAMON debugfs interface usage document. Link: https://lkml.kernel.org/r/20220909202901.57977-8-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <shuah@kernel.org> Cc: Yun Levi <ppbuk5246@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04Docs/admin-guide/mm/damon/start: mention the dependency as sysfs instead of ↵SeongJae Park1-10/+3
debugfs 'Getting Started' document of DAMON says DAMON user-space tool, damo[1], is using DAMON debugfs interface, and therefore it needs to ensure debugfs is mounted. However, the latest version of the tool is using DAMON sysfs interface. Moreover, DAMON debugfs interface is going to be deprecated as announced by commit b18402726bd1 ("Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface"). This commit therefore update the document to tell readers about DAMON sysfs interface dependency instead and never mention about debugfs interface, which will be deprecated. [1] https://github.com/awslabs/damo Link: https://lkml.kernel.org/r/20220909202901.57977-7-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <shuah@kernel.org> Cc: Yun Levi <ppbuk5246@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04Docs/admin-guide/mm/damon: rename the title of the documentSeongJae Park1-3/+3
The title of the DAMON document for admin-guide, 'Monitoring Data Accesses', could confuse readers in some ways. First of all, DAMON is not the only single way for data access monitoring. And the document is for not only the data access monitoring but also data access pattern based memory management optimizations (DAMOS). This commit updates the title to 'DAMON: Data Access MONitor', which more explicitly explains what the document describes. Link: https://lkml.kernel.org/r/20220909202901.57977-5-sj@kernel.org Fixes: c4ba6014aec3 ("Documentation: add documents for DAMON") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <shuah@kernel.org> Cc: Yun Levi <ppbuk5246@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03Merge tag 'acpi-6.1-rc1' of ↵Linus Torvalds1-13/+0
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "ACPI and PNP updates for 6.1-rc1. These rearrange the ACPI device object initialization code (to get rid of a redundant parent pointer from struct acpi_device among other things), unify the _UID handling, drop support for some _OSI strings that should not be necessary any more, add new IDs to support more hardware and some more quirks, fix a few issues and clean up code all over. Specifics: - Reimplement acpi_get_pci_dev() using the list of physical devices associated with the given ACPI device object (Rafael Wysocki) - Rename ACPI device object reference counting functions (Rafael Wysocki) - Rearrange ACPI device object initialization code (Rafael Wysocki) - Drop parent field from struct acpi_device (Rafael Wysocki) - Extend the the int3472-tps68470 driver to support multiple consumers of a single TPS68470 along with the requisite framework-level support (Daniel Scally) - Filter out non-memory resources in is_memory(), add a helper function to find all memory type resources of an ACPI device object and use that function in 3 places (Heikki Krogerus) - Add IRQ override quirks for Asus Vivobook K3402ZA/K3502ZA and ASUS model S5402ZA (Tamim Khan, Kellen Renshaw) - Fix acpi_dev_state_d0() kerneldoc (Sakari Ailus) - Fix up suspend-to-idle support on ASUS Rembrandt laptops (Mario Limonciello) - Clean up ACPI platform devices support code (Andy Shevchenko, John Garry) - Clean up ACPI bus management code (Andy Shevchenko, ye xingchen) - Add support for multiple DMA windows with different offsets to the ACPI device enumeration code and use it on LoongArch (Jianmin Lv) - Clean up the ACPI LPSS (Intel SoC) driver (Andy Shevchenko) - Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable (Mario Limonciello) - Drop unused dev_fmt() and redundant 'HMAT' prefix from the HMAT parsing code (Liu Shixin) - Make ACPI FPDT parsing code avoid calling acpi_os_map_memory() on invalid physical addresses (Hans de Goede) - Silence missing-declarations warning related to Apple device properties management (Lukas Wunner) - Disable frequency invariance in the CPPC library if registers used by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton) - Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan) - Fix Tx acknowledge in the PCC address space handler (Huisong Li) - Use wait_for_completion_timeout() for PCC mailbox operations (Huisong Li) - Release resources on PCC address space setup failure path (Rafael Mendonca) - Remove unneeded result variables from APEI code (ye xingchen) - Print total number of records found during BERT log parsing (Dmitry Monakhov) - Drop support for 3 _OSI strings that should not be necessary any more and update documentation on custom _OSI strings so that adding new ones is not encouraged any more (Mario Limonciello) - Drop unneeded result variable from ec_write() (ye xingchen) - Remove the leftover struct acpi_ac_bl from the ACPI AC driver (Hanjun Guo) - Reorder symbols to get rid of a few forward declarations in the ACPI fan driver (Uwe Kleine-König) - Add Toshiba Satellite/Portege Z830 ACPI backlight quirk (Arvid Norlander) - Add ARM DMA-330 controller to the supported list in the ACPI AMBA driver (Vijayenthiran Subramaniam) - Drop references to non-functional 01.org/linux-acpi web site from MAINTAINERS and Kconfig help texts (Rafael Wysocki) - Replace strlcpy() with unused retval with strscpy() in the ACPI support code (Wolfram Sang) - Do not initialize ret in main() in the pfrut utility (Shi junming) - Drop useless ACPI DSDT override documentation (Rafael Wysocki) - Fix a few typos and wording mistakes in the ACPI device enumeration documentation (Jean Delvare) - Introduce acpi_dev_uid_to_integer() to convert a _UID string into an integer value (Andy Shevchenko) - Use acpi_dev_uid_to_integer() in several places to unify _UID handling (Andy Shevchenko) - Drop unused pnpid32_to_pnpid() declaration from PNP code (Gaosheng Cui)" * tag 'acpi-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (79 commits) ACPI: LPSS: Deduplicate skipping device in acpi_lpss_create_device() ACPI: LPSS: Replace loop with first entry retrieval ACPI: x86: s2idle: Add another ID to s2idle_dmi_table ACPI: x86: s2idle: Fix a NULL pointer dereference MAINTAINERS: Drop records pointing to 01.org/linux-acpi ACPI: Kconfig: Drop link to https://01.org/linux-acpi ACPI: docs: Drop useless DSDT override documentation ACPI: DPTF: Drop stale link from Kconfig help ACPI: x86: s2idle: Add a quirk for ASUSTeK COMPUTER INC. ROG Flow X13 ACPI: x86: s2idle: Add a quirk for Lenovo Slim 7 Pro 14ARH7 ACPI: x86: s2idle: Add a quirk for ASUS ROG Zephyrus G14 ACPI: x86: s2idle: Add a quirk for ASUS TUF Gaming A17 FA707RE ACPI: x86: s2idle: Add module parameter to prefer Microsoft GUID ACPI: x86: s2idle: If a new AMD _HID is missing assume Rembrandt ACPI: x86: s2idle: Move _HID handling for AMD systems into structures platform/x86: int3472: Add board data for Surface Go2 IR camera platform/x86: int3472: Support multiple gpio lookups in board data platform/x86: int3472: Support multiple clock consumers ACPI: bus: Add iterator for dependent devices ACPI: scan: Add acpi_dev_get_next_consumer_dev() ...
2022-10-03Merge branches 'acpi-misc', 'acpi-tools' and 'acpi-docs'Rafael J. Wysocki1-13/+0
Merge miscellaneous ACPI material, ACPI tools changes and ACPI documentation updates for 6.1-rc1: - Drop references to non-functional 01.org/linux-acpi web site from MAINTAINERS and Kconfig help texts (Rafael Wysocki). - Replace strlcpy() with unused retval with strscpy() in the ACPI support code (Wolfram Sang). - Do not initialize ret in main() in the pfrut utility (Shi junming). - Drop useless ACPI DSDT override documentation (Rafael Wysocki). - Fix a few typos and wording mistakes in the ACPI device enumeration documentation (Jean Delvare). * acpi-misc: MAINTAINERS: Drop records pointing to 01.org/linux-acpi ACPI: Kconfig: Drop link to https://01.org/linux-acpi ACPI: DPTF: Drop stale link from Kconfig help ACPI: move from strlcpy() with unused retval to strscpy() * acpi-tools: ACPI: tools: pfrut: Do not initialize ret in main() * acpi-docs: ACPI: docs: Drop useless DSDT override documentation ACPI: docs: enumeration: Fix a few typos and wording mistakes
2022-10-03Merge tag 'docs-6.1' of git://git.lwn.net/linuxLinus Torvalds4-86/+10
Pull documentation updates from Jonathan Corbet: "There's not a huge amount of activity in the docs tree this time around, but a few significant changes even so: - A complete rewriting of the top-level index.rst file, which mostly reflects itself in a redone top page in the HTML-rendered docs. The hope is that the new organization will be a friendlier starting point for both users and developers. - Some math-rendering improvements. - A coding-style.rst update on the use of BUG() and WARN() - A big maintainer-PHP guide update. - Some code-of-conduct updates - More Chinese translation work Plus the usual pile of typo fixes, corrections, and updates" * tag 'docs-6.1' of git://git.lwn.net/linux: (66 commits) checkpatch: warn on usage of VM_BUG_ON() and other BUG variants coding-style.rst: document BUG() and WARN() rules ("do not crash the kernel") Documentation: devres: add missing IO helper Documentation: devres: update IRQ helper Documentation/mm: modify page_referenced to folio_referenced Documentation/CoC: Reflect current CoC interpretation and practices docs/doc-guide: Add documentation on SPHINX_IMGMATH docs: process/5.Posting.rst: clarify use of Reported-by: tag docs, kprobes: Fix the wrong location of Kprobes docs: add a man-pages link to the front page docs: put atomic*.txt and memory-barriers.txt into the core-api book docs: move asm-annotations.rst into core-api docs: remove some index.rst cruft docs: reconfigure the HTML left column docs: Rewrite the front page docs: promote the title of process/index.rst Documentation: devres: add missing SPI helper Documentation: devres: add missing PINCTRL helpers docs: hugetlbpage.rst: fix a typo of hugepage size docs/zh_CN: Add new translation of admin-guide/bootconfig.rst ...
2022-09-30kunit: add kunit.enable to enable/disable KUnit testJoe Fradley1-0/+6
This patch adds the kunit.enable module parameter that will need to be set to true in addition to KUNIT being enabled for KUnit tests to run. The default value is true giving backwards compatibility. However, for the production+testing use case the new config option KUNIT_DEFAULT_ENABLED can be set to N requiring the tester to opt-in by passing kunit.enable=1 to the kernel. Signed-off-by: Joe Fradley <joefradley@google.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-09-30Merge branch 'for-next/misc' into for-next/coreCatalin Marinas1-1/+6
* for-next/misc: : Miscellaneous patches arm64/kprobe: Optimize the performance of patching single-step slot ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs arm64/mm: fold check for KFENCE into can_set_direct_map() arm64: uaccess: simplify uaccess_mask_ptr() arm64: mte: move register initialization to C arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate() arm64: dma: Drop cache invalidation from arch_dma_prep_coherent() arm64: support huge vmalloc mappings arm64: spectre: increase parameters that can be used to turn off bhb mitigation individually arm64: run softirqs on the per-CPU IRQ stack arm64: compat: Implement misalignment fixups for multiword loads
2022-09-28ACPI: docs: Drop useless DSDT override documentationRafael J. Wysocki1-13/+0
Because https://01.org/linux-acpi web site has become permanently inaccessible, the "Overriding DSDT" document in the kernel tree pointing to it as the main source of information is useless (and the config option name mentioned by it is incorrect), so drop it and drop the pointer to it from the ACPI Kconfig. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-09-27docs: hugetlbpage.rst: fix a typo of hugepage sizeHoi Pok Wu1-1/+1
should be kB instead of Kb Signed-off-by: Hoi Pok Wu <wuhoipok@gmail.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Link: https://lore.kernel.org/r/20220922030645.9719-1-wuhoipok@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-09-27Documentation/hw-vuln: Update spectre docLin Yujun1-0/+1
commit 7c693f54c873691 ("x86/speculation: Add spectre_v2=ibrs option to support Kernel IBRS") adds the "ibrs " option in Documentation/admin-guide/kernel-parameters.txt but omits it to Documentation/admin-guide/hw-vuln/spectre.rst, add it. Signed-off-by: Lin Yujun <linyujun809@huawei.com> Link: https://lore.kernel.org/r/20220830123614.23007-1-linyujun809@huawei.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-09-27docs: admin-guide: for kernel bugs refer to other kernel documentationLukas Bulwahn1-82/+7
The current section 'If something goes wrong' makes a number of suggestions for debugging, bug hunting and reporting issues, which are quite briefly described in that section. However, the suggestions are also well covered in other kernel documentation or sometimes simply outdated. Here, each suggestion in that section is summarized, and then followed with its assessment, and the derived action for each suggestion: - use MAINTAINERS and mailing list: covered in 'Reporting issues', summarized in the short guide, detailed in its further section. Reporting issues even provides some specific examples that guides readers well through the needed steps. Refer to 'Reporting issues'. - contact Linus Torvalds: probably outdated as currently described. nevertheless covered in 'Reporting issues'. Reporting issues points out to contact the relevant kernel maintainers first, and after some patience and failed attempts with those maintainers, contacting Linus Torvalds might be okay. Refer to 'Reporting issues'. - tell what kernel, how to duplicate, the setup, if the problem is new or old and when did you notice: covered in 'Reporting issues', especially in Step-by-step guide how to report issues to the kernel maintainers. Refer to 'Reporting issues'. - duplicate kernel bug reports exactly: covered in 'Reporting issues', especially in Write and send the report. Refer to 'Reporting issues'. - read 'Bug hunting': keep this reference. Refer to 'Bug hunting'. - compile the kernel with CONFIG_KALLSYMS: covered in 'Reporting issues', especially in Decode failure messages. Refer to 'Reporting issues'. - alternatively, use ksymoops: ksymoops at the mentioned URL seems not to be maintained anymore. It was released roughly once a year until version 2.4.11 in 2005, but has not seen a new release since then. The information in ./scripts/ksymoops/README is from 1999, and does not give more insight on its actual maintenance state either. Ksymoops is mentioned as system utility in changes.rst, but also not recommended there. Drop the explanation on using ksymoops. - alternatively, lookup dump manually with the EIP and nm to determine the function in which the kernel crashes: this method seems already a quite advanced and low-level debugging method. Even all the further references on bug hunting and debugging do not mention it. Drop this alternative method and limit mentioning methods explained in the other existing kernel documentation. - read 'Reporting issues': keep this reference. Refer to 'Reporting issues'. - use gdb for debugging: some specific details, e.g., edit arch/x86/Makefile, are probably outdated or limited to one (historic important) setup. Using gdb is covered in 'Bug hunting', 'Debugging kernel and modules via gdb' and 'Using kgdb, kdb and the kernel debugger internals'. Refer to those three documents. Overall, it is sufficient to refer to reporting-issues.rst, bug-hunting.rst, gdb-kernel-debugging.rst and kgdb.rst and this way cover the existing suggestions. 'Reporting issues' is quite new and probably up to date. 'Bug hunting', 'Debugging kernel and modules via gdb' and 'Using kgdb, kdb and the kernel debugger internals' might need some revisit and update, but they are generally in an acceptable state for referring to them. Replace the existing suggestions by reference to other existing kernel documentation covering those suggestions---partly even nicely summarized and then explained in greater detail. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Link: https://lore.kernel.org/r/20220720041325.15693-3-lukas.bulwahn@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-09-27docs: admin-guide: do not mention the 'run a.out user programs' featureLukas Bulwahn1-2/+0
Running a.out user programs with the latest kernel release is a very rare and uncommon use case nowadays. The support of a.out user programs is only remaining for the alpha architecture and is not defined and activated in the architecture's Kconfig (so even the activation of this support requires to modify the Kconfig file and not just kernel build configuration). The discussion on a.out support in 2019 (see Link) shows that the support of a.out user programs is just remaining for a special corner case from some (alpha architecture) users. There is no need to point out and mention this special feature to the general audience of kernel users. Delete the reference to this historic and special feature. Link: https://lore.kernel.org/all/CAHk-=wgt7M6yA5BJCJo0nF22WgPJnN8CvViL9CAJmd+S+Civ6w@mail.gmail.com/ Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Link: https://lore.kernel.org/r/20220720041325.15693-2-lukas.bulwahn@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-09-27Remove duplicate words inside documentationAkhil Raj1-1/+1
I have removed repeated `the` inside the documentation Signed-off-by: Akhil Raj <lf32.dev@gmail.com> Link: https://lore.kernel.org/r/20220827145359.32599-1-lf32.dev@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-09-27ksm: add profit monitoring documentationxu xin1-0/+36
Add the description of KSM profit and how to determine it separately in system-wide range and inner a single process. Link: https://lkml.kernel.org/r/20220830144003.299870-1-xu.xin16@zte.com.cn Signed-off-by: xu xin <xu.xin16@zte.com.cn> Reviewed-by: Xiaokai Ran <ran.xiaokai@zte.com.cn> Reviewed-by: Yang Yang <yang.yang29@zte.com.cn> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-27mm: multi-gen LRU: admin guideYu Zhao2-0/+163
Add an admin guide. Link: https://lkml.kernel.org/r/20220918080010.2920238-14-yuzhao@google.com Signed-off-by: Yu Zhao <yuzhao@google.com> Acked-by: Brian Geffon <bgeffon@google.com> Acked-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org> Acked-by: Oleksandr Natalenko <oleksandr@natalenko.name> Acked-by: Steven Barrett <steven@liquorix.net> Acked-by: Suleiman Souhlal <suleiman@google.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Tested-by: Daniel Byrne <djbyrne@mtu.edu> Tested-by: Donald Carr <d@chaos-reins.com> Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> Tested-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Tested-by: Shuang Zhai <szhai2@cs.rochester.edu> Tested-by: Sofia Trinh <sofia.trinh@edi.works> Tested-by: Vaibhav Jain <vaibhav@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Barry Song <baohua@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Michael Larabel <Michael@MichaelLarabel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26Documentation: Rename PPC_FSL_BOOK3E to PPC_E500Christophe Leroy1-1/+1
CONFIG_PPC_FSL_BOOK3E is redundant with CONFIG_PPC_E500. Rename it so that CONFIG_PPC_FSL_BOOK3E can be removed later. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d3d42b395c09e66b0705fda1e51779f33e13ac38.1663606876.git.christophe.leroy@csgroup.eu
2022-09-23Merge branch 'for-6.0-fixes' into for-6.1Tejun Heo5-25/+41
for-6.0 has the following fix for cgroup_get_from_id(). 836ac87d ("cgroup: fix cgroup_get_from_id") which conflicts with the following two commits in for-6.1. 4534dee9 ("cgroup: cgroup: Honor caller's cgroup NS when resolving cgroup id") fa7e439c ("cgroup: Homogenize cgroup_get_from_id() return value") While the resolution is straightforward, the code ends up pretty ugly afterwards. Let's pull for-6.0-fixes into for-6.1 so that the code can be fixed up there. Signed-off-by: Tejun Heo <tj@kernel.org>
2022-09-22docs: perf: Add description for Alibaba's T-Head PMU driverShuai Xue2-0/+101
Alibaba's T-Head SoC implements uncore PMU for performance and functional debugging to facilitate system maintenance. Document it to provide guidance on how to use it. Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Link: https://lore.kernel.org/r/20220914022326.88550-2-xueshuai@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>
2022-09-16bpf: Use bpf_capable() instead of CAP_SYS_ADMIN for blinding decisionYauheni Kaliuta1-0/+3
The full CAP_SYS_ADMIN requirement for blinding looks too strict nowadays. These days given unprivileged BPF is disabled by default, the main users for constant blinding coming from unprivileged in particular via cBPF -> eBPF migration (e.g. old-style socket filters). Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220831090655.156434-1-ykaliuta@redhat.com Link: https://lore.kernel.org/bpf/20220905090149.61221-1-ykaliuta@redhat.com
2022-09-16arm64: support huge vmalloc mappingsKefeng Wang1-1/+1
As commit 559089e0a93d ("vmalloc: replace VM_NO_HUGE_VMAP with VM_ALLOW_HUGE_VMAP"), the use of hugepage mappings for vmalloc is an opt-in strategy, so it is saftly to support huge vmalloc mappings on arm64, for now, it is used in kvmalloc() and alloc_large_system_hash(). Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Link: https://lore.kernel.org/r/20220911044423.139229-1-wangkefeng.wang@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-12Merge 6.0-rc5 into driver-core-nextGreg Kroah-Hartman5-25/+41
We need the driver core and debugfs changes in this branch. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-12kernel/utsname_sysctl.c: print kernel archPetr Vorel1-0/+5
Print the machine hardware name (UTS_MACHINE) in /proc/sys/kernel/arch. This helps people who debug kernel with initramfs with minimal environment (i.e. without coreutils or even busybox) or allow to open sysfs file instead of run 'uname -m' in high level languages. Link: https://lkml.kernel.org/r/20220901194403.3819-1-pvorel@suse.cz Signed-off-by: Petr Vorel <pvorel@suse.cz> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: David Sterba <dsterba@suse.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-12page_ext: introduce boot parameter 'early_page_ext'Li Zhe1-0/+8
In commit 2f1ee0913ce5 ("Revert "mm: use early_pfn_to_nid in page_ext_init""), we call page_ext_init() after page_alloc_init_late() to avoid some panic problem. It seems that we cannot track early page allocations in current kernel even if page structure has been initialized early. This patch introduces a new boot parameter 'early_page_ext' to resolve this problem. If we pass it to the kernel, page_ext_init() will be moved up and the feature 'deferred initialization of struct pages' will be disabled to initialize the page allocator early and prevent the panic problem above. It can help us to catch early page allocations. This is useful especially when we find that the free memory value is not the same right after different kernel booting. [akpm@linux-foundation.org: fix section issue by removing __meminitdata] Link: https://lkml.kernel.org/r/20220825102714.669-1-lizhe.67@bytedance.com Signed-off-by: Li Zhe <lizhe.67@bytedance.com> Suggested-by: Michal Hocko <mhocko@suse.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-12memory tiering: rate limit NUMA migration throughputHuang Ying1-0/+11
In NUMA balancing memory tiering mode, if there are hot pages in slow memory node and cold pages in fast memory node, we need to promote/demote hot/cold pages between the fast and cold memory nodes. A choice is to promote/demote as fast as possible. But the CPU cycles and memory bandwidth consumed by the high promoting/demoting throughput will hurt the latency of some workload because of accessing inflating and slow memory bandwidth contention. A way to resolve this issue is to restrict the max promoting/demoting throughput. It will take longer to finish the promoting/demoting. But the workload latency will be better. This is implemented in this patch as the page promotion rate limit mechanism. The number of the candidate pages to be promoted to the fast memory node via NUMA balancing is counted, if the count exceeds the limit specified by the users, the NUMA balancing promotion will be stopped until the next second. A new sysctl knob kernel.numa_balancing_promote_rate_limit_MBps is added for the users to specify the limit. Link: https://lkml.kernel.org/r/20220713083954.34196-3-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: osalvador <osalvador@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Wei Xu <weixugc@google.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Zhong Jiang <zhongjiang-ali@linux.alibaba.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-12mm/cma_debug: show complete cma name in debugfs directoriesCharan Teja Kalla1-5/+5
Currently only 12 characters of the cma name is being used as the debug directories where as the cma name can be of length CMA_MAX_NAME(=64) characters. One side problem with this is having 2 cma's with first common 12 characters would end up in trying to create directories with same name and fails with -EEXIST thus can limit cma debug functionality. The 'cma-' prefix is used initially where cma areas don't have any names and are represented by simple integer values. Since now each cma would be having its own name, drop 'cma-' prefix for the cma debug directories as they are clearly evident that they are for cma debug through creating them in /sys/kernel/debug/cma/ path. Link: https://lkml.kernel.org/r/1660223729-22461-1-git-send-email-quic_charante@quicinc.com Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com> Cc: David Hildenbrand <david@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Pavan Kondeti <quic_pkondeti@quicinc.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-12userfaultfd: update documentation to describe /dev/userfaultfdAxel Rasmussen2-3/+41
Explain the different ways to create a new userfaultfd, and how access control works for each way. [axelrasmussen@google.com: improve wording in documentation, per Mike] Link: https://lkml.kernel.org/r/20220819205201.658693-5-axelrasmussen@google.com Link: https://lkml.kernel.org/r/20220808175614.3885028-5-axelrasmussen@google.com Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry V. Levin <ldv@altlinux.org> Cc: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nadav Amit <namit@vmware.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Zhang Yi <yi.zhang@huawei.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-09arm64: spectre: increase parameters that can be used to turn off bhb ↵Liu Song1-0/+5
mitigation individually In our environment, it was found that the mitigation BHB has a great impact on the benchmark performance. For example, in the lmbench test, the "process fork && exit" test performance drops by 20%. So it is necessary to have the ability to turn off the mitigation individually through cmdline, thus avoiding having to compile the kernel by adjusting the config. Signed-off-by: Liu Song <liusong@linux.alibaba.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/1661514050-22263-1-git-send-email-liusong@linux.alibaba.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09sched/psi: Per-cgroup PSI accounting disable/re-enable interfaceChengming Zhou1-0/+17
PSI accounts stalls for each cgroup separately and aggregates it at each level of the hierarchy. This may cause non-negligible overhead for some workloads when under deep level of the hierarchy. commit 3958e2d0c34e ("cgroup: make per-cgroup pressure stall tracking configurable") make PSI to skip per-cgroup stall accounting, only account system-wide to avoid this each level overhead. But for our use case, we also want leaf cgroup PSI stats accounted for userspace adjustment on that cgroup, apart from only system-wide adjustment. So this patch introduce a per-cgroup PSI accounting disable/re-enable interface "cgroup.pressure", which is a read-write single value file that allowed values are "0" and "1", the defaults is "1" so per-cgroup PSI stats is enabled by default. Implementation details: It should be relatively straight-forward to disable and re-enable state aggregation, time tracking, averaging on a per-cgroup level, if we can live with losing history from while it was disabled. I.e. the avgs will restart from 0, total= will have gaps. But it's hard or complex to stop/restart groupc->tasks[] updates, which is not implemented in this patch. So we always update groupc->tasks[] and PSI_ONCPU bit in psi_group_change() even when the cgroup PSI stats is disabled. Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lkml.kernel.org/r/20220907090332.2078-1-zhouchengming@bytedance.com