summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2021-07-09Merge tag 'trace-v5.14-2' of ↵Linus Torvalds1-0/+18
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix and cleanup from Steven Rostedt: "Tracing fix for histograms and a clean up in ftrace: - Fixed a bug that broke the .sym-offset modifier and added a test to make sure nothing breaks it again. - Replace a list_del/list_add() with a list_move()" * tag 'trace-v5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Use list_move instead of list_del/list_add tracing/selftests: Add tests to test histogram sym and sym-offset modifiers tracing/histograms: Fix parsing of "sym-offset" modifier
2021-07-08selftest/mremap_test: avoid crash with static buildAneesh Kumar K.V1-2/+3
With a large mmap map size, we can overlap with the text area and using MAP_FIXED results in unmapping that area. Switch to MAP_FIXED_NOREPLACE and handle the EEXIST error. Link: https://lkml.kernel.org/r/20210616045239.370802-3-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Hugh Dickins <hughd@google.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-08selftest/mremap_test: update the test to handle pagesize other than 4KAneesh Kumar K.V1-52/+61
Patch series "mrermap fixes", v2. This patch (of 6): Instead of hardcoding 4K page size fetch it using sysconf(). For the performance measurements test still assume 2M and 1G are hugepage sizes. Link: https://lkml.kernel.org/r/20210616045239.370802-1-aneesh.kumar@linux.ibm.com Link: https://lkml.kernel.org/r/20210616045239.370802-2-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Hugh Dickins <hughd@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-08secretmem: test: add basic selftest for memfd_secret(2)Mike Rapoport4-1/+316
The test verifies that file descriptor created with memfd_secret does not allow read/write operations, that secret memory mappings respect RLIMIT_MEMLOCK and that remote accesses with process_vm_read() and ptrace() to the secret memory fail. Link: https://lkml.kernel.org/r/20210518072034.31572-8-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christopher Lameter <cl@linux.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Bottomley <jejb@linux.ibm.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Palmer Dabbelt <palmerdabbelt@google.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tycho Andersen <tycho@tycho.ws> Cc: Will Deacon <will@kernel.org> Cc: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-08tracing/selftests: Add tests to test histogram sym and sym-offset modifiersSteven Rostedt (VMware)1-0/+18
Add a test to the tracing selftests that will catch if the .sym or .sym-offset modifiers break in the future. Link: https://lkml.kernel.org/r/20210707121451.101a1002@oasis.local.home Acked-by: Tom Zanussi <zanussi@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-07-07Merge tag 'x86-fpu-2021-07-07' of ↵Linus Torvalds4-7/+260
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fpu updates from Thomas Gleixner: "Fixes and improvements for FPU handling on x86: - Prevent sigaltstack out of bounds writes. The kernel unconditionally writes the FPU state to the alternate stack without checking whether the stack is large enough to accomodate it. Check the alternate stack size before doing so and in case it's too small force a SIGSEGV instead of silently corrupting user space data. - MINSIGSTKZ and SIGSTKSZ are constants in signal.h and have never been updated despite the fact that the FPU state which is stored on the signal stack has grown over time which causes trouble in the field when AVX512 is available on a CPU. The kernel does not expose the minimum requirements for the alternate stack size depending on the available and enabled CPU features. ARM already added an aux vector AT_MINSIGSTKSZ for the same reason. Add it to x86 as well. - A major cleanup of the x86 FPU code. The recent discoveries of XSTATE related issues unearthed quite some inconsistencies, duplicated code and other issues. The fine granular overhaul addresses this, makes the code more robust and maintainable, which allows to integrate upcoming XSTATE related features in sane ways" * tag 'x86-fpu-2021-07-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits) x86/fpu/xstate: Clear xstate header in copy_xstate_to_uabi_buf() again x86/fpu/signal: Let xrstor handle the features to init x86/fpu/signal: Handle #PF in the direct restore path x86/fpu: Return proper error codes from user access functions x86/fpu/signal: Split out the direct restore code x86/fpu/signal: Sanitize copy_user_to_fpregs_zeroing() x86/fpu/signal: Sanitize the xstate check on sigframe x86/fpu/signal: Remove the legacy alignment check x86/fpu/signal: Move initial checks into fpu__restore_sig() x86/fpu: Mark init_fpstate __ro_after_init x86/pkru: Remove xstate fiddling from write_pkru() x86/fpu: Don't store PKRU in xstate in fpu_reset_fpstate() x86/fpu: Remove PKRU handling from switch_fpu_finish() x86/fpu: Mask PKRU from kernel XRSTOR[S] operations x86/fpu: Hook up PKRU into ptrace() x86/fpu: Add PKRU storage outside of task XSAVE buffer x86/fpu: Dont restore PKRU in fpregs_restore_userspace() x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi() x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs() x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs() ...
2021-07-05Merge tag 'char-misc-5.14-rc1' of ↵Linus Torvalds4-8/+23
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / misc driver updates from Greg KH: "Here is the big set of char / misc and other driver subsystem updates for 5.14-rc1. Included in here are: - habanalabs driver updates - fsl-mc driver updates - comedi driver updates - fpga driver updates - extcon driver updates - interconnect driver updates - mei driver updates - nvmem driver updates - phy driver updates - pnp driver updates - soundwire driver updates - lots of other tiny driver updates for char and misc drivers This is looking more and more like the "various driver subsystems mushed together" tree... All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (292 commits) mcb: Use DEFINE_RES_MEM() helper macro and fix the end address PNP: moved EXPORT_SYMBOL so that it immediately followed its function/variable bus: mhi: pci-generic: Add missing 'pci_disable_pcie_error_reporting()' calls bus: mhi: Wait for M2 state during system resume bus: mhi: core: Fix power down latency intel_th: Wait until port is in reset before programming it intel_th: msu: Make contiguous buffers uncached intel_th: Remove an unused exit point from intel_th_remove() stm class: Spelling fix nitro_enclaves: Set Bus Master for the NE PCI device misc: ibmasm: Modify matricies to matrices misc: vmw_vmci: return the correct errno code siox: Simplify error handling via dev_err_probe() fpga: machxo2-spi: Address warning about unused variable lkdtm/heap: Add init_on_alloc tests selftests/lkdtm: Enable various testable CONFIGs lkdtm: Add CONFIG hints in errors where possible lkdtm: Enable DOUBLE_FAULT on all architectures lkdtm/heap: Add vmalloc linear overflow test lkdtm/bugs: XFAIL UNALIGNED_LOAD_STORE_WRITE ...
2021-07-04Merge branch 'core-rcu-2021.07.04' of ↵Linus Torvalds16-54/+422
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU updates from Paul McKenney: - Bitmap parsing support for "all" as an alias for all bits - Documentation updates - Miscellaneous fixes, including some that overlap into mm and lockdep - kvfree_rcu() updates - mem_dump_obj() updates, with acks from one of the slab-allocator maintainers - RCU NOCB CPU updates, including limited deoffloading - SRCU updates - Tasks-RCU updates - Torture-test updates * 'core-rcu-2021.07.04' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (78 commits) tasks-rcu: Make show_rcu_tasks_gp_kthreads() be static inline rcu-tasks: Make ksoftirqd provide RCU Tasks quiescent states rcu: Add missing __releases() annotation rcu: Remove obsolete rcu_read_unlock() deadlock commentary rcu: Improve comments describing RCU read-side critical sections rcu: Create an unrcu_pointer() to remove __rcu from a pointer srcu: Early test SRCU polling start rcu: Fix various typos in comments rcu/nocb: Unify timers rcu/nocb: Prepare for fine-grained deferred wakeup rcu/nocb: Only cancel nocb timer if not polling rcu/nocb: Delete bypass_timer upon nocb_gp wakeup rcu/nocb: Cancel nocb_timer upon nocb_gp wakeup rcu/nocb: Allow de-offloading rdp leader rcu/nocb: Directly call __wake_nocb_gp() from bypass timer rcu: Don't penalize priority boosting when there is nothing to boost rcu: Point to documentation of ordering guarantees rcu: Make rcu_gp_cleanup() be noinline for tracing rcu: Restrict RCU_STRICT_GRACE_PERIOD to at most four CPUs rcu: Make show_rcu_gp_kthreads() dump rcu_node structures blocking GP ...
2021-07-04Merge branch 'lkmm.2021.05.10c' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull lkmm fixlet from Paul E McKenney. Fix missing underscore in Linux-kernel memory model docs. * 'lkmm.2021.05.10c' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: tools/memory-model: Fix smp_mb__after_spinlock() spelling
2021-07-03Merge tag 'trace-v5.14' of ↵Linus Torvalds17-30/+439
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - Added option for per CPU threads to the hwlat tracer - Have hwlat tracer handle hotplug CPUs - New tracer: osnoise, that detects latency caused by interrupts, softirqs and scheduling of other tasks. - Added timerlat tracer that creates a thread and measures in detail what sources of latency it has for wake ups. - Removed the "success" field of the sched_wakeup trace event. This has been hardcoded as "1" since 2015, no tooling should be looking at it now. If one exists, we can revert this commit, fix that tool and try to remove it again in the future. - tgid mapping fixed to handle more than PID_MAX_DEFAULT pids/tgids. - New boot command line option "tp_printk_stop", as tp_printk causes trace events to write to console. When user space starts, this can easily live lock the system. Having a boot option to stop just after boot up is useful to prevent that from happening. - Have ftrace_dump_on_oops boot command line option take numbers that match the numbers shown in /proc/sys/kernel/ftrace_dump_on_oops. - Bootconfig clean ups, fixes and enhancements. - New ktest script that tests bootconfig options. - Add tracepoint_probe_register_may_exist() to register a tracepoint without triggering a WARN*() if it already exists. BPF has a path from user space that can do this. All other paths are considered a bug. - Small clean ups and fixes * tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (49 commits) tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT tracing: Simplify & fix saved_tgids logic treewide: Add missing semicolons to __assign_str uses tracing: Change variable type as bool for clean-up trace/timerlat: Fix indentation on timerlat_main() trace/osnoise: Make 'noise' variable s64 in run_osnoise() tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing tracing: Fix spelling in osnoise tracer "interferences" -> "interference" Documentation: Fix a typo on trace/osnoise-tracer trace/osnoise: Fix return value on osnoise_init_hotplug_support trace/osnoise: Make interval u64 on osnoise_main trace/osnoise: Fix 'no previous prototype' warnings tracing: Have osnoise_main() add a quiescent state for task rcu seq_buf: Make trace_seq_putmem_hex() support data longer than 8 seq_buf: Fix overflow in seq_buf_putmem_hex() trace/osnoise: Support hotplug operations trace/hwlat: Support hotplug operations trace/hwlat: Protect kdata->kthread with get/put_online_cpus trace: Add timerlat tracer trace: Add osnoise tracer ...
2021-07-02Merge tag 'linux-kselftest-next-5.14-rc1' of ↵Linus Torvalds14-138/+308
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest update from Shuah Khan: "Fixes to existing tests and framework: - migrate sgx test to kselftest harness - add new test cases to sgx test - ftrace test fix event-no-pid on 1-core machine - splice test adjust for handler fallback removal" * tag 'linux-kselftest-next-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/sgx: remove checks for file execute permissions selftests/ftrace: fix event-no-pid on 1-core machine selftests/sgx: Refine the test enclave to have storage selftests/sgx: Add EXPECT_EEXIT() macro selftests/sgx: Dump enclave memory map selftests/sgx: Migrate to kselftest harness selftests/sgx: Rename 'eenter' and 'sgx_call_vdso' selftests: timers: rtcpie: skip test if default RTC device does not exist selftests: lib.mk: Also install "config" and "settings" selftests: splice: Adjust for handler fallback removal selftests/tls: Add {} to avoid static checker warning selftests/resctrl: Fix incorrect parsing of option "-t"
2021-07-02Merge tag 'linux-kselftest-kunit-fixes-5.14-rc1' of ↵Linus Torvalds18-125/+563
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull KUnit update from Shuah Khan: "Fixes and features: - add support for skipped tests - introduce kunit_kmalloc_array/kunit_kcalloc() helpers - add gnu_printf specifiers - add kunit_shutdown - add unit test for filtering suites by names - convert lib/test_list_sort.c to use KUnit - code organization moving default config to tools/testing/kunit - refactor of internal parser input handling - cleanups and updates to documentation - code cleanup related to casts" * tag 'linux-kselftest-kunit-fixes-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (29 commits) kunit: add unit test for filtering suites by names kasan: test: make use of kunit_skip() kunit: test: Add example tests which are always skipped kunit: tool: Support skipped tests in kunit_tool kunit: Support skipped tests thunderbolt: test: Reinstate a few casts of bitfields kunit: tool: internal refactor of parser input handling lib/test: convert lib/test_list_sort.c to use KUnit kunit: introduce kunit_kmalloc_array/kunit_kcalloc() helpers kunit: Remove the unused all_tests.config kunit: Move default config from arch/um -> tools/testing/kunit kunit: arch/um/configs: Enable KUNIT_ALL_TESTS by default kunit: Add gnu_printf specifiers lib/cmdline_kunit: Remove a cast which are no-longer required kernel/sysctl-test: Remove some casts which are no-longer required thunderbolt: test: Remove some casts which are no longer required mmc: sdhci-of-aspeed: Remove some unnecessary casts from KUnit tests iio: Remove a cast in iio-test-format which is no longer required device property: Remove some casts in property-entry-test Documentation: kunit: Clean up some string casts in examples ...
2021-07-02Merge tag 'powerpc-5.14-1' of ↵Linus Torvalds10-11/+160
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - A big series refactoring parts of our KVM code, and converting some to C. - Support for ARCH_HAS_SET_MEMORY, and ARCH_HAS_STRICT_MODULE_RWX on some CPUs. - Support for the Microwatt soft-core. - Optimisations to our interrupt return path on 64-bit. - Support for userspace access to the NX GZIP accelerator on PowerVM on Power10. - Enable KUAP and KUEP by default on 32-bit Book3S CPUs. - Other smaller features, fixes & cleanups. Thanks to: Andy Shevchenko, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Baokun Li, Benjamin Herrenschmidt, Bharata B Rao, Christophe Leroy, Daniel Axtens, Daniel Henrique Barboza, Finn Thain, Geoff Levand, Haren Myneni, Jason Wang, Jiapeng Chong, Joel Stanley, Jordan Niethe, Kajol Jain, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nick Desaulniers, Paul Mackerras, Russell Currey, Sathvika Vasireddy, Shaokun Zhang, Stephen Rothwell, Sudeep Holla, Suraj Jitindar Singh, Tom Rix, Vaibhav Jain, YueHaibing, Zhang Jianhua, and Zhen Lei. * tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (218 commits) powerpc: Only build restart_table.c for 64s powerpc/64s: move ret_from_fork etc above __end_soft_masked powerpc/64s/interrupt: clean up interrupt return labels powerpc/64/interrupt: add missing kprobe annotations on interrupt exit symbols powerpc/64: enable MSR[EE] in irq replay pt_regs powerpc/64s/interrupt: preserve regs->softe for NMI interrupts powerpc/64s: add a table of implicit soft-masked addresses powerpc/64e: remove implicit soft-masking and interrupt exit restart logic powerpc/64e: fix CONFIG_RELOCATABLE build warnings powerpc/64s: fix hash page fault interrupt handler powerpc/4xx: Fix setup_kuep() on SMP powerpc/32s: Fix setup_{kuap/kuep}() on SMP powerpc/interrupt: Use names in check_return_regs_valid() powerpc/interrupt: Also use exit_must_hard_disable() on PPC32 powerpc/sysfs: Replace sizeof(arr)/sizeof(arr[0]) with ARRAY_SIZE powerpc/ptrace: Refactor regs_set_return_{msr/ip} powerpc/ptrace: Move set_return_regs_changed() before regs_set_return_{msr/ip} powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi() powerpc/pseries/vas: Include irqdomain.h powerpc: mark local variables around longjmp as volatile ...
2021-07-02Merge tag 'perf-tools-for-v5.14-2021-07-01' of ↵Linus Torvalds120-2419/+14120
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tool updates from Arnaldo Carvalho de Melo: "Tools: - Add cgroup support for 'perf top' (-G). - Add support for KVM MSRs in 'perf kvm stat' - Support probes on init functions in 'perf probe', to support the bootconfig format. - Improve error reporting in 'perf probe'. - No need to synthesize BUILD_ID records in 'perf inject' if the MMAP2 records have build ids already. - Allow toggling source code ('s' hotkey) in 'perf annotate' in all lines. - Add itrace options support to 'perf annotate'. - Support to custom DSO filters for 'perf script'. Hardware enablement: - Support the HYBRID_TOPOLOGY and HYBRID_CPU_PMU_CAPS features in the perf.data file header. - Support PMU prefix for mem-load and mem-store events, to support hybrid (BIG little) CPUs such as Intel's Alderlake. - Support hybrid CPUs in 'perf mem' and 'perf c2c'. Hardware tracing: - Intel PT now supports tracing KVM guests. - Timestamp improvements for ARM's Coresight. Build: - Add 'make -C tools/perf build-test' entries for libopencsd/CORESIGHT=1 and libbpf/LIBBPF_DYNAMIC=1. - Use bison's --file-prefix-map option to avoid storing full paths when using O= in the perf build. Tests: - Improve the 'perf test' entries for libpfm4 and BPF counters. Misc: - Sync msr-index.h, mount.h, kvm headers with the kernel originals. - Add vendor events and metrics for Intel's Icelake Server & Client" * tag 'perf-tools-for-v5.14-2021-07-01' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (123 commits) perf session: Add missing evlist__delete when deleting a session perf annotate: Allow 's' on source code lines perf dlfilter: Add object_code() to perf_dlfilter_fns perf dlfilter: Add attr() to perf_dlfilter_fns perf dlfilter: Add srcline() to perf_dlfilter_fns perf dlfilter: Add insn() to perf_dlfilter_fns perf dlfilter: Add resolve_address() to perf_dlfilter_fns perf build: Install perf_dlfilter.h perf script: Add option to pass arguments to dlfilters perf script: Add option to list dlfilters perf script: Add dlfilter__filter_event_early() perf script: Add API for filtering via dynamically loaded shared object perf llvm: Return -ENOMEM when asprintf() fails perf cs-etm: Delay decode of non-timeless data until cs_etm__flush_events() tools headers UAPI: Synch KVM's svm.h header with the kernel tools kvm headers arm64: Update KVM headers from the kernel sources tools headers UAPI: Sync linux/kvm.h with the kernel sources tools headers cpufeatures: Sync with the kernel sources tools include UAPI: Update linux/mount.h copy tools arch x86: Sync the msr-index.h copy with the kernel sources ...
2021-07-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds9-543/+1125
Merge more updates from Andrew Morton: "190 patches. Subsystems affected by this patch series: mm (hugetlb, userfaultfd, vmscan, kconfig, proc, z3fold, zbud, ras, mempolicy, memblock, migration, thp, nommu, kconfig, madvise, memory-hotplug, zswap, zsmalloc, zram, cleanups, kfence, and hmm), procfs, sysctl, misc, core-kernel, lib, lz4, checkpatch, init, kprobes, nilfs2, hfs, signals, exec, kcov, selftests, compress/decompress, and ipc" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits) ipc/util.c: use binary search for max_idx ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock ipc: use kmalloc for msg_queue and shmid_kernel ipc sem: use kvmalloc for sem_undo allocation lib/decompressors: remove set but not used variabled 'level' selftests/vm/pkeys: exercise x86 XSAVE init state selftests/vm/pkeys: refill shadow register after implicit kernel write selftests/vm/pkeys: handle negative sys_pkey_alloc() return code selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random kcov: add __no_sanitize_coverage to fix noinstr for all architectures exec: remove checks in __register_bimfmt() x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned hfsplus: report create_date to kstat.btime hfsplus: remove unnecessary oom message nilfs2: remove redundant continue statement in a while-loop kprobes: remove duplicated strong free_insn_page in x86 and s390 init: print out unknown kernel parameters checkpatch: do not complain about positive return values starting with EPOLL checkpatch: improve the indented label test checkpatch: scripts/spdxcheck.py now requires python3 ...
2021-07-02Merge branch 'for-5.14' of ↵Linus Torvalds6-58/+354
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - cgroup.kill is added which implements atomic killing of the whole subtree. Down the line, this should be able to replace the multiple userland implementations of "keep killing till empty". - PSI can now be turned off at boot time to avoid overhead for configurations which don't care about PSI. * 'for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: make per-cgroup pressure stall tracking configurable cgroup: Fix kernel-doc cgroup: inline cgroup_task_freeze() tests/cgroup: test cgroup.kill tests/cgroup: move cg_wait_for(), cg_prepare_for_wait() tests/cgroup: use cgroup.kill in cg_killall() docs/cgroup: add entry for cgroup.kill cgroup: introduce cgroup.kill
2021-07-01perf session: Add missing evlist__delete when deleting a sessionRiccardo Mancini1-1/+4
ASan reports a memory leak caused by evlist not being deleted on exit in perf-report, perf-script and perf-data. The problem is caused by evlist->session not being deleted, which is allocated in perf_session__read_header, called in perf_session__new if perf_data is in read mode. In case of write mode, the session->evlist is filled by the caller. This patch solves the problem by calling evlist__delete in perf_session__delete if perf_data is in read mode. Changes in v2: - call evlist__delete from within perf_session__delete v1: https://lore.kernel.org/lkml/20210621234317.235545-1-rickyman7@gmail.com/ ASan report follows: $ ./perf script report flamegraph ================================================================= ==227640==ERROR: LeakSanitizer: detected memory leaks <SNIP unrelated> Indirect leak of 2704 byte(s) in 1 object(s) allocated from: #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137) #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9 #2 0x7f999e in evlist__new /home/user/linux/tools/perf/util/evlist.c:77:26 #3 0x8ad938 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3797:20 #4 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6 #5 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10 #6 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12 #7 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11 #8 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8 #9 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2 #10 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3 #11 0x7f5260654b74 (/lib64/libc.so.6+0x27b74) Indirect leak of 568 byte(s) in 1 object(s) allocated from: #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137) #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9 #2 0x80ce88 in evsel__new_idx /home/user/linux/tools/perf/util/evsel.c:268:24 #3 0x8aed93 in evsel__new /home/user/linux/tools/perf/util/evsel.h:210:9 #4 0x8ae07e in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3853:11 #5 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6 #6 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10 #7 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12 #8 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11 #9 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8 #10 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2 #11 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3 #12 0x7f5260654b74 (/lib64/libc.so.6+0x27b74) Indirect leak of 264 byte(s) in 1 object(s) allocated from: #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137) #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9 #2 0xbe3e70 in xyarray__new /home/user/linux/tools/lib/perf/xyarray.c:10:23 #3 0xbd7754 in perf_evsel__alloc_id /home/user/linux/tools/lib/perf/evsel.c:361:21 #4 0x8ae201 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3871:7 #5 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6 #6 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10 #7 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12 #8 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11 #9 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8 #10 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2 #11 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3 #12 0x7f5260654b74 (/lib64/libc.so.6+0x27b74) Indirect leak of 32 byte(s) in 1 object(s) allocated from: #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137) #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9 #2 0xbd77e0 in perf_evsel__alloc_id /home/user/linux/tools/lib/perf/evsel.c:365:14 #3 0x8ae201 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3871:7 #4 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6 #5 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10 #6 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12 #7 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11 #8 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8 #9 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2 #10 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3 #11 0x7f5260654b74 (/lib64/libc.so.6+0x27b74) Indirect leak of 7 byte(s) in 1 object(s) allocated from: #0 0x4b8207 in strdup (/home/user/linux/tools/perf/perf+0x4b8207) #1 0x8b4459 in evlist__set_event_name /home/user/linux/tools/perf/util/header.c:2292:16 #2 0x89d862 in process_event_desc /home/user/linux/tools/perf/util/header.c:2313:3 #3 0x8af319 in perf_file_section__process /home/user/linux/tools/perf/util/header.c:3651:9 #4 0x8aa6e9 in perf_header__process_sections /home/user/linux/tools/perf/util/header.c:3427:9 #5 0x8ae3e7 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3886:2 #6 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6 #7 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10 #8 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12 #9 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11 #10 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8 #11 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2 #12 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3 #13 0x7f5260654b74 (/lib64/libc.so.6+0x27b74) SUMMARY: AddressSanitizer: 3728 byte(s) leaked in 7 allocation(s). Signed-off-by: Riccardo Mancini <rickyman7@gmail.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210624231926.212208-1-rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf annotate: Allow 's' on source code linesRiccardo Mancini1-3/+29
In perf annotate, when 's' is pressed on a line containing source code, it shows the message "Only available for assembly lines". This patch gets rid of the error, moving the cursr to the next available asm line (or the closest previous one if no asm line is found moving forwards), before hiding source code lines. Changes in v2: - handle case of no asm line found in annotate_browser__find_next_asm_line by returning NULL and handling error in caller. Signed-off-by: Riccardo Mancini <rickyman7@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210624223423.189550-1-rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf dlfilter: Add object_code() to perf_dlfilter_fnsAdrian Hunter3-2/+41
Add a function, for use by dlfilters, to read object code. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210627131818.810-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf dlfilter: Add attr() to perf_dlfilter_fnsAdrian Hunter3-2/+18
Add a function, for use by dlfilters, to return the perf_event_attr structure. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210627131818.810-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf dlfilter: Add srcline() to perf_dlfilter_fnsAdrian Hunter3-2/+35
Add a function, for use by dlfilters, to return source code file name and line number. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210627131818.810-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf dlfilter: Add insn() to perf_dlfilter_fnsAdrian Hunter3-2/+39
Add a function, for use by dlfilters, to return instruction bytes. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210627131818.810-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf dlfilter: Add resolve_address() to perf_dlfilter_fnsAdrian Hunter3-2/+40
Add a function, for use by dlfilters, to resolve addresses from branch stacks or callchains. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210627131818.810-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf build: Install perf_dlfilter.hAdrian Hunter2-1/+6
Users of the --dlfilter option need to include perf_dlfilter.h in their filters. Install it to the include path. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210627131818.810-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf script: Add option to pass arguments to dlfiltersAdrian Hunter6-13/+91
Add option --dlarg to pass arguments to dlfilters. The --dlarg option can be repeated to pass more than 1 argument. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210627131818.810-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf script: Add option to list dlfiltersAdrian Hunter6-1/+134
Add option --list-dlfilters to list dlfilters in the current directory or the exec-path e.g. ~/libexec/perf-core/dlfilters. Use with option -v (must come before option --list-dlfilters) to show long descriptions. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210627131818.810-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf script: Add dlfilter__filter_event_early()Adrian Hunter5-16/+59
filter_event_early() can be more than 30% faster than filter_event() because it is called before internal filtering. In other respects it is the same as filter_event(), except that it will be passed events that have yet to be filtered out. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210627131818.810-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf script: Add API for filtering via dynamically loaded shared objectAdrian Hunter7-2/+780
In some cases, users want to filter very large amounts of data (e.g. from AUX area tracing like Intel PT) looking for something specific. While scripting such as Python can be used, Python is 10 to 20 times slower than C. So define a C API so that custom filters can be written and loaded. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210627131818.810-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf llvm: Return -ENOMEM when asprintf() failsArnaldo Carvalho de Melo1-0/+2
Zhihao sent a patch but it made llvm__compile_bpf() return what asprintf() returns on error, which is just -1, but since this function returns -errno, fix it by returning -ENOMEM for this case instead. Fixes: cb76371441d098 ("perf llvm: Allow passing options to llc ...") Fixes: 5eab5a7ee032ac ("perf llvm: Display eBPF compiling command ...") Reported-by: Hulk Robot <hulkci@huawei.com> Reported-by: Zhihao Cheng <chengzhihao1@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yu Kuai <yukuai3@huawei.com> Cc: clang-built-linux@googlegroups.com Link: http://lore.kernel.org/lkml/20210609115945.2193194-1-chengzhihao1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf cs-etm: Delay decode of non-timeless data until cs_etm__flush_events()James Clark1-1/+5
Currently, timeless mode starts the decode on PERF_RECORD_EXIT, and non-timeless mode starts decoding on the fist PERF_RECORD_AUX record. This can cause the "data has no samples!" error if the first PERF_RECORD_AUX record comes before the first (or any relevant) PERF_RECORD_MMAP2 record because the mmaps are required by the decoder to access the binary data. This change pushes the start of non-timeless decoding to the very end of parsing the file. The PERF_RECORD_EXIT event can't be used because it might not exist in system-wide or snapshot modes. I have not been able to find the exact cause for the events to be intermittently in the wrong order in the basic scenario: perf record -e cs_etm/@tmc_etr0/u top But it can be made to happen every time with the --delay option. This is because "enable_on_exec" is disabled, which causes tracing to start before the process to be launched is exec'd. For example: perf record -e cs_etm/@tmc_etr0/u --delay=1 top perf report -D | grep 'AUX\|MAP' 0 16714475632740 0x520 [0x40]: PERF_RECORD_AUX offset: 0 size: 0x30 flags: 0 [] 0 16714476494960 0x5d0 [0x40]: PERF_RECORD_AUX offset: 0x30 size: 0x30 flags: 0 [] 0 16714478208900 0x660 [0x40]: PERF_RECORD_AUX offset: 0x60 size: 0x30 flags: 0 [] 4294967295 16714478293340 0x700 [0x70]: PERF_RECORD_MMAP2 8712/8712: [0x557a460000(0x54000) @ 0 00:17 5329258 0]: r-xp /usr/bin/top 4294967295 16714478353020 0x770 [0x88]: PERF_RECORD_MMAP2 8712/8712: [0x7f86f72000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so Another scenario in which decoding from the first aux record fails is a workload that forks. Although the aux record comes after 'bash', it comes before 'top', which is what we are interested in. For example: perf record -e cs_etm/@tmc_etr0/u -- bash -c top perf report -D | grep 'AUX\|MAP' 4294967295 16853946421300 0x510 [0x70]: PERF_RECORD_MMAP2 8723/8723: [0x558f280000(0x142000) @ 0 00:17 5213953 0]: r-xp /usr/bin/bash 4294967295 16853946543560 0x580 [0x88]: PERF_RECORD_MMAP2 8723/8723: [0x7fbba6e000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so 4294967295 16853946628420 0x608 [0x68]: PERF_RECORD_MMAP2 8723/8723: [0x7fbba9e000(0x1000) @ 0 00:00 0 0]: r-xp [vdso] 0 16853947067300 0x690 [0x40]: PERF_RECORD_AUX offset: 0 size: 0x3a60 flags: 0 [] ... 0 16853966602580 0x1758 [0x40]: PERF_RECORD_AUX offset: 0xc2470 size: 0x30 flags: 0 [] 4294967295 16853967119860 0x1818 [0x70]: PERF_RECORD_MMAP2 8723/8723: [0x5559e70000(0x54000) @ 0 00:17 5329258 0]: r-xp /usr/bin/top 4294967295 16853967181620 0x1888 [0x88]: PERF_RECORD_MMAP2 8723/8723: [0x7f9ed06000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so 4294967295 16853967237180 0x1910 [0x68]: PERF_RECORD_MMAP2 8723/8723: [0x7f9ed36000(0x1000) @ 0 00:00 0 0]: r-xp [vdso] A third scenario is when the majority of time is spent in a shared library that is not loaded at startup. For example a dynamically loaded plugin. Testing ======= Testing was done by checking if any samples that are present in the old output are missing from the new output. Timestamps must be stripped out with awk because now they are set to the last AUX sample, rather than the first: ./perf script $4 | awk '!($4="")' > new.script ./perf-default script $4 | awk '!($4="")' > default.script comm -13 <(sort -u new.script) <(sort -u default.script) Testing showed that the new output is a superset of the old. When lines appear in the comm output, it is not because they are missing but because [unknown] is now resolved to sensible locations. For example last putp branch here now resolves to libtinfo, so it's not missing from the output, but is actually improved: Old: top 305 [001] 1 branches:uH: 402830 _init+0x30 (/usr/bin/top.procps) => 404a1c [unknown] (/usr/bin/top.procps) top 305 [001] 1 branches:uH: 404a20 [unknown] (/usr/bin/top.procps) => 402970 putp@plt+0x0 (/usr/bin/top.procps) top 305 [001] 1 branches:uH: 40297c putp@plt+0xc (/usr/bin/top.procps) => 0 [unknown] ([unknown]) New: top 305 [001] 1 branches:uH: 402830 _init+0x30 (/usr/bin/top.procps) => 404a1c [unknown] (/usr/bin/top.procps) top 305 [001] 1 branches:uH: 404a20 [unknown] (/usr/bin/top.procps) => 402970 putp@plt+0x0 (/usr/bin/top.procps) top 305 [001] 1 branches:uH: 40297c putp@plt+0xc (/usr/bin/top.procps) => 7f8ab39208 putp+0x0 (/lib/libtinfo.so.5.9) In the following two modes, decoding now works and the "data has no samples!" error is not displayed any more: perf record -e cs_etm/@tmc_etr0/u -- bash -c top perf record -e cs_etm/@tmc_etr0/u --delay=1 top In snapshot mode, there is also an improvement to decoding. Previously samples for the 'kill' process that was used to send SIGUSR2 were completely missing, because the process hadn't started yet. But now there are additional samples present: perf record -e cs_etm/@tmc_etr0/u --snapshot -a perf script stress 19380 [003] 161627.938153: 1000000 instructions:uH: aaaabb612fb4 [unknown] (/usr/bin/stress) kill 19644 [000] 161627.938153: 1000000 instructions:uH: ffffae0ef210 [unknown] (/lib/aarch64-linux-gnu/ld-2.27.so) stress 19380 [003] 161627.938153: 1000000 instructions:uH: ffff9e754d40 random_r+0x20 (/lib/aarch64-linux-gnu/libc-2.27.so) Also tested was the round trip of 'perf inject' followed by 'perf report' which has the same differences and improvements. Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Al Grant <al.grant@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Branislav Rankov <branislav.rankov@arm.com> Cc: Denis Nikitin <denik@chromium.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20210609130421.13934-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01tools headers UAPI: Synch KVM's svm.h header with the kernelArnaldo Carvalho de Melo1-0/+3
To pick up the changes from: 59d21d67f37481cf ("KVM: SVM: Software reserved fields") Picking the new SVM_EXIT_SW exit reasons. Addressing this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/svm.h' differs from latest version at 'arch/x86/include/uapi/asm/svm.h' diff -u tools/arch/x86/include/uapi/asm/svm.h arch/x86/include/uapi/asm/svm.h Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Vineeth Pillai <viremana@linux.microsoft.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01tools kvm headers arm64: Update KVM headers from the kernel sourcesArnaldo Carvalho de Melo1-0/+11
To pick the changes from: f0376edb1ddcab19 ("KVM: arm64: Add ioctl to fetch/store tags in a guest") That don't causes any changes in tooling (when built on x86), only addresses this perf build warning: Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h' diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h Cc: Marc Zyngier <maz@kernel.org> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01tools headers UAPI: Sync linux/kvm.h with the kernel sourcesArnaldo Carvalho de Melo2-0/+118
To pick the changes in: 19238e75bd8ed8ff ("kvm: x86: Allow userspace to handle emulation errors") cb082bfab59a224a ("KVM: stats: Add fd-based API to read binary stats data") b87cc116c7e1bc62 ("KVM: PPC: Book3S HV: Add KVM_CAP_PPC_RPT_INVALIDATE capability") f0376edb1ddcab19 ("KVM: arm64: Add ioctl to fetch/store tags in a guest") 0dbb11230437895f ("KVM: X86: Introduce KVM_HC_MAP_GPA_RANGE hypercall") 6dba940352038b56 ("KVM: x86: Introduce KVM_GET_SREGS2 / KVM_SET_SREGS2") 644f706719f0297b ("KVM: x86: hyper-v: Introduce KVM_CAP_HYPERV_ENFORCE_CPUID") That automatically adds support for these new ioctls: $ tools/perf/trace/beauty/kvm_ioctl.sh > before $ cp include/uapi/linux/kvm.h tools/include/uapi/linux/kvm.h $ tools/perf/trace/beauty/kvm_ioctl.sh > after $ diff -u before after --- before 2021-07-01 13:42:07.006387354 -0300 +++ after 2021-07-01 13:45:16.051649301 -0300 @@ -95,6 +95,9 @@ [0xc9] = "XEN_HVM_SET_ATTR", [0xca] = "XEN_VCPU_GET_ATTR", [0xcb] = "XEN_VCPU_SET_ATTR", + [0xcc] = "GET_SREGS2", + [0xcd] = "SET_SREGS2", + [0xce] = "GET_STATS_FD", [0xe0] = "CREATE_DEVICE", [0xe1] = "SET_DEVICE_ATTR", [0xe2] = "GET_DEVICE_ATTR", $ This silences these perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h' diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Cc: Aaron Lewis <aaronlewis@google.com> Cc: Ashish Kalra <ashish.kalra@amd.com> Cc: Bharata B Rao <bharata@linux.ibm.com> Cc: Jing Zhang <jingzhangos@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Maxim Levitsky <mlevitsk@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Steven Price <steven.price@arm.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01tools headers cpufeatures: Sync with the kernel sourcesArnaldo Carvalho de Melo1-1/+2
To pick the changes from: 1348924ba8169f35 ("x86/msr: Define new bits in TSX_FORCE_ABORT MSR") cbcddaa33d7e11a0 ("perf/x86/rapl: Use CPUID bit on AMD and Hygon parts") This only causes these perf files to be rebuilt: CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o And addresses this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h' diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Borislav Petkov <bp@suse.de> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01tools include UAPI: Update linux/mount.h copyArnaldo Carvalho de Melo1-0/+1
To pick the changes from: dd8b477f9a3d8edb ("mount: Support "nosymfollow" in new mount api") That ends up adding support for the new MOUNT_ATTR_NOSYMFOLLOW mount attribute: $ tools/perf/trace/beauty/fsmount.sh > before $ cp include/uapi/linux/mount.h tools/include/uapi/linux/mount.h $ tools/perf/trace/beauty/fsmount.sh > after $ diff -u before after --- before 2021-07-01 13:34:04.542517355 -0300 +++ after 2021-07-01 13:34:12.423694537 -0300 @@ -7,4 +7,5 @@ [ilog2(0x00000020) + 1] = "STRICTATIME", [ilog2(0x00000080) + 1] = "NODIRATIME", [ilog2(0x00100000) + 1] = "IDMAP", + [ilog2(0x00200000) + 1] = "NOSYMFOLLOW", }; $ So now one can use it in --filter expressions for tracepoints. This silences this perf build warnings: Warning: Kernel ABI header at 'tools/include/uapi/linux/mount.h' differs from latest version at 'include/uapi/linux/mount.h' diff -u tools/include/uapi/linux/mount.h include/uapi/linux/mount.h Cc: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo1-0/+4
To pick up the changes from these csets: 1348924ba8169f35 ("x86/msr: Define new bits in TSX_FORCE_ABORT MSR") That cause no changes to tooling: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after $ diff -u before after $ Just silences this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h' diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Cc: Borislav Petkov <bp@suse.de> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf arm-spe: Don't wait for PERF_RECORD_EXIT eventLeo Yan1-5/+1
When decode Arm SPE trace, it waits for PERF_RECORD_EXIT event (the last perf event) for processing trace data, which is needless and even might cause logic error, e.g. it might fail to correlate perf events with Arm SPE events correctly. So this patch removes the condition checking for PERF_RECORD_EXIT event. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: James Clark <james.clark@arm.com> Tested-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Grant <Al.Grant@arm.com> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210519071939.1598923-6-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf arm-spe: Bail out if the trace is later than perf eventLeo Yan1-3/+34
It's possible that record in Arm SPE trace is later than perf event and vice versa. This asks to correlate the perf events and Arm SPE synthesized events to be processed in the manner of correct timing. To achieve the time ordering, this patch reverses the flow, it firstly calls arm_spe_sample() and then calls arm_spe_decode(). By comparing the timestamp value and detect the perf event is coming earlier than Arm SPE trace data, it bails out from the decoding loop, the last record is pushed into auxtrace stack and is deferred to generate sample. To track the timestamp, everytime it updates timestamp for the latest record. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: James Clark <james.clark@arm.com> Tested-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Grant <Al.Grant@arm.com> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20210519071939.1598923-5-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf arm-spe: Assign kernel time to synthesized eventLeo Yan1-1/+1
In current code, it assigns the arch timer counter to the synthesized samples Arm SPE trace, thus the samples don't contain the kernel time but only contain the raw counter value. To fix the issue, this patch converts the timer counter to kernel time and assigns it to sample timestamp. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: James Clark <james.clark@arm.com> Tested-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Grant <Al.Grant@arm.com> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20210519071939.1598923-4-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf arm-spe: Convert event kernel time to counter valueLeo Yan1-1/+1
When handle a perf event, Arm SPE decoder needs to decide if this perf event is earlier or later than the samples from Arm SPE trace data; to do comparision, it needs to use the same unit for the time. This patch converts the event kernel time to arch timer's counter value, thus it can be used to compare with counter value contained in Arm SPE Timestamp packet. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: James Clark <james.clark@arm.com> Tested-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Grant <Al.Grant@arm.com> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20210519071939.1598923-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf arm-spe: Save clock parameters from TIME_CONV eventLeo Yan1-0/+26
During the recording phase, "perf record" tool synthesizes event PERF_RECORD_TIME_CONV for the hardware clock parameters and saves the event into the data file. Afterwards, when processing the data file, the event TIME_CONV will be processed at the very early time and is stored into session context. This patch extracts these parameters from the session context and saves into the structure "spe->tc" with the type perf_tsc_conversion, so that the parameters are ready for conversion between clock counter and time stamp. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: James Clark <james.clark@arm.com> Tested-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Grant <Al.Grant@arm.com> Cc: Dave Martin <Dave.Martin@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20210519071939.1598923-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf cs-etm: Remove callback cs_etm_find_snapshot()Leo Yan1-133/+0
The callback cs_etm_find_snapshot() is invoked for snapshot mode, its main purpose is to find the correct AUX trace data and returns "head" and "old" (we can call "old" as "old head") to the caller, the caller __auxtrace_mmap__read() uses these two pointers to decide the AUX trace data size. This patch removes cs_etm_find_snapshot() with below reasons: - The first thing in cs_etm_find_snapshot() is to check if the head has wrapped around, if it is not, directly bails out. The checking is pointless, this is because the "head" and "old" pointers both are monotonical increasing so they never wrap around. - cs_etm_find_snapshot() adjusts the "head" and "old" pointers and assumes the AUX ring buffer is fully filled with the hardware trace data, so it always subtracts the difference "mm->len" from "head" to get "old". Let's imagine the snapshot is taken in very short interval, the tracers only fill a small chunk of the trace data into the AUX ring buffer, in this case, it's wrongly to copy the whole the AUX ring buffer to perf file. - As the "head" and "old" pointers are monotonically increased, the function __auxtrace_mmap__read() handles these two pointers properly. It calculates the reminders for these two pointers, and the size is clamped to be never more than "snapshot_size". We can simply reply on the function __auxtrace_mmap__read() to calculate the correct result for data copying, it's not necessary to add Arm CoreSight specific callback. Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: James Clark <james.clark@arm.com> Tested-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Daniel Kiss <daniel.kiss@arm.com> Cc: Denis Nikitin <denik@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Link: http://lore.kernel.org/lkml/20210701093537.90759-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf bpf_counter: Move common functions to bpf_counter.hNamhyung Kim2-52/+52
Some helper functions will be used for cgroup counting too. Move them to a header file for sharing. Committer notes: Fix the build on older systems with: - struct bpf_map_info map_info = {0}; + struct bpf_map_info map_info = { .id = 0, }; This wasn't breaking the build in such systems as bpf_counter.c isn't built due to: tools/perf/util/Build: perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o The bpf_counter.h file on the other hand is included from places that are built everywhere. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20210625071826.608504-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01selftests/vm/pkeys: exercise x86 XSAVE init stateDave Hansen3-2/+76
On x86, there is a set of instructions used to save and restore register state collectively known as the XSAVE architecture. There are about a dozen different features managed with XSAVE. The protection keys register, PKRU, is one of those features. The hardware optimizes XSAVE by tracking when the state has not changed from its initial (init) state. In this case, it can avoid the cost of writing state to memory (it would usually just be a bunch of 0's). When the pkey register is 0x0 the hardware optionally choose to track the register as being in the init state (optimize away the writes). AMD CPUs do this more aggressively compared to Intel. On x86, PKRU is rarely in its (very permissive) init state. Instead, the value defaults to something very restrictive. It is not surprising that bugs have popped up in the rare cases when PKRU reaches its init state. Add a protection key selftest which gets the protection keys register into its init state in a way that should work on Intel and AMD. Then, do a bunch of pkey register reads to watch for inadvertent changes. This adds "-mxsave" to CFLAGS for all the x86 vm selftests in order to allow use of the XSAVE instruction __builtin functions. This will make the builtins available on all of the vm selftests, but is expected to be harmless. Link: https://lkml.kernel.org/r/20210611164202.1849B712@viggo.jf.intel.com Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Ram Pai <linuxram@us.ibm.com> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Suchanek <msuchanek@suse.de> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01selftests/vm/pkeys: refill shadow register after implicit kernel writeDave Hansen1-0/+7
The pkey test code keeps a "shadow" of the pkey register around. This ensures that any bugs which might write to the register can be caught more quickly. Generally, userspace has a good idea when the kernel is going to write to the register. For instance, alloc_pkey() is passed a permission mask. The caller of alloc_pkey() can update the shadow based on the return value and the mask. But, the kernel can also modify the pkey register in a more sneaky way. For mprotect(PROT_EXEC) mappings, the kernel will allocate a pkey and write the pkey register to create an execute-only mapping. The kernel never tells userspace what key it uses for this. This can cause the test to fail with messages like: protection_keys_64.2: pkey-helpers.h:132: _read_pkey_reg: Assertion `pkey_reg == shadow_pkey_reg' failed. because the shadow was not updated with the new kernel-set value. Forcibly update the shadow value immediately after an mprotect(). Link: https://lkml.kernel.org/r/20210611164200.EF76AB73@viggo.jf.intel.com Fixes: 6af17cf89e99 ("x86/pkeys/selftests: Add PROT_EXEC test") Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Ram Pai <linuxram@us.ibm.com> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Suchanek <msuchanek@suse.de> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01selftests/vm/pkeys: handle negative sys_pkey_alloc() return codeDave Hansen1-1/+1
The alloc_pkey() sefltest function wraps the sys_pkey_alloc() system call. On success, it updates its "shadow" register value because sys_pkey_alloc() updates the real register. But, the success check is wrong. pkey_alloc() considers any non-zero return code to indicate success where the pkey register will be modified. This fails to take negative return codes into account. Consider only a positive return value as a successful call. Link: https://lkml.kernel.org/r/20210611164157.87AB4246@viggo.jf.intel.com Fixes: 5f23f6d082a9 ("x86/pkeys: Add self-tests") Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Ram Pai <linuxram@us.ibm.com> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Suchanek <msuchanek@suse.de> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really randomDave Hansen1-1/+2
Patch series "selftests/vm/pkeys: Bug fixes and a new test". There has been a lot of activity on the x86 front around the XSAVE architecture which is used to context-switch processor state (among other things). In addition, AMD has recently joined the protection keys club by adding processor support for PKU. The AMD implementation helped uncover a kernel bug around the PKRU "init state", which actually applied to Intel's implementation but was just harder to hit. This series adds a test which is expected to help find this class of bug both on AMD and Intel. All the work around pkeys on x86 also uncovered a few bugs in the selftest. This patch (of 4): The "random" pkey allocation code currently does the good old: srand((unsigned int)time(NULL)); *But*, it unfortunately does this on every random pkey allocation. There may be thousands of these a second. time() has a one second resolution. So, each time alloc_random_pkey() is called, the PRNG is *RESET* to time(). This is nasty. Normally, if you do: srand(<ANYTHING>); foo = rand(); bar = rand(); You'll be quite guaranteed that 'foo' and 'bar' are different. But, if you do: srand(1); foo = rand(); srand(1); bar = rand(); You are quite guaranteed that 'foo' and 'bar' are the *SAME*. The recent "fix" effectively forced the test case to use the same "random" pkey for the whole test, unless the test run crossed a second boundary. Only run srand() once at program startup. This explains some very odd and persistent test failures I've been seeing. Link: https://lkml.kernel.org/r/20210611164153.91B76FB8@viggo.jf.intel.com Link: https://lkml.kernel.org/r/20210611164155.192D00FF@viggo.jf.intel.com Fixes: 6e373263ce07 ("selftests/vm/pkeys: fix alloc_random_pkey() to make it really random") Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Ram Pai <linuxram@us.ibm.com> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Suchanek <msuchanek@suse.de> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01mm: selftests for exclusive device memoryAlistair Popple1-0/+158
Adds some selftests for exclusive device memory. Link: https://lkml.kernel.org/r/20210616105937.23201-9-apopple@nvidia.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hugh Dickins <hughd@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Peter Xu <peterx@redhat.com> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01perf tools: Add cgroup_is_v2() helperNamhyung Kim2-0/+21
The cgroup_is_v2() is to check if the given subsystem is mounted on cgroup v2 or not. It'll be used by BPF cgroup code later. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20210625071826.608504-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01perf tools: Add read_cgroup_id() functionNamhyung Kim2-0/+35
The read_cgroup_id() is to read a cgroup id from a file handle using name_to_handle_at(2) for the given cgroup. It'll be used by bperf cgroup stat later. Committer notes: -int read_cgroup_id(struct cgroup *cgrp) +static inline int read_cgroup_id(struct cgroup *cgrp __maybe_unused) To fix the build when HAVE_FILE_HANDLE is not defined. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20210625071826.608504-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>