summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2023-11-10bpf: handle ldimm64 properly in check_cfg()Andrii Nakryiko1-4/+4
ldimm64 instructions are 16-byte long, and so have to be handled appropriately in check_cfg(), just like the rest of BPF verifier does. This has implications in three places: - when determining next instruction for non-jump instructions; - when determining next instruction for callback address ldimm64 instructions (in visit_func_call_insn()); - when checking for unreachable instructions, where second half of ldimm64 is expected to be unreachable; We take this also as an opportunity to report jump into the middle of ldimm64. And adjust few test_verifier tests accordingly. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reported-by: Hao Sun <sunhao.th@gmail.com> Fixes: 475fb78fbf48 ("bpf: verifier (add branch/goto checks)") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231110002638.4168352-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-10selftests: bpf: xskxceiver: ksft_print_msg: fix format type errorAnders Roxell1-7/+12
Crossbuilding selftests/bpf for architecture arm64, format specifies type error show up like. xskxceiver.c:912:34: error: format specifies type 'int' but the argument has type '__u64' (aka 'unsigned long long') [-Werror,-Wformat] ksft_print_msg("[%s] expected meta_count [%d], got meta_count [%d]\n", ~~ %llu __func__, pkt->pkt_nb, meta->count); ^~~~~~~~~~~ xskxceiver.c:929:55: error: format specifies type 'unsigned long long' but the argument has type 'u64' (aka 'unsigned long') [-Werror,-Wformat] ksft_print_msg("Frag invalid addr: %llx len: %u\n", addr, len); ~~~~ ^~~~ Fixing the issues by casting to (unsigned long long) and changing the specifiers to be %llu from %d and %u, since with u64s it might be %llx or %lx, depending on architecture. Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Link: https://lore.kernel.org/r/20231109174328.1774571-1-anders.roxell@linaro.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-10Merge tag 'net-6.7-rc1' of ↵Linus Torvalds19-50/+524
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter and bpf. Current release - regressions: - sched: fix SKB_NOT_DROPPED_YET splat under debug config Current release - new code bugs: - tcp: - fix usec timestamps with TCP fastopen - fix possible out-of-bounds reads in tcp_hash_fail() - fix SYN option room calculation for TCP-AO - tcp_sigpool: fix some off by one bugs - bpf: fix compilation error without CGROUPS - ptp: - ptp_read() should not release queue - fix tsevqs corruption Previous releases - regressions: - llc: verify mac len before reading mac header Previous releases - always broken: - bpf: - fix check_stack_write_fixed_off() to correctly spill imm - fix precision tracking for BPF_ALU | BPF_TO_BE | BPF_END - check map->usercnt after timer->timer is assigned - dsa: lan9303: consequently nested-lock physical MDIO - dccp/tcp: call security_inet_conn_request() after setting IP addr - tg3: fix the TX ring stall due to incorrect full ring handling - phylink: initialize carrier state at creation - ice: fix direction of VF rules in switchdev mode Misc: - fill in a bunch of missing MODULE_DESCRIPTION()s, more to come" * tag 'net-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits) net: ti: icss-iep: fix setting counter value ptp: fix corrupted list in ptp_open ptp: ptp_read should not release queue net_sched: sch_fq: better validate TCA_FQ_WEIGHTS and TCA_FQ_PRIOMAP net: kcm: fill in MODULE_DESCRIPTION() net/sched: act_ct: Always fill offloading tuple iifidx netfilter: nat: fix ipv6 nat redirect with mapped and scoped addresses netfilter: xt_recent: fix (increase) ipv6 literal buffer length ipvs: add missing module descriptions netfilter: nf_tables: remove catchall element in GC sync path netfilter: add missing module descriptions drivers/net/ppp: use standard array-copy-function net: enetc: shorten enetc_setup_xdp_prog() error message to fit NETLINK_MAX_FMTMSG_LEN virtio/vsock: Fix uninit-value in virtio_transport_recv_pkt() r8169: respect userspace disabling IFF_MULTICAST selftests/bpf: get trusted cgrp from bpf_iter__cgroup directly bpf: Let verifier consider {task,cgroup} is trusted in bpf_iter_reg net: phylink: initialize carrier state at creation test/vsock: add dobule bind connect test test/vsock: refactor vsock_accept ...
2023-11-09Merge tag 'for-netdev' of ↵Jakub Kicinski11-25/+234
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2023-11-08 We've added 16 non-merge commits during the last 6 day(s) which contain a total of 30 files changed, 341 insertions(+), 130 deletions(-). The main changes are: 1) Fix a BPF verifier issue in precision tracking for BPF_ALU | BPF_TO_BE | BPF_END where the source register was incorrectly marked as precise, from Shung-Hsi Yu. 2) Fix a concurrency issue in bpf_timer where the former could still have been alive after an application releases or unpins the map, from Hou Tao. 3) Fix a BPF verifier issue where immediates are incorrectly cast to u32 before being spilled and therefore losing sign information, from Hao Sun. 4) Fix a misplaced BPF_TRACE_ITER in check_css_task_iter_allowlist which incorrectly compared bpf_prog_type with bpf_attach_type, from Chuyi Zhou. 5) Add __bpf_hook_{start,end} as well as __bpf_kfunc_{start,end}_defs macros, migrate all BPF-related __diag callsites over to it, and add a new __diag_ignore_all for -Wmissing-declarations to the macros to address recent build warnings, from Dave Marchevsky. 6) Fix broken BPF selftest build of xdp_hw_metadata test on architectures where char is not signed, from Björn Töpel. 7) Fix test_maps selftest to properly use LIBBPF_OPTS() macro to initialize the bpf_map_create_opts, from Andrii Nakryiko. 8) Fix bpffs selftest to avoid unmounting /sys/kernel/debug as it may have been mounted and used by other applications already, from Manu Bretelle. 9) Fix a build issue without CONFIG_CGROUPS wrt css_task open-coded iterators, from Matthieu Baerts. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: get trusted cgrp from bpf_iter__cgroup directly bpf: Let verifier consider {task,cgroup} is trusted in bpf_iter_reg selftests/bpf: Fix broken build where char is unsigned selftests/bpf: precision tracking test for BPF_NEG and BPF_END bpf: Fix precision tracking for BPF_ALU | BPF_TO_BE | BPF_END selftests/bpf: Add test for using css_task iter in sleepable progs selftests/bpf: Add tests for css_task iter combining with cgroup iter bpf: Relax allowlist for css_task iter selftests/bpf: fix test_maps' use of bpf_map_create_opts bpf: Check map->usercnt after timer->timer is assigned bpf: Add __bpf_hook_{start,end} macros bpf: Add __bpf_kfunc_{start,end}_defs macros selftests/bpf: fix test_bpffs selftests/bpf: Add test for immediate spilled to stack bpf: Fix check_stack_write_fixed_off() to correctly spill imm bpf: fix compilation error without CGROUPS ==================== Link: https://lore.kernel.org/r/20231108132448.1970-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-08Merge tag 'riscv-for-linus-6.7-rc1' of ↵Linus Torvalds4-46/+270
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for cbo.zero in userspace - Support for CBOs on ACPI-based systems - A handful of improvements for the T-Head cache flushing ops - Support for software shadow call stacks - Various cleanups and fixes * tag 'riscv-for-linus-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (31 commits) RISC-V: hwprobe: Fix vDSO SIGSEGV riscv: configs: defconfig: Enable configs required for RZ/Five SoC riscv: errata: prefix T-Head mnemonics with th. riscv: put interrupt entries into .irqentry.text riscv: mm: Update the comment of CONFIG_PAGE_OFFSET riscv: Using TOOLCHAIN_HAS_ZIHINTPAUSE marco replace zihintpause riscv/mm: Fix the comment for swap pte format RISC-V: clarify the QEMU workaround in ISA parser riscv: correct pt_level name via pgtable_l5/4_enabled RISC-V: Provide pgtable_l5_enabled on rv32 clocksource: timer-riscv: Increase rating of clock_event_device for Sstc clocksource: timer-riscv: Don't enable/disable timer interrupt lkdtm: Fix CFI_BACKWARD on RISC-V riscv: Use separate IRQ shadow call stacks riscv: Implement Shadow Call Stack riscv: Move global pointer loading to a macro riscv: Deduplicate IRQ stack switching riscv: VMAP_STACK overflow detection thread-safe RISC-V: cacheflush: Initialize CBO variables on ACPI systems RISC-V: ACPI: RHCT: Add function to get CBO block sizes ...
2023-11-08Merge tag 'pm-6.7-rc1-2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These add new hardware support to a cpufreq driver and fix cpupower utility documentation: - Add support for several Qualcomm SoC versions to the Qualcomm cpufreq driver (Robert Marko, Varadarajan Narayanan) - Fix a reference to a removed document in the cpupower utility documentation (Vegard Nossum)" * tag 'pm-6.7-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: qcom-nvmem: Introduce cpufreq for ipq95xx cpufreq: qcom-nvmem: Enable cpufreq for ipq53xx cpufreq: qcom-nvmem: add support for IPQ8074 cpupower: fix reference to nonexistent document
2023-11-08selftests/bpf: get trusted cgrp from bpf_iter__cgroup directlyChuyi Zhou1-12/+4
Commit f49843afde (selftests/bpf: Add tests for css_task iter combining with cgroup iter) added a test which demonstrates how css_task iter can be combined with cgroup iter. That test used bpf_cgroup_from_id() to convert bpf_iter__cgroup->cgroup to a trusted ptr which is pointless now, since with the previous fix, we can get a trusted cgroup directly from bpf_iter__cgroup. Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20231107132204.912120-3-zhouchuyi@bytedance.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-11-08test/vsock: add dobule bind connect testFilippo Storniolo3-0/+100
This add bind connect test which creates a listening server socket and tries to connect a client with a bound local port to it twice. Co-developed-by: Luigi Leonardi <luigi.leonardi@outlook.com> Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com> Signed-off-by: Filippo Storniolo <f.storniolo95@gmail.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-08test/vsock: refactor vsock_acceptFilippo Storniolo1-12/+20
This is a preliminary patch to introduce SOCK_STREAM bind connect test. vsock_accept() is split into vsock_listen() and vsock_accept(). Co-developed-by: Luigi Leonardi <luigi.leonardi@outlook.com> Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com> Signed-off-by: Filippo Storniolo <f.storniolo95@gmail.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-08test/vsock fix: add missing check on socket creationFilippo Storniolo1-0/+8
Add check on socket() return value in vsock_listen() and vsock_connect() Co-developed-by: Luigi Leonardi <luigi.leonardi@outlook.com> Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com> Signed-off-by: Filippo Storniolo <f.storniolo95@gmail.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-07Merge branch 'pm-tools'Rafael J. Wysocki1-1/+1
Merge cpupower utility update for 6.7-rc1: - Fix a reference to a removed document in the cpupower utility documentation (Vegard Nossum). * pm-tools: cpupower: fix reference to nonexistent document
2023-11-06nfsd: regenerate user space parsers after ynl-gen changesJakub Kicinski2-11/+153
Commit 8cea95b0bd79 ("tools: ynl-gen: handle do ops with no input attrs") added support for some of the previously-skipped ops in nfsd. Regenerate the user space parsers to fill them in. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-05Merge tag 'cxl-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxlLinus Torvalds2-10/+75
Pull CXL (Compute Express Link) updates from Dan Williams: "The main new functionality this time is work to allow Linux to natively handle CXL link protocol errors signalled via PCIe AER for current generation CXL platforms. This required some enlightenment of the PCIe AER core to workaround the fact that current generation RCH (Restricted CXL Host) platforms physically hide topology details and registers via a mechanism called RCRB (Root Complex Register Block). The next major highlight is reworks to address bugs in parsing region configurations for next generation VH (Virtual Host) topologies. The old broken algorithm is replaced with a simpler one that significantly increases the number of region configurations supported by Linux. This is again relevant for error handling so that forward and reverse address translation of memory errors can be carried out by Linux for memory regions instantiated by platform firmware. As for other cross-tree work, the ACPI table parsing code has been refactored for reuse parsing the "CDAT" structure which is an ACPI-like data structure that is reported by CXL devices. That work is in preparation for v6.8 support for CXL QoS. Think of this as dynamic generation of NUMA node topology information generated by Linux rather than platform firmware. Lastly, a number of internal object lifetime issues have been resolved along with misc. fixes and feature updates (decoders_committed sysfs ABI). Summary: - Add support for RCH (Restricted CXL Host) Error recovery - Fix several region assembly bugs - Fix mem-device lifetime issues relative to the sanitize command and RCH topology. - Refactor ACPI table parsing for CDAT parsing re-use in preparation for CXL QOS support" * tag 'cxl-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (50 commits) lib/fw_table: Remove acpi_parse_entries_array() export cxl/pci: Change CXL AER support check to use native AER cxl/hdm: Remove broken error path cxl/hdm: Fix && vs || bug acpi: Move common tables helper functions to common lib cxl: Add support for reading CXL switch CDAT table cxl: Add checksum verification to CDAT from CXL cxl: Export QTG ids from CFMWS to sysfs as qos_class attribute cxl: Add decoders_committed sysfs attribute to cxl_port cxl: Add cxl_decoders_committed() helper cxl/core/regs: Rework cxl_map_pmu_regs() to use map->dev for devm cxl/core/regs: Rename phys_addr in cxl_map_component_regs() PCI/AER: Unmask RCEC internal errors to enable RCH downstream port error handling PCI/AER: Forward RCH downstream port-detected errors to the CXL.mem dev handler cxl/pci: Disable root port interrupts in RCH mode cxl/pci: Add RCH downstream port error logging cxl/pci: Map RCH downstream AER registers for logging protocol errors cxl/pci: Update CXL error logging to use RAS register address PCI/AER: Refactor cper_print_aer() for use by CXL driver module cxl/pci: Add RCH downstream port AER register discovery ...
2023-11-04Merge tag 'char-misc-6.7-rc1' of ↵Linus Torvalds2-1/+20
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc updates from Greg KH: "Here is the big set of char/misc and other small driver subsystem changes for 6.7-rc1. Included in here are: - IIO subsystem driver updates and additions (largest part of this pull request) - FPGA subsystem driver updates - Counter subsystem driver updates - ICC subsystem driver updates - extcon subsystem driver updates - mei driver updates and additions - nvmem subsystem driver updates and additions - comedi subsystem dependency fixes - parport driver fixups - cdx subsystem driver and core updates - splice support for /dev/zero and /dev/full - other smaller driver cleanups All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (326 commits) cdx: add sysfs for subsystem, class and revision cdx: add sysfs for bus reset cdx: add support for bus enable and disable cdx: Register cdx bus as a device on cdx subsystem cdx: Create symbol namespaces for cdx subsystem cdx: Introduce lock to protect controller ops cdx: Remove cdx controller list from cdx bus system dts: ti: k3-am625-beagleplay: Add beaglecc1352 greybus: Add BeaglePlay Linux Driver dt-bindings: net: Add ti,cc1352p7 dt-bindings: eeprom: at24: allow NVMEM cells based on old syntax dt-bindings: nvmem: SID: allow NVMEM cells based on old syntax Revert "nvmem: add new config option" MAINTAINERS: coresight: Add missing Coresight files misc: pci_endpoint_test: Add deviceID for J721S2 PCIe EP device support firmware: xilinx: Move EXPORT_SYMBOL_GPL next to zynqmp_pm_feature definition uacce: make uacce_class constant ocxl: make ocxl_class constant cxl: make cxl_class constant misc: phantom: make phantom_class constant ...
2023-11-03Merge tag 'landlock-6.7-rc1' of ↵Linus Torvalds5-11/+1815
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock updates from Mickaël Salaün: "A Landlock ruleset can now handle two new access rights: LANDLOCK_ACCESS_NET_BIND_TCP and LANDLOCK_ACCESS_NET_CONNECT_TCP. When handled, the related actions are denied unless explicitly allowed by a Landlock network rule for a specific port. The related patch series has been reviewed for almost two years, it has evolved a lot and we now have reached a decent design, code and testing. The refactored kernel code and the new test helpers also bring the foundation to support more network protocols. Test coverage for security/landlock is 92.4% of 710 lines according to gcc/gcov-13, and it was 93.1% of 597 lines before this series. The decrease in coverage is due to code refactoring to make the ruleset management more generic (i.e. dealing with inodes and ports) that also added new WARN_ON_ONCE() checks not possible to test from user space. syzkaller has been updated accordingly [4], and such patched instance (tailored to Landlock) has been running for a month, covering all the new network-related code [5]" Link: https://lore.kernel.org/r/20231026014751.414649-1-konstantin.meskhidze@huawei.com [1] Link: https://lore.kernel.org/r/CAHC9VhS1wwgH6NNd+cJz4MYogPiRV8NyPDd1yj5SpaxeUB4UVg@mail.gmail.com [2] Link: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next-history.git/commit/?id=c8dc5ee69d3a [3] Link: https://github.com/google/syzkaller/pull/4266 [4] Link: https://storage.googleapis.com/syzbot-assets/82e8608dec36/ci-upstream-linux-next-kasan-gce-root-ab577164.html#security%2flandlock%2fnet.c [5] * tag 'landlock-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Add tests for FS topology changes with network rules landlock: Document network support samples/landlock: Support TCP restrictions selftests/landlock: Add network tests selftests/landlock: Share enforce_ruleset() helper landlock: Support network rules with TCP bind and connect landlock: Refactor landlock_add_rule() syscall landlock: Refactor layer helpers landlock: Move and rename layer helpers landlock: Refactor merge/inherit_ruleset helpers landlock: Refactor landlock_find_rule/insert_rule helpers landlock: Allow FS topology changes for domains without such rule type landlock: Make ruleset's access masks more generic
2023-11-03Merge tag 'perf-tools-for-v6.7-1-2023-11-01' of ↵Linus Torvalds249-1810/+31676
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Namhyung Kim: "Build: - Compile BPF programs by default if clang (>= 12.0.1) is available to enable more features like kernel lock contention, off-cpu profiling, kwork, sample filtering and so on. This can be disabled by passing BUILD_BPF_SKEL=0 to make. - Produce better error messages for bison on debug build (make DEBUG=1) by defining YYDEBUG symbol internally. perf record: - Track sideband events (like FORK/MMAP) from all CPUs even if perf record targets a subset of CPUs only (using -C option). Otherwise it may lose some information happened on a CPU out of the target list. - Fix checking raw sched_switch tracepoint argument using system BTF. This affects off-cpu profiling which attaches a BPF program to the raw tracepoint. perf lock contention: - Add --lock-cgroup option to see contention by cgroups. This should be used with BPF only (using -b option). $ sudo perf lock con -ab --lock-cgroup -- sleep 1 contended total wait max wait avg wait cgroup 835 14.06 ms 41.19 us 16.83 us /system.slice/led.service 25 122.38 us 13.77 us 4.89 us / 44 23.73 us 3.87 us 539 ns /user.slice/user-657345.slice/session-c4.scope 1 491 ns 491 ns 491 ns /system.slice/connectd.service - Add -G/--cgroup-filter option to see contention only for given cgroups. This can be useful when you identified a cgroup in the above command and want to investigate more on it. It also works with other output options like -t/--threads and -l/--lock-addr. $ sudo perf lock con -ab -G /user.slice/user-657345.slice/session-c4.scope -- sleep 1 contended total wait max wait avg wait type caller 8 77.11 us 17.98 us 9.64 us spinlock futex_wake+0xc8 2 24.56 us 14.66 us 12.28 us spinlock tick_do_update_jiffies64+0x25 1 4.97 us 4.97 us 4.97 us spinlock futex_q_lock+0x2a - Use per-cpu array for better spinlock tracking. This is to improve performance of the BPF program and to avoid nested contention on a lock in the BPF hash map. - Update callstack check for PowerPC. To find a representative caller of a lock, it needs to look up the call stacks. It ends the lookup when it sees 0 in the call stack buffer. However, PowerPC call stacks can have 0 values in the beginning so skip them when it expects valid call stacks after. perf kwork: - Support 'sched' class (for -k option) so that it can see task scheduling event (using sched_switch tracepoint) as well as irq and workqueue items. - Add perf kwork top subcommand to show more accurate cpu utilization with sched class above. It works both with a recorded data (using perf kwork record command) and BPF (using -b option). Unlike perf top command, it does not support interactive mode (yet). $ sudo perf kwork top -b -k sched Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 160702.425 ms, 8 cpus %Cpu(s): 36.00% id, 0.00% hi, 0.00% si %Cpu0 [|||||||||||||||||| 61.66%] %Cpu1 [|||||||||||||||||| 61.27%] %Cpu2 [||||||||||||||||||| 66.40%] %Cpu3 [|||||||||||||||||| 61.28%] %Cpu4 [|||||||||||||||||| 61.82%] %Cpu5 [||||||||||||||||||||||| 77.41%] %Cpu6 [|||||||||||||||||| 61.73%] %Cpu7 [|||||||||||||||||| 63.25%] PID SPID %CPU RUNTIME COMMMAND ------------------------------------------------------------- 0 0 38.72 8089.463 ms [swapper/1] 0 0 38.71 8084.547 ms [swapper/3] 0 0 38.33 8007.532 ms [swapper/0] 0 0 38.26 7992.985 ms [swapper/6] 0 0 38.17 7971.865 ms [swapper/4] 0 0 36.74 7447.765 ms [swapper/7] 0 0 33.59 6486.942 ms [swapper/2] 0 0 22.58 3771.268 ms [swapper/5] 9545 9351 2.48 447.136 ms sched-messaging 9574 9351 2.09 418.583 ms sched-messaging 9724 9351 2.05 372.407 ms sched-messaging 9531 9351 2.01 368.804 ms sched-messaging 9512 9351 2.00 362.250 ms sched-messaging 9514 9351 1.95 357.767 ms sched-messaging 9538 9351 1.86 384.476 ms sched-messaging 9712 9351 1.84 386.490 ms sched-messaging 9723 9351 1.83 380.021 ms sched-messaging 9722 9351 1.82 382.738 ms sched-messaging 9517 9351 1.81 354.794 ms sched-messaging 9559 9351 1.79 344.305 ms sched-messaging 9725 9351 1.77 365.315 ms sched-messaging <SNIP> - Add hard/soft-irq statistics to perf kwork top. This will show the total CPU utilization with IRQ stats like below: $ sudo perf kwork top -b -k sched,irq,softirq Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 12554.889 ms, 8 cpus %Cpu(s): 96.23% id, 0.10% hi, 0.19% si <---- here %Cpu0 [| 4.60%] %Cpu1 [| 4.59%] %Cpu2 [ 2.73%] %Cpu3 [| 3.81%] <SNIP> perf bench: - Add -G/--cgroups option to perf bench sched pipe. The pipe bench is good to measure context switch overhead. With this option, it puts the reader and writer tasks in separate cgroups to enforce context switch between two different cgroups. Also it needs to set CPU affinity of the tasks in a CPU to accurately measure the impact of cgroup context switches. $ sudo perf stat -e context-switches,cgroup-switches -- \ > taskset -c 0 perf bench sched pipe -l 100000 # Running 'sched/pipe' benchmark: # Executed 100000 pipe operations between two processes Total time: 0.307 [sec] 3.078180 usecs/op 324867 ops/sec Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000': 200,026 context-switches 63 cgroup-switches 0.321637922 seconds time elapsed You can see small number of cgroup-switches because both write and read tasks are in the same cgroup. $ sudo mkdir /sys/fs/cgroup/{AAA,BBB} $ sudo perf stat -e context-switches,cgroup-switches -- \ > taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB # Running 'sched/pipe' benchmark: # Executed 100000 pipe operations between two processes Total time: 0.351 [sec] 3.512990 usecs/op 284657 ops/sec Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB': 200,020 context-switches 200,019 cgroup-switches 0.365034567 seconds time elapsed Now context-switches and cgroup-switches are almost same. And you can see the pipe operation took little more. - Kill child processes when perf bench sched messaging exited abnormally. Otherwise it'd leave the child doing unnecessary work. perf test: - Fix various shellcheck issues on the tests written in shell script. - Skip tests when condition is not satisfied: - object code reading test for non-text section addresses. - CoreSight test if cs_etm// event is not available. - lock contention test if not enough CPUs. Event parsing: - Make PMU alias name loading lazy to reduce the startup time in the event parsing code for perf record, stat and others in the general case. - Lazily compute PMU default config. In the same sense, delay PMU initialization until it's really needed to reduce the startup cost. - Fix event term values that are raw events. The event specification can have several terms including event name. But sometimes it clashes with raw event encoding which starts with 'r' and has hex-digits. For example, an event named 'read' should be processed as a normal event but it was mis-treated as a raw encoding and caused a failure. $ perf stat -e 'uncore_imc_free_running/event=read/' -a sleep 1 event syntax error: '..nning/event=read/' \___ parser error Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events Event metrics: - Add "Compat" regex to match event with multiple identifiers. - Usual updates for Intel, Power10, Arm telemetry/CMN and AmpereOne. Misc: - Assorted memory leak fixes and footprint reduction. - Add "bpf_skeletons" to perf version --build-options so that users can check whether their perf tools have BPF support easily. - Fix unaligned access in Intel-PT packet decoder found by undefined-behavior sanitizer. - Avoid frequency mode for the dummy event. Surprisingly it'd impact kernel timer tick handler performance by force iterating all PMU events. - Update bash shell completion for events and metrics" * tag 'perf-tools-for-v6.7-1-2023-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (187 commits) perf vendor events intel: Update tsx_cycles_per_elision metrics perf vendor events intel: Update bonnell version number to v5 perf vendor events intel: Update westmereex events to v4 perf vendor events intel: Update meteorlake events to v1.06 perf vendor events intel: Update knightslanding events to v16 perf vendor events intel: Add typo fix for ivybridge FP perf vendor events intel: Update a spelling in haswell/haswellx perf vendor events intel: Update emeraldrapids to v1.01 perf vendor events intel: Update alderlake/alderlake events to v1.23 perf build: Disable BPF skeletons if clang version is < 12.0.1 perf callchain: Fix spelling mistake "statisitcs" -> "statistics" perf report: Fix spelling mistake "heirachy" -> "hierarchy" perf python: Fix binding linkage due to rename and move of evsel__increase_rlimit() perf tests: test_arm_coresight: Simplify source iteration perf vendor events intel: Add tigerlake two metrics perf vendor events intel: Add broadwellde two metrics perf vendor events intel: Fix broadwellde tma_info_system_dram_bw_use metric perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit perf callchain: Minor layout changes to callchain_list perf callchain: Make brtype_stat in callchain_list optional ...
2023-11-03Merge tag 'trace-v6.7' of ↵Linus Torvalds4-4/+113
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: - Remove eventfs_file descriptor This is the biggest change, and the second part of making eventfs create its files dynamically. In 6.6 the first part was added, and that maintained a one to one mapping between eventfs meta descriptors and the directories and file inodes and dentries that were dynamically created. The directories were represented by a eventfs_inode and the files were represented by a eventfs_file. In v6.7 the eventfs_file is removed. As all events have the same directory make up (sched_switch has an "enable", "id", "format", etc files), the handing of what files are underneath each leaf eventfs directory is moved back to the tracing subsystem via a callback. When an event is added to the eventfs, it registers an array of evenfs_entry's. These hold the names of the files and the callbacks to call when the file is referenced. The callback gets the name so that the same callback may be used by multiple files. The callback then supplies the filesystem_operations structure needed to create this file. This has brought the memory footprint of creating multiple eventfs instances down by 2 megs each! - User events now has persistent events that are not associated to a single processes. These are privileged events that hang around even if no process is attached to them - Clean up of seq_buf There's talk about using seq_buf more to replace strscpy() and friends. But this also requires some minor modifications of seq_buf to be able to do this - Expand instance ring buffers individually Currently if boot up creates an instance, and a trace event is enabled on that instance, the ring buffer for that instance and the top level ring buffer are expanded (1.4 MB per CPU). This wastes memory as this happens when nothing is using the top level instance - Other minor clean ups and fixes * tag 'trace-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (34 commits) seq_buf: Export seq_buf_puts() seq_buf: Export seq_buf_putc() eventfs: Use simple_recursive_removal() to clean up dentries eventfs: Remove special processing of dput() of events directory eventfs: Delete eventfs_inode when the last dentry is freed eventfs: Hold eventfs_mutex when calling callback functions eventfs: Save ownership and mode eventfs: Test for ei->is_freed when accessing ei->dentry eventfs: Have a free_ei() that just frees the eventfs_inode eventfs: Remove "is_freed" union with rcu head eventfs: Fix kerneldoc of eventfs_remove_rec() tracing: Have the user copy of synthetic event address use correct context eventfs: Remove extra dget() in eventfs_create_events_dir() tracing: Have trace_event_file have ref counters seq_buf: Introduce DECLARE_SEQ_BUF and seq_buf_str() eventfs: Fix typo in eventfs_inode union comment eventfs: Fix WARN_ON() in create_file_dentry() powerpc: Remove initialisation of readpos tracing/histograms: Simplify last_cmd_set() seq_buf: fix a misleading comment ...
2023-11-03Merge tag 'trace-tools-v6.7' of ↵Linus Torvalds2-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing tools updates from Steven Rostedt: "RTLA: - In rtla/utils.c, initialize the 'found' variable to avoid garbage when a mount point is not found. Verification: - Remove duplicated imports on dot2k python script" * tag 'trace-tools-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rtla: Fix uninitialized variable found verification/dot2k: Delete duplicate imports
2023-11-03Merge tag 'linux-cpupower-6.7-rc1' of ↵Rafael J. Wysocki1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux Merge cpupower utility update for 6.7-rc1 from Shuah Khan: "This cpupower update for Linux 6.7-rc1 consists of a single fix to documentation to fix reference to a removed document." * tag 'linux-cpupower-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux: cpupower: fix reference to nonexistent document
2023-11-03selftests: pmtu.sh: fix result checkingHangbin Liu1-1/+1
In the PMTU test, when all previous tests are skipped and the new test passes, the exit code is set to 0. However, the current check mistakenly treats this as an assignment, causing the check to pass every time. Consequently, regardless of how many tests have failed, if the latest test passes, the PMTU test will report a pass. Fixes: 2a9d3716b810 ("selftests: pmtu.sh: improve the test result processing") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Po-Hsu Lin <po-hsu.lin@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-03Merge tag 'mm-nonmm-stable-2023-11-02-14-08' of ↵Linus Torvalds4-33/+166
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "As usual, lots of singleton and doubleton patches all over the tree and there's little I can say which isn't in the individual changelogs. The lengthier patch series are - 'kdump: use generic functions to simplify crashkernel reservation in arch', from Baoquan He. This is mainly cleanups and consolidation of the 'crashkernel=' kernel parameter handling - After much discussion, David Laight's 'minmax: Relax type checks in min() and max()' is here. Hopefully reduces some typecasting and the use of min_t() and max_t() - A group of patches from Oleg Nesterov which clean up and slightly fix our handling of reads from /proc/PID/task/... and which remove task_struct.thread_group" * tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (64 commits) scripts/gdb/vmalloc: disable on no-MMU scripts/gdb: fix usage of MOD_TEXT not defined when CONFIG_MODULES=n .mailmap: add address mapping for Tomeu Vizoso mailmap: update email address for Claudiu Beznea tools/testing/selftests/mm/run_vmtests.sh: lower the ptrace permissions .mailmap: map Benjamin Poirier's address scripts/gdb: add lx_current support for riscv ocfs2: fix a spelling typo in comment proc: test ProtectionKey in proc-empty-vm test proc: fix proc-empty-vm test with vsyscall fs/proc/base.c: remove unneeded semicolon do_io_accounting: use sig->stats_lock do_io_accounting: use __for_each_thread() ocfs2: replace BUG_ON() at ocfs2_num_free_extents() with ocfs2_error() ocfs2: fix a typo in a comment scripts/show_delta: add __main__ judgement before main code treewide: mark stuff as __ro_after_init fs: ocfs2: check status values proc: test /proc/${pid}/statm compiler.h: move __is_constexpr() to compiler.h ...
2023-11-03Merge tag 'mm-stable-2023-11-01-14-33' of ↵Linus Torvalds23-198/+2678
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Kemeng Shi has contributed some compation maintenance work in the series 'Fixes and cleanups to compaction' - Joel Fernandes has a patchset ('Optimize mremap during mutual alignment within PMD') which fixes an obscure issue with mremap()'s pagetable handling during a subsequent exec(), based upon an implementation which Linus suggested - More DAMON/DAMOS maintenance and feature work from SeongJae Park i the following patch series: mm/damon: misc fixups for documents, comments and its tracepoint mm/damon: add a tracepoint for damos apply target regions mm/damon: provide pseudo-moving sum based access rate mm/damon: implement DAMOS apply intervals mm/damon/core-test: Fix memory leaks in core-test mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval - In the series 'Do not try to access unaccepted memory' Adrian Hunter provides some fixups for the recently-added 'unaccepted memory' feature. To increase the feature's checking coverage. 'Plug a few gaps where RAM is exposed without checking if it is unaccepted memory' - In the series 'cleanups for lockless slab shrink' Qi Zheng has done some maintenance work which is preparation for the lockless slab shrinking code - Qi Zheng has redone the earlier (and reverted) attempt to make slab shrinking lockless in the series 'use refcount+RCU method to implement lockless slab shrink' - David Hildenbrand contributes some maintenance work for the rmap code in the series 'Anon rmap cleanups' - Kefeng Wang does more folio conversions and some maintenance work in the migration code. Series 'mm: migrate: more folio conversion and unification' - Matthew Wilcox has fixed an issue in the buffer_head code which was causing long stalls under some heavy memory/IO loads. Some cleanups were added on the way. Series 'Add and use bdev_getblk()' - In the series 'Use nth_page() in place of direct struct page manipulation' Zi Yan has fixed a potential issue with the direct manipulation of hugetlb page frames - In the series 'mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO' has improved our handling of gigantic pages in the hugetlb vmmemmep optimizaton code. This provides significant boot time improvements when significant amounts of gigantic pages are in use - Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code rationalization and folio conversions in the hugetlb code - Yin Fengwei has improved mlock()'s handling of large folios in the series 'support large folio for mlock' - In the series 'Expose swapcache stat for memcg v1' Liu Shixin has added statistics for memcg v1 users which are available (and useful) under memcg v2 - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable) prctl so that userspace may direct the kernel to not automatically propagate the denial to child processes. The series is named 'MDWE without inheritance' - Kefeng Wang has provided the series 'mm: convert numa balancing functions to use a folio' which does what it says - In the series 'mm/ksm: add fork-exec support for prctl' Stefan Roesch makes is possible for a process to propagate KSM treatment across exec() - Huang Ying has enhanced memory tiering's calculation of memory distances. This is used to permit the dax/kmem driver to use 'high bandwidth memory' in addition to Optane Data Center Persistent Memory Modules (DCPMM). The series is named 'memory tiering: calculate abstract distance based on ACPI HMAT' - In the series 'Smart scanning mode for KSM' Stefan Roesch has optimized KSM by teaching it to retain and use some historical information from previous scans - Yosry Ahmed has fixed some inconsistencies in memcg statistics in the series 'mm: memcg: fix tracking of pending stats updates values' - In the series 'Implement IOCTL to get and optionally clear info about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits us to atomically read-then-clear page softdirty state. This is mainly used by CRIU - Hugh Dickins contributed the series 'shmem,tmpfs: general maintenance', a bunch of relatively minor maintenance tweaks to this code - Matthew Wilcox has increased the use of the VMA lock over file-backed page faults in the series 'Handle more faults under the VMA lock'. Some rationalizations of the fault path became possible as a result - In the series 'mm/rmap: convert page_move_anon_rmap() to folio_move_anon_rmap()' David Hildenbrand has implemented some cleanups and folio conversions - In the series 'various improvements to the GUP interface' Lorenzo Stoakes has simplified and improved the GUP interface with an eye to providing groundwork for future improvements - Andrey Konovalov has sent along the series 'kasan: assorted fixes and improvements' which does those things - Some page allocator maintenance work from Kemeng Shi in the series 'Two minor cleanups to break_down_buddy_pages' - In thes series 'New selftest for mm' Breno Leitao has developed another MM self test which tickles a race we had between madvise() and page faults - In the series 'Add folio_end_read' Matthew Wilcox provides cleanups and an optimization to the core pagecache code - Nhat Pham has added memcg accounting for hugetlb memory in the series 'hugetlb memcg accounting' - Cleanups and rationalizations to the pagemap code from Lorenzo Stoakes, in the series 'Abstract vma_merge() and split_vma()' - Audra Mitchell has fixed issues in the procfs page_owner code's new timestamping feature which was causing some misbehaviours. In the series 'Fix page_owner's use of free timestamps' - Lorenzo Stoakes has fixed the handling of new mappings of sealed files in the series 'permit write-sealed memfd read-only shared mappings' - Mike Kravetz has optimized the hugetlb vmemmap optimization in the series 'Batch hugetlb vmemmap modification operations' - Some buffer_head folio conversions and cleanups from Matthew Wilcox in the series 'Finish the create_empty_buffers() transition' - As a page allocator performance optimization Huang Ying has added automatic tuning to the allocator's per-cpu-pages feature, in the series 'mm: PCP high auto-tuning' - Roman Gushchin has contributed the patchset 'mm: improve performance of accounted kernel memory allocations' which improves their performance by ~30% as measured by a micro-benchmark - folio conversions from Kefeng Wang in the series 'mm: convert page cpupid functions to folios' - Some kmemleak fixups in Liu Shixin's series 'Some bugfix about kmemleak' - Qi Zheng has improved our handling of memoryless nodes by keeping them off the allocation fallback list. This is done in the series 'handle memoryless nodes more appropriately' - khugepaged conversions from Vishal Moola in the series 'Some khugepaged folio conversions'" [ bcachefs conflicts with the dynamically allocated shrinkers have been resolved as per Stephen Rothwell in https://lore.kernel.org/all/20230913093553.4290421e@canb.auug.org.au/ with help from Qi Zheng. The clone3 test filtering conflict was half-arsed by yours truly ] * tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits) mm/damon/sysfs: update monitoring target regions for online input commit mm/damon/sysfs: remove requested targets when online-commit inputs selftests: add a sanity check for zswap Documentation: maple_tree: fix word spelling error mm/vmalloc: fix the unchecked dereference warning in vread_iter() zswap: export compression failure stats Documentation: ubsan: drop "the" from article title mempolicy: migration attempt to match interleave nodes mempolicy: mmap_lock is not needed while migrating folios mempolicy: alloc_pages_mpol() for NUMA policy without vma mm: add page_rmappable_folio() wrapper mempolicy: remove confusing MPOL_MF_LAZY dead code mempolicy: mpol_shared_policy_init() without pseudo-vma mempolicy trivia: use pgoff_t in shared mempolicy tree mempolicy trivia: slightly more consistent naming mempolicy trivia: delete those ancient pr_debug()s mempolicy: fix migrate_pages(2) syscall return nr_failed kernfs: drop shared NUMA mempolicy hooks hugetlbfs: drop shared NUMA mempolicy pretence mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets() ...
2023-11-03Merge tag 'v6.7-p1' of ↵Linus Torvalds3-30/+40
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Add virtual-address based lskcipher interface - Optimise ahash/shash performance in light of costly indirect calls - Remove ahash alignmask attribute Algorithms: - Improve AES/XTS performance of 6-way unrolling for ppc - Remove some uses of obsolete algorithms (md4, md5, sha1) - Add FIPS 202 SHA-3 support in pkcs1pad - Add fast path for single-page messages in adiantum - Remove zlib-deflate Drivers: - Add support for S4 in meson RNG driver - Add STM32MP13x support in stm32 - Add hwrng interface support in qcom-rng - Add support for deflate algorithm in hisilicon/zip" * tag 'v6.7-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (283 commits) crypto: adiantum - flush destination page before unmapping crypto: testmgr - move pkcs1pad(rsa,sha3-*) to correct place Documentation/module-signing.txt: bring up to date module: enable automatic module signing with FIPS 202 SHA-3 crypto: asymmetric_keys - allow FIPS 202 SHA-3 signatures crypto: rsa-pkcs1pad - Add FIPS 202 SHA-3 support crypto: FIPS 202 SHA-3 register in hash info for IMA x509: Add OIDs for FIPS 202 SHA-3 hash and signatures crypto: ahash - optimize performance when wrapping shash crypto: ahash - check for shash type instead of not ahash type crypto: hash - move "ahash wrapping shash" functions to ahash.c crypto: talitos - stop using crypto_ahash::init crypto: chelsio - stop using crypto_ahash::init crypto: ahash - improve file comment crypto: ahash - remove struct ahash_request_priv crypto: ahash - remove crypto_ahash_alignmask crypto: gcm - stop using alignmask of ahash crypto: chacha20poly1305 - stop using alignmask of ahash crypto: ccm - stop using alignmask of ahash net: ipv6: stop checking crypto_ahash_alignmask ...
2023-11-03Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds17-757/+1962
Pull kvm updates from Paolo Bonzini: "ARM: - Generalized infrastructure for 'writable' ID registers, effectively allowing userspace to opt-out of certain vCPU features for its guest - Optimization for vSGI injection, opportunistically compressing MPIDR to vCPU mapping into a table - Improvements to KVM's PMU emulation, allowing userspace to select the number of PMCs available to a VM - Guest support for memory operation instructions (FEAT_MOPS) - Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing bugs and getting rid of useless code - Changes to the way the SMCCC filter is constructed, avoiding wasted memory allocations when not in use - Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing the overhead of errata mitigations - Miscellaneous kernel and selftest fixes LoongArch: - New architecture for kvm. The hardware uses the same model as x86, s390 and RISC-V, where guest/host mode is orthogonal to supervisor/user mode. The virtualization extensions are very similar to MIPS, therefore the code also has some similarities but it's been cleaned up to avoid some of the historical bogosities that are found in arch/mips. The kernel emulates MMU, timer and CSR accesses, while interrupt controllers are only emulated in userspace, at least for now. RISC-V: - Support for the Smstateen and Zicond extensions - Support for virtualizing senvcfg - Support for virtualized SBI debug console (DBCN) S390: - Nested page table management can be monitored through tracepoints and statistics x86: - Fix incorrect handling of VMX posted interrupt descriptor in KVM_SET_LAPIC, which could result in a dropped timer IRQ - Avoid WARN on systems with Intel IPI virtualization - Add CONFIG_KVM_MAX_NR_VCPUS, to allow supporting up to 4096 vCPUs without forcing more common use cases to eat the extra memory overhead. - Add virtualization support for AMD SRSO mitigation (IBPB_BRTYPE and SBPB, aka Selective Branch Predictor Barrier). - Fix a bug where restoring a vCPU snapshot that was taken within 1 second of creating the original vCPU would cause KVM to try to synchronize the vCPU's TSC and thus clobber the correct TSC being set by userspace. - Compute guest wall clock using a single TSC read to avoid generating an inaccurate time, e.g. if the vCPU is preempted between multiple TSC reads. - "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which complain about a "Firmware Bug" if the bit isn't set for select F/M/S combos. Likewise "virtualize" (ignore) MSR_AMD64_TW_CFG to appease Windows Server 2022. - Don't apply side effects to Hyper-V's synthetic timer on writes from userspace to fix an issue where the auto-enable behavior can trigger spurious interrupts, i.e. do auto-enabling only for guest writes. - Remove an unnecessary kick of all vCPUs when synchronizing the dirty log without PML enabled. - Advertise "support" for non-serializing FS/GS base MSR writes as appropriate. - Harden the fast page fault path to guard against encountering an invalid root when walking SPTEs. - Omit "struct kvm_vcpu_xen" entirely when CONFIG_KVM_XEN=n. - Use the fast path directly from the timer callback when delivering Xen timer events, instead of waiting for the next iteration of the run loop. This was not done so far because previously proposed code had races, but now care is taken to stop the hrtimer at critical points such as restarting the timer or saving the timer information for userspace. - Follow the lead of upstream Xen and ignore the VCPU_SSHOTTMR_future flag. - Optimize injection of PMU interrupts that are simultaneous with NMIs. - Usual handful of fixes for typos and other warts. x86 - MTRR/PAT fixes and optimizations: - Clean up code that deals with honoring guest MTRRs when the VM has non-coherent DMA and host MTRRs are ignored, i.e. EPT is enabled. - Zap EPT entries when non-coherent DMA assignment stops/start to prevent using stale entries with the wrong memtype. - Don't ignore guest PAT for CR0.CD=1 && KVM_X86_QUIRK_CD_NW_CLEARED=y This was done as a workaround for virtual machine BIOSes that did not bother to clear CR0.CD (because ancient KVM/QEMU did not bother to set it, in turn), and there's zero reason to extend the quirk to also ignore guest PAT. x86 - SEV fixes: - Report KVM_EXIT_SHUTDOWN instead of EINVAL if KVM intercepts SHUTDOWN while running an SEV-ES guest. - Clean up the recognition of emulation failures on SEV guests, when KVM would like to "skip" the instruction but it had already been partially emulated. This makes it possible to drop a hack that second guessed the (insufficient) information provided by the emulator, and just do the right thing. Documentation: - Various updates and fixes, mostly for x86 - MTRR and PAT fixes and optimizations" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (164 commits) KVM: selftests: Avoid using forced target for generating arm64 headers tools headers arm64: Fix references to top srcdir in Makefile KVM: arm64: Add tracepoint for MMIO accesses where ISV==0 KVM: arm64: selftest: Perform ISB before reading PAR_EL1 KVM: arm64: selftest: Add the missing .guest_prepare() KVM: arm64: Always invalidate TLB for stage-2 permission faults KVM: x86: Service NMI requests after PMI requests in VM-Enter path KVM: arm64: Handle AArch32 SPSR_{irq,abt,und,fiq} as RAZ/WI KVM: arm64: Do not let a L1 hypervisor access the *32_EL2 sysregs KVM: arm64: Refine _EL2 system register list that require trap reinjection arm64: Add missing _EL2 encodings arm64: Add missing _EL12 encodings KVM: selftests: aarch64: vPMU test for validating user accesses KVM: selftests: aarch64: vPMU register test for unimplemented counters KVM: selftests: aarch64: vPMU register test for implemented counters KVM: selftests: aarch64: Introduce vpmu_counter_access test tools: Import arm_pmuv3.h KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first run KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} ...
2023-11-03Merge tag 'libnvdimm-for-6.7' of ↵Linus Torvalds2-15/+16
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updates from Ira Weiny: - updates to deprecated and changed interfaces - bug/kdoc fixes * tag 'libnvdimm-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm: remove kernel-doc warnings: testing: nvdimm: make struct class structures constant libnvdimm: Annotate struct nd_region with __counted_by nd_btt: Make BTT lanes preemptible libnvdimm/of_pmem: Use devm_kstrdup instead of kstrdup and check its return value dax: refactor deprecated strncpy
2023-11-03Merge tag 'sound-6.7-rc1' of ↵Linus Torvalds3-47/+73
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound updates from Takashi Iwai: "Most of changes at this time are for ASoC, spread over ASoC core and drivers due to the API prefix standardization. Other than that, there have little change wrt API, rather lots of driver-specific updates and fixes. Some highlight below: ASoC: - Standardization of API prefix - GPIO API usage improvements - Support for HDA patches - Lots of work on SOF, including crash dump support - Fixes for noise when stopping some Sounwire CODECs - Support for AMD platforms with es83xx, AMD ACP 6.3 and 7.0, Awinc AT87390 and AW88399, many Intel platforms, many Mediatek platforms, Qualcomm SM6115 and SC7180 platforms, Richtek RTQ9128 and Texas Instruments TAS575x HD-audio and USB-audio: - Deferred probe support of audio component binding - More fixes and enhancements for Cirrus subcodecs - USB Scarlett2 mixer and McIntosh DSD quirk Others: - More enhancement of snd-aloop driver - Update MAINTAINERS entry for linux-sound mailing list" * tag 'sound-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (485 commits) ALSA: hda: cs35l41: Fix missing error code in cs35l41_smart_amp() ALSA: hda: cs35l41: mark cs35l41_verify_id() static ASoC: codecs: wsa883x: make use of new mute_unmute_on_trigger flag ASoC: soc-dai: add flag to mute and unmute stream during trigger ASoC: ams-delta.c: use component after check ASoC: amd: acp: select SND_SOC_AMD_ACP_LEGACY_COMMON for ACP63 ASoC: codecs: aw88399: fix typo in Kconfig select ASoC: amd: acp: add ACPI dependency ASoC: Intel: avs: Add rt5514 machine board ASoC: Intel: avs: Add rt5514 machine board ALSA: scarlett2: Add missing check with firmware version control ALSA: virtio: use ack callback ALSA: scarlett2: Remap Level Meter values ALSA: scarlett2: Allow passing any output to line_out_remap() ALSA: scarlett2: Add support for reading firmware version ALSA: scarlett2: Rename Gen 3 config sets ALSA: scarlett2: Rename scarlett_gen2 to scarlett2 ASoC: cs35l41: Detect CSPL errors when sending CSPL commands ALSA: hda: cs35l41: Check CSPL state after loading firmware ALSA: hda: cs35l41: Do not unload firmware before reset in system suspend ...
2023-11-03Merge tag 'for-linus-2023110101' of ↵Linus Torvalds3-9/+81
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID updates from Jiri Kosina: - fixes for crashes detected by CONFIG_KUNIT_ALL_TESTS in hid-uclogic driver (Jinjie Ruan) - HID selftests fixes and improvements (Benjamin Tissoires) - probe error handling path fixes in hid-nvidia-shield driver (Christophe JAILLET) - cleanup of LED handling in hid-nintendo (Martino Fontana) - big cleanup of logitech-hidpp probe code (Hans de Goede) - Suspend/Resume fix for USB Thinkpad Compact Keyboard (Jamie Lentin) - firmware detection improvement for Lenovo cptkbd (Mikhail Khvainitski) - IRQ shutdown and workqueue initialization fixes for hid-cp2112 driver (Danny Kaehn) - #ifdef CONFIG_PM removal from HID code (Thomas Weißschuh) - other assorted device-ID additions and quirks * tag 'for-linus-2023110101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (31 commits) HID: Add quirk for Dell Pro Wireless Keyboard and Mouse KM5221W HID: logitech-hidpp: Stop IO before calling hid_connect() HID: logitech-hidpp: Drop HIDPP_QUIRK_UNIFYING HID: logitech-hidpp: Drop delayed_work_cb() HID: logitech-hidpp: Fix connect event race HID: logitech-hidpp: Remove unused connected param from *_connect() HID: logitech-hidpp: Remove connected check for non-unifying devices HID: logitech-hidpp: Add hidpp_non_unifying_init() helper HID: logitech-hidpp: Move hidpp_overwrite_name() to before connect check HID: logitech-hidpp: Move g920_get_config() to just before hidpp_ff_init() HID: logitech-hidpp: Remove wtp_get_config() call from probe() HID: logitech-hidpp: Move get_wireless_feature_index() check to hidpp_connect_event() HID: logitech-hidpp: Revert "Don't restart communication if not necessary" HID: logitech-hidpp: Don't restart IO, instead defer hid_connect() only HID: rmi: remove #ifdef CONFIG_PM HID: multitouch: remove #ifdef CONFIG_PM HID: usbhid: remove #ifdef CONFIG_PM HID: core: remove #ifdef CONFIG_PM from hid_driver hid: lenovo: Resend all settings on reset_resume for compact keyboards HID: uclogic: Fix a work->entry not empty bug in __queue_work() ...
2023-11-02selftests/bpf: Fix broken build where char is unsignedBjörn Töpel1-1/+1
There are architectures where char is not signed. If so, the following error is triggered: | xdp_hw_metadata.c:435:42: error: result of comparison of constant -1 \ | with expression of type 'char' is always true \ | [-Werror,-Wtautological-constant-out-of-range-compare] | 435 | while ((opt = getopt(argc, argv, "mh")) != -1) { | | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~ | 1 error generated. Correct by changing the char to int. Fixes: bb6a88885fde ("selftests/bpf: Add options and frags to xdp_hw_metadata") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Acked-by: Larysa Zaremba <larysa.zaremba@intel.com> Tested-by: Anders Roxell <anders.roxell@linaro.org> Link: https://lore.kernel.org/r/20231102103537.247336-1-bjorn@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-02selftests/bpf: precision tracking test for BPF_NEG and BPF_ENDShung-Hsi Yu2-0/+95
As seen from previous commit that fix backtracking for BPF_ALU | BPF_TO_BE | BPF_END, both BPF_NEG and BPF_END require special handling. Add tests written with inline assembly to check that the verifier does not incorrecly use the src_reg field of BPF_NEG and BPF_END (including bswap added in v4). Suggested-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Link: https://lore.kernel.org/r/20231102053913.12004-4-shung-hsi.yu@suse.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-02selftests/bpf: Add test for using css_task iter in sleepable progsChuyi Zhou2-0/+20
This Patch add a test to prove css_task iter can be used in normal sleepable progs. Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20231031050438.93297-4-zhouchuyi@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-02selftests/bpf: Add tests for css_task iter combining with cgroup iterChuyi Zhou2-0/+77
This patch adds a test which demonstrates how css_task iter can be combined with cgroup iter and it won't cause deadlock, though cgroup iter is not sleepable. Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20231031050438.93297-3-zhouchuyi@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-02bpf: Relax allowlist for css_task iterChuyi Zhou1-2/+2
The newly added open-coded css_task iter would try to hold the global css_set_lock in bpf_iter_css_task_new, so the bpf side has to be careful in where it allows to use this iter. The mainly concern is dead locking on css_set_lock. check_css_task_iter_allowlist() in verifier enforced css_task can only be used in bpf_lsm hooks and sleepable bpf_iter. This patch relax the allowlist for css_task iter. Any lsm and any iter (even non-sleepable) and any sleepable are safe since they would not hold the css_set_lock before entering BPF progs context. This patch also fixes the misused BPF_TRACE_ITER in check_css_task_iter_allowlist which compared bpf_prog_type with bpf_attach_type. Fixes: 9c66dc94b62ae ("bpf: Introduce css_task open-coded iterator kfuncs") Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20231031050438.93297-2-zhouchuyi@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-02selftests/bpf: fix test_maps' use of bpf_map_create_optsAndrii Nakryiko1-15/+5
Use LIBBPF_OPTS() macro to properly initialize bpf_map_create_opts in test_maps' tests. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20231029011509.2479232-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-02bpf: Add __bpf_hook_{start,end} macrosDave Marchevsky1-4/+2
Not all uses of __diag_ignore_all(...) in BPF-related code in order to suppress warnings are wrapping kfunc definitions. Some "hook point" definitions - small functions meant to be used as attach points for fentry and similar BPF progs - need to suppress -Wmissing-declarations. We could use __bpf_kfunc_{start,end}_defs added in the previous patch in such cases, but this might be confusing to someone unfamiliar with BPF internals. Instead, this patch adds __bpf_hook_{start,end} macros, currently having the same effect as __bpf_kfunc_{start,end}_defs, then uses them to suppress warnings for two hook points in the kernel itself and some bpf_testmod hook points as well. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Cc: Yafang Shao <laoar.shao@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20231031215625.2343848-2-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-02selftests/bpf: fix test_bpffsManu Bretelle1-3/+8
Currently this tests tries to umount /sys/kernel/debug (TDIR) but the system it is running on may have mounts below. For example, danobi/vmtest [0] VMs have mount -t tracefs tracefs /sys/kernel/debug/tracing as part of their init. This change instead creates a "random" directory under /tmp and uses this as TDIR. If the directory already exists, ignore the error and keep moving on. Test: Originally: $ vmtest -k $KERNEL_REPO/arch/x86_64/boot/bzImage "./test_progs -vv -a test_bpffs" => bzImage ===> Booting ===> Setting up VM ===> Running command [ 2.138818] bpf_testmod: loading out-of-tree module taints kernel. [ 2.140913] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel bpf_testmod.ko is already unloaded. Loading bpf_testmod.ko... Successfully loaded bpf_testmod.ko. test_test_bpffs:PASS:clone 0 nsec fn:PASS:unshare 0 nsec fn:PASS:mount / 0 nsec fn:FAIL:umount /sys/kernel/debug unexpected error: -1 (errno 16) bpf_testmod.ko is already unloaded. Loading bpf_testmod.ko... Successfully loaded bpf_testmod.ko. test_test_bpffs:PASS:clone 0 nsec test_test_bpffs:PASS:waitpid 0 nsec test_test_bpffs:FAIL:bpffs test failed 255#282 test_bpffs:FAIL Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED Successfully unloaded bpf_testmod.ko. Command failed with exit code: 1 After this change: $ vmtest -k $(make image_name) 'cd tools/testing/selftests/bpf && ./test_progs -vv -a test_bpffs' => bzImage ===> Booting ===> Setting up VM ===> Running command [ 2.295696] bpf_testmod: loading out-of-tree module taints kernel. [ 2.296468] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel bpf_testmod.ko is already unloaded. Loading bpf_testmod.ko... Successfully loaded bpf_testmod.ko. test_test_bpffs:PASS:clone 0 nsec fn:PASS:unshare 0 nsec fn:PASS:mount / 0 nsec fn:PASS:mount tmpfs 0 nsec fn:PASS:mkdir /tmp/test_bpffs_testdir/fs1 0 nsec fn:PASS:mkdir /tmp/test_bpffs_testdir/fs2 0 nsec fn:PASS:mount bpffs /tmp/test_bpffs_testdir/fs1 0 nsec fn:PASS:mount bpffs /tmp/test_bpffs_testdir/fs2 0 nsec fn:PASS:reading /tmp/test_bpffs_testdir/fs1/maps.debug 0 nsec fn:PASS:reading /tmp/test_bpffs_testdir/fs2/progs.debug 0 nsec fn:PASS:creating /tmp/test_bpffs_testdir/fs1/a 0 nsec fn:PASS:creating /tmp/test_bpffs_testdir/fs1/a/1 0 nsec fn:PASS:creating /tmp/test_bpffs_testdir/fs1/b 0 nsec fn:PASS:create_map(ARRAY) 0 nsec fn:PASS:pin map 0 nsec fn:PASS:stat(/tmp/test_bpffs_testdir/fs1/a) 0 nsec fn:PASS:renameat2(/fs1/a, /fs1/b, RENAME_EXCHANGE) 0 nsec fn:PASS:stat(/tmp/test_bpffs_testdir/fs1/b) 0 nsec fn:PASS:b should have a's inode 0 nsec fn:PASS:access(/tmp/test_bpffs_testdir/fs1/b/1) 0 nsec fn:PASS:stat(/tmp/test_bpffs_testdir/fs1/map) 0 nsec fn:PASS:renameat2(/fs1/c, /fs1/b, RENAME_EXCHANGE) 0 nsec fn:PASS:stat(/tmp/test_bpffs_testdir/fs1/b) 0 nsec fn:PASS:b should have c's inode 0 nsec fn:PASS:access(/tmp/test_bpffs_testdir/fs1/c/1) 0 nsec fn:PASS:renameat2(RENAME_NOREPLACE) 0 nsec fn:PASS:access(/tmp/test_bpffs_testdir/fs1/b) 0 nsec bpf_testmod.ko is already unloaded. Loading bpf_testmod.ko... Successfully loaded bpf_testmod.ko. test_test_bpffs:PASS:clone 0 nsec test_test_bpffs:PASS:waitpid 0 nsec test_test_bpffs:PASS:bpffs test 0 nsec #282 test_bpffs:OK Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED Successfully unloaded bpf_testmod.ko. [0] https://github.com/danobi/vmtest This is a follow-up of https://lore.kernel.org/bpf/20231024201852.1512720-1-chantr4@gmail.com/T/ v1 -> v2: - use a TDIR name that is related to test - use C-style comments Signed-off-by: Manu Bretelle <chantr4@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20231031223606.2927976-1-chantr4@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-02selftests/bpf: Add test for immediate spilled to stackHao Sun1-0/+32
Add a test to check if the verifier correctly reason about the sign of an immediate spilled to stack by BPF_ST instruction. Signed-off-by: Hao Sun <sunhao.th@gmail.com> Link: https://lore.kernel.org/r/20231101-fix-check-stack-write-v3-2-f05c2b1473d5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-02tools: ynl-gen: don't touch the output file if content is the sameJakub Kicinski1-1/+6
I often regenerate all YNL files in the tree to make sure they are in sync with the codegen and specs. Generator rewrites the files unconditionally, so since make looks at file modification time to decide what to rebuild - my next build takes longer. We already generate the code to a tempfile most of the time, only overwrite the target when we have to. Before: $ stat include/uapi/linux/netdev.h File: include/uapi/linux/netdev.h Size: 2307 Blocks: 8 IO Block: 4096 regular file Access: 2023-10-27 15:19:56.347071940 -0700 Modify: 2023-10-27 15:19:45.089000900 -0700 Change: 2023-10-27 15:19:45.089000900 -0700 Birth: 2023-10-27 15:19:45.088000894 -0700 $ ./tools/net/ynl/ynl-regen.sh -f [...] $ stat include/uapi/linux/netdev.h File: include/uapi/linux/netdev.h Size: 2307 Blocks: 8 IO Block: 4096 regular file Access: 2023-10-27 15:19:56.347071940 -0700 Modify: 2023-10-27 15:22:18.417968446 -0700 Change: 2023-10-27 15:22:18.417968446 -0700 Birth: 2023-10-27 15:19:45.088000894 -0700 After: $ stat include/uapi/linux/netdev.h File: include/uapi/linux/netdev.h Size: 2307 Blocks: 8 IO Block: 4096 regular file Access: 2023-10-27 15:22:41.520114221 -0700 Modify: 2023-10-27 15:22:18.417968446 -0700 Change: 2023-10-27 15:22:18.417968446 -0700 Birth: 2023-10-27 15:19:45.088000894 -0700 $ ./tools/net/ynl/ynl-regen.sh -f [...] $ stat include/uapi/linux/netdev.h File: include/uapi/linux/netdev.h Size: 2307 Blocks: 8 IO Block: 4096 regular file Access: 2023-10-27 15:22:41.520114221 -0700 Modify: 2023-10-27 15:22:18.417968446 -0700 Change: 2023-10-27 15:22:18.417968446 -0700 Birth: 2023-10-27 15:19:45.088000894 -0700 Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231027223408.1865704-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-02netlink: specs: devlink: add forgotten port function caps enum valuesJiri Pirko1-0/+2
Add two enum values that the blamed commit omitted. Fixes: f2f9dd164db0 ("netlink: specs: devlink: add the remaining command to generate complete split_ops") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231030161750.110420-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-02Merge tag 'linux_kselftest-next-6.7-rc1' of ↵Linus Torvalds57-554/+692
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest updates from Shuah Khan: - kbuild kselftest-merge target fixes - fixes to several tests - resctrl test fixes and enhancements - ksft_perror() helper and reporting improvements - printf attribute to kselftest prints to improve reporting - documentation and clang build warning fixes The bulk of the patches are for resctrl fixes and enhancements. * tag 'linux_kselftest-next-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (51 commits) selftests/resctrl: Fix MBM test failure when MBA unavailable selftests/clone3: Report descriptive test names selftests:modify the incorrect print format selftests/efivarfs: create-read: fix a resource leak selftests/ftrace: Add riscv support for kprobe arg tests selftests/ftrace: add loongarch support for kprobe args char tests selftests/amd-pstate: Added option to provide perf binary path selftests/amd-pstate: Fix broken paths to run workloads in amd-pstate-ut selftests/resctrl: Move run_benchmark() to a more fitting file selftests/resctrl: Fix schemata write error check selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests selftests/resctrl: Fix feature checks selftests/resctrl: Refactor feature check to use resource and feature name selftests/resctrl: Move _GNU_SOURCE define into Makefile selftests/resctrl: Remove duplicate feature check from CMT test selftests/resctrl: Extend signal handler coverage to unmount on receiving signal selftests/resctrl: Fix uninitialized .sa_flags selftests/resctrl: Cleanup benchmark argument parsing selftests/resctrl: Remove ben_count variable selftests/resctrl: Make benchmark command const and build it with pointers ...
2023-11-02Merge tag 'for-linus-iommufd' of ↵Linus Torvalds3-32/+587
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd updates from Jason Gunthorpe: "This brings three new iommufd capabilities: - Dirty tracking for DMA. AMD/ARM/Intel CPUs can now record if a DMA writes to a page in the IOPTEs within the IO page table. This can be used to generate a record of what memory is being dirtied by DMA activities during a VM migration process. A VMM like qemu will combine the IOMMU dirty bits with the CPU's dirty log to determine what memory to transfer. VFIO already has a DMA dirty tracking framework that requires PCI devices to implement tracking HW internally. The iommufd version provides an alternative that the VMM can select, if available. The two are designed to have very similar APIs. - Userspace controlled attributes for hardware page tables (HWPT/iommu_domain). There are currently a few generic attributes for HWPTs (support dirty tracking, and parent of a nest). This is an entry point for the userspace iommu driver to control the HW in detail. - Nested translation support for HWPTs. This is a 2D translation scheme similar to the CPU where a DMA goes through a first stage to determine an intermediate address which is then translated trough a second stage to a physical address. Like for CPU translation the first stage table would exist in VM controlled memory and the second stage is in the kernel and matches the VM's guest to physical map. As every IOMMU has a unique set of parameter to describe the S1 IO page table and its associated parameters the userspace IOMMU driver has to marshal the information into the correct format. This is 1/3 of the feature, it allows creating the nested translation and binding it to VFIO devices, however the API to support IOTLB and ATC invalidation of the stage 1 io page table, and forwarding of IO faults are still in progress. The series includes AMD and Intel support for dirty tracking. Intel support for nested translation. Along the way are a number of internal items: - New iommu core items: ops->domain_alloc_user(), ops->set_dirty_tracking, ops->read_and_clear_dirty(), IOMMU_DOMAIN_NESTED, and iommu_copy_struct_from_user - UAF fix in iopt_area_split() - Spelling fixes and some test suite improvement" * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (52 commits) iommufd: Organize the mock domain alloc functions closer to Joerg's tree iommufd/selftest: Fix page-size check in iommufd_test_dirty() iommufd: Add iopt_area_alloc() iommufd: Fix missing update of domains_itree after splitting iopt_area iommu/vt-d: Disallow read-only mappings to nest parent domain iommu/vt-d: Add nested domain allocation iommu/vt-d: Set the nested domain to a device iommu/vt-d: Make domain attach helpers to be extern iommu/vt-d: Add helper to setup pasid nested translation iommu/vt-d: Add helper for nested domain allocation iommu/vt-d: Extend dmar_domain to support nested domain iommufd: Add data structure for Intel VT-d stage-1 domain allocation iommu/vt-d: Enhance capability check for nested parent domain allocation iommufd/selftest: Add coverage for IOMMU_HWPT_ALLOC with nested HWPTs iommufd/selftest: Add nested domain allocation for mock domain iommu: Add iommu_copy_struct_from_user helper iommufd: Add a nested HW pagetable object iommu: Pass in parent domain with user_data to domain_alloc_user op iommufd: Share iommufd_hwpt_alloc with IOMMUFD_OBJ_HWPT_NESTED iommufd: Derive iommufd_hwpt_paging from iommufd_hw_pagetable ...
2023-11-02Merge tag 'asm-generic-6.7' of ↵Linus Torvalds8-80/+5
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull ia64 removal and asm-generic updates from Arnd Bergmann: - The ia64 architecture gets its well-earned retirement as planned, now that there is one last (mostly) working release that will be maintained as an LTS kernel. - The architecture specific system call tables are updated for the added map_shadow_stack() syscall and to remove references to the long-gone sys_lookup_dcookie() syscall. * tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: hexagon: Remove unusable symbols from the ptrace.h uapi asm-generic: Fix spelling of architecture arch: Reserve map_shadow_stack() syscall number for all architectures syscalls: Cleanup references to sys_lookup_dcookie() Documentation: Drop or replace remaining mentions of IA64 lib/raid6: Drop IA64 support Documentation: Drop IA64 from feature descriptions kernel: Drop IA64 support from sig_fault handlers arch: Remove Itanium (IA-64) architecture
2023-11-02Merge tag 'for-6.7/io_uring-sockopt-2023-10-30' of git://git.kernel.dk/linuxLinus Torvalds5-272/+1149
Pull io_uring {get,set}sockopt support from Jens Axboe: "This adds support for using getsockopt and setsockopt via io_uring. The main use cases for this is to enable use of direct descriptors, rather than first instantiating a normal file descriptor, doing the option tweaking needed, then turning it into a direct descriptor. With this support, we can avoid needing a regular file descriptor completely. The net and bpf bits have been signed off on their side" * tag 'for-6.7/io_uring-sockopt-2023-10-30' of git://git.kernel.dk/linux: selftests/bpf/sockopt: Add io_uring support io_uring/cmd: Introduce SOCKET_URING_OP_SETSOCKOPT io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT io_uring/cmd: return -EOPNOTSUPP if net is disabled selftests/net: Extract uring helpers to be reusable tools headers: Grab copy of io_uring.h io_uring/cmd: Pass compat mode in issue_flags net/socket: Break down __sys_getsockopt net/socket: Break down __sys_setsockopt bpf: Add sockptr support for setsockopt bpf: Add sockptr support for getsockopt
2023-11-01Merge tag 'x86_tdx_for_6.7' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 TDX updates from Dave Hansen: "The majority of this is a rework of the assembly and C wrappers that are used to talk to the TDX module and VMM. This is a nice cleanup in general but is also clearing the way for using this code when Linux is the TDX VMM. There are also some tidbits to make TDX guests play nicer with Hyper-V and to take advantage the hardware TSC. Summary: - Refactor and clean up TDX hypercall/module call infrastructure - Handle retrying/resuming page conversion hypercalls - Make sure to use the (shockingly) reliable TSC in TDX guests" [ TLA reminder: TDX is "Trust Domain Extensions", Intel's guest VM confidentiality technology ] * tag 'x86_tdx_for_6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tdx: Mark TSC reliable x86/tdx: Fix __noreturn build warning around __tdx_hypercall_failed() x86/virt/tdx: Make TDX_MODULE_CALL handle SEAMCALL #UD and #GP x86/virt/tdx: Wire up basic SEAMCALL functions x86/tdx: Remove 'struct tdx_hypercall_args' x86/tdx: Reimplement __tdx_hypercall() using TDX_MODULE_CALL asm x86/tdx: Make TDX_HYPERCALL asm similar to TDX_MODULE_CALL x86/tdx: Extend TDX_MODULE_CALL to support more TDCALL/SEAMCALL leafs x86/tdx: Pass TDCALL/SEAMCALL input/output registers via a structure x86/tdx: Rename __tdx_module_call() to __tdcall() x86/tdx: Make macros of TDCALLs consistent with the spec x86/tdx: Skip saving output regs when SEAMCALL fails with VMFailInvalid x86/tdx: Zero out the missing RSI in TDX_HYPERCALL macro x86/tdx: Retry partially-completed page conversion hypercalls
2023-11-01tools/testing/selftests/mm/run_vmtests.sh: lower the ptrace permissionsItaru Kitayama1-0/+1
On Ubuntu and probably other distros, ptrace permissions are tightend a bit by default; i.e., /proc/sys/kernel/yama/ptrace_score is set to 1. This cases memfd_secret's ptrace attach test fails with a permission error. Set it to 0 piror to running the program. Link: https://lkml.kernel.org/r/20231030-selftest-v1-1-743df68bb996@linux.dev Signed-off-by: Itaru Kitayama <itaru.kitayama@linux.dev> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-11-01proc: test ProtectionKey in proc-empty-vm testSwarup Laxman Kotiaklapudi1-18/+61
Check ProtectionKey field in /proc/*/smaps output, if system supports protection keys feature. [adobriyan@gmail.com: test support in the beginning of the program, use syscall, not glibc pkey_alloc(3) which may not compile] Link: https://lkml.kernel.org/r/ac05efa7-d2a0-48ad-b704-ffdd5450582e@p183 Signed-off-by: Swarup Laxman Kotiaklapudi <swarupkotikalapudi@gmail.com> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Swarup Laxman Kotikalapudi<swarupkotikalapudi@gmail.com> Tested-by: Swarup Laxman Kotikalapudi<swarupkotikalapudi@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-11-01proc: fix proc-empty-vm test with vsyscallAlexey Dobriyan1-4/+6
* fix embarassing /proc/*/smaps test bug due to a typo in variable name it tested only the first line of the output if vsyscall is enabled: ffffffffff600000-ffffffffff601000 r-xp ... so test passed but tested only VMA location and permissions. * add "KSM" entry, unnoticed because (1) * swap "r-xp" and "--xp" vsyscall test strings, also unnoticed because (1) Link: https://lkml.kernel.org/r/76f42cce-b1ab-45ec-b6b2-4c64f0dccb90@p183 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Tested-by: Swarup Laxman Kotikalapudi<swarupkotikalapudi@mail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-11-01selftests: add a sanity check for zswapNhat Pham1-0/+48
We recently encountered a bug that makes all zswap store attempt fail. Specifically, after: "141fdeececb3 mm/zswap: delay the initialization of zswap" if we build a kernel with zswap disabled by default, then enabled after the swapfile is set up, the zswap tree will not be initialized. As a result, all zswap store calls will be short-circuited. We have to perform another swapon to get zswap working properly again. Fortunately, this issue has since been fixed by the patch that kills frontswap: "42c06a0e8ebe mm: kill frontswap" which performs zswap_swapon() unconditionally, i.e always initializing the zswap tree. This test add a sanity check that ensure zswap storing works as intended. Link: https://lkml.kernel.org/r/20231020222009.2358953-1-nphamcs@gmail.com Signed-off-by: Nhat Pham <nphamcs@gmail.com> Acked-by: Rik van Riel <riel@surriel.com> Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Zefan Li <lizefan.x@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-11-01Merge tag 'arm64-upstream' of ↵Linus Torvalds2-0/+73
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "No major architecture features this time around, just some new HWCAP definitions, support for the Ampere SoC PMUs and a few fixes/cleanups. The bulk of the changes is reworking of the CPU capability checking code (cpus_have_cap() etc). - Major refactoring of the CPU capability detection logic resulting in the removal of the cpus_have_const_cap() function and migrating the code to "alternative" branches where possible - Backtrace/kgdb: use IPIs and pseudo-NMI - Perf and PMU: - Add support for Ampere SoC PMUs - Multi-DTC improvements for larger CMN configurations with multiple Debug & Trace Controllers - Rework the Arm CoreSight PMU driver to allow separate registration of vendor backend modules - Fixes: add missing MODULE_DEVICE_TABLE to the amlogic perf driver; use device_get_match_data() in the xgene driver; fix NULL pointer dereference in the hisi driver caused by calling cpuhp_state_remove_instance(); use-after-free in the hisi driver - HWCAP updates: - FEAT_SVE_B16B16 (BFloat16) - FEAT_LRCPC3 (release consistency model) - FEAT_LSE128 (128-bit atomic instructions) - SVE: remove a couple of pseudo registers from the cpufeature code. There is logic in place already to detect mismatched SVE features - Miscellaneous: - Reduce the default swiotlb size (currently 64MB) if no ZONE_DMA bouncing is needed. The buffer is still required for small kmalloc() buffers - Fix module PLT counting with !RANDOMIZE_BASE - Restrict CPU_BIG_ENDIAN to LLVM IAS 15.x or newer move synchronisation code out of the set_ptes() loop - More compact cpufeature displaying enabled cores - Kselftest updates for the new CPU features" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (83 commits) arm64: Restrict CPU_BIG_ENDIAN to GNU as or LLVM IAS 15.x or newer arm64: module: Fix PLT counting when CONFIG_RANDOMIZE_BASE=n arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a helper perf: hisi: Fix use-after-free when register pmu fails drivers/perf: hisi_pcie: Initialize event->cpu only on success drivers/perf: hisi_pcie: Check the type first in pmu::event_init() arm64: cpufeature: Change DBM to display enabled cores arm64: cpufeature: Display the set of cores with a feature perf/arm-cmn: Enable per-DTC counter allocation perf/arm-cmn: Rework DTC counters (again) perf/arm-cmn: Fix DTC domain detection drivers: perf: arm_pmuv3: Drop some unused arguments from armv8_pmu_init() drivers: perf: arm_pmuv3: Read PMMIR_EL1 unconditionally drivers/perf: hisi: use cpuhp_state_remove_instance_nocalls() for hisi_hns3_pmu uninit process clocksource/drivers/arm_arch_timer: limit XGene-1 workaround arm64: Remove system_uses_lse_atomics() arm64: Mark the 'addr' argument to set_ptes() and __set_pte_at() as unused drivers/perf: xgene: Use device_get_match_data() perf/amlogic: add missing MODULE_DEVICE_TABLE arm64/mm: Hoist synchronization out of set_ptes() loop ...
2023-11-01Merge tag 'devicetree-for-6.7' of ↵Linus Torvalds6-0/+177
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree updates from Rob Herring: - Add a kselftest to check for unprobed DT devices - Fix address translation for some 3 address cells cases - Refactor firmware node refcounting for AMBA bus - Add bindings for qcom,sm4450-pdc, Qualcomm Kryo 465 CPU, and Freescale QMC HDLC - Add Marantec vendor prefix - Convert qcom,pm8921-keypad, cnxt,cx92755-wdt, da9062-wdt, and atmel,at91rm9200-wdt bindings to DT schema - Several additionalProperties/unevaluatedProperties on child node schemas fixes - Drop reserved-memory bindings which now live in dtschema project - Fix a reference to rockchip,inno-usb2phy.yaml - Remove backlight nodes from display panel examples - Expand example for using DT_SCHEMA_FILES - Merge simple LVDS panel bindings to one binding doc * tag 'devicetree-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (34 commits) dt-bindings: soc: fsl: cpm_qe: cpm1-scc-qmc: Add support for QMC HDLC dt-bindings: soc: fsl: cpm_qe: cpm1-scc-qmc: Add 'additionalProperties: false' in child nodes dt-bindings: soc: fsl: cpm_qe: cpm1-scc-qmc: Fix example property name dt-bindings: arm,coresight-cti: Add missing additionalProperties on child nodes dt-bindings: arm,coresight-cti: Drop type for 'cpu' property dt-bindings: soundwire: Add reference to soundwire-controller.yaml schema dt-bindings: input: syna,rmi4: Make "additionalProperties: true" explicit media: dt-bindings: ti,ds90ub960: Add missing type for "i2c-alias" dt-bindings: input: qcom,pm8921-keypad: convert to YAML format of: overlay: unittest: overlay_bad_unresolved: Spelling s/ok/okay/ of: address: Consolidate bus .map() functions of: address: Store number of bus flag cells rather than bool of: unittest: Add tests for address translations of: address: Remove duplicated functions of: address: Fix address translation when address-size is greater than 2 dt-bindings: watchdog: cnxt,cx92755-wdt: convert txt to yaml dt-bindings: watchdog: da9062-wdt: convert txt to yaml dt-bindings: watchdog: fsl,scu-wdt: Document imx8dl dt-bindings: watchdog: atmel,at91rm9200-wdt: convert txt to yaml dt-bindings: usb: rockchip,dwc3: update inno usb2 phy binding name ...
2023-11-01Merge tag 'platform-drivers-x86-v6.7-1' of ↵Linus Torvalds3-51/+168
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver updates from Ilpo Järvinen: - asus-wmi: Support for screenpad and solve brightness key press duplication - int3472: Eliminate the last use of deprecated GPIO functions - mlxbf-pmc: New HW support - msi-ec: Support new EC configurations - thinkpad_acpi: Support reading aux MAC address during passthrough - wmi: Fixes & improvements - x86-android-tablets: Detection fix and avoid use of GPIO private APIs - Debug & metrics interface improvements - Miscellaneous cleanups / fixes / improvements * tag 'platform-drivers-x86-v6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (80 commits) platform/x86: inspur-platform-profile: Add platform profile support platform/x86: thinkpad_acpi: Add battery quirk for Thinkpad X120e platform/x86: wmi: Decouple WMI device removal from wmi_block_list platform/x86: wmi: Fix opening of char device platform/x86: wmi: Fix probe failure when failing to register WMI devices platform/x86: wmi: Fix refcounting of WMI devices in legacy functions platform/x86: wmi: Decouple probe deferring from wmi_block_list platform/x86/amd/hsmp: Fix iomem handling platform/x86: asus-wmi: Do not report brightness up/down keys when also reported by acpi_video platform/x86: thinkpad_acpi: replace deprecated strncpy with memcpy tools/power/x86/intel-speed-select: v1.18 release tools/power/x86/intel-speed-select: Use cgroup isolate for CPU 0 tools/power/x86/intel-speed-select: Increase max CPUs in one request tools/power/x86/intel-speed-select: Display error for core-power support tools/power/x86/intel-speed-select: No TRL for non compute domains tools/power/x86/intel-speed-select: turbo-mode enable disable swapped tools/power/x86/intel-speed-select: Update help for TRL tools/power/x86/intel-speed-select: Sanitize integer arguments platform/x86: acer-wmi: Remove void function return platform/x86/amd/pmc: Add dump_custom_stb module parameter ...