path: root/tools
Age | Commit message | Author | Files | Lines
2025-07-11 | perf tests bp_account: Fix leaked file descriptor | Leo Yan | 1 | -0/+1
Since the commit e9846f5ead26 ("perf test: In forked mode add check that fds aren't leaked"), the test "Breakpoint accounting" reports the error: # perf test -vvv "Breakpoint accounting" 20: Breakpoint accounting: --- start --- test child forked, pid 373 failed opening event 0 failed opening event 0 watchpoints count 4, breakpoints count 6, has_ioctl 1, share 0 wp 0 created wp 1 created wp 2 created wp 3 created wp 0 modified to bp wp max created ---- end(0) ---- Leak of file descriptor 7 that opened: 'anon_inode:[perf_event]' A watchpoint's file descriptor was not properly released. This patch fixes the leak. Fixes: 032db28e5fa3 ("perf tests: Add breakpoint accounting/modify test") Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com> Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250711-perf_fix_breakpoint_accounting-v1-1-b314393023f9@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
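The kind of leak check that exposed this bug can be illustrated with a minimal sketch. This is not the perf implementation: the helper names `open_fds` and `leaked_fds` are hypothetical, and the sketch only compares the set of open descriptors before and after an action.

```python
import os


def open_fds(limit=1024):
    """Return the set of descriptors in [0, limit) that are currently open."""
    fds = set()
    for fd in range(limit):
        try:
            os.fstat(fd)  # raises OSError (EBADF) when fd is not open
        except OSError:
            continue
        fds.add(fd)
    return fds


def leaked_fds(action):
    """Run action() and return descriptors that are open afterwards but were not before."""
    before = open_fds()
    action()
    return open_fds() - before
```

A test body that opens a watchpoint event and forgets to close it would show up as a non-empty result from `leaked_fds`, which is the situation the one-line fix above addresses.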
2025-07-11 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski | 2 | -11/+18
Cross-merge networking fixes after downstream PR (net-6.16-rc6-2). No conflicts. Adjacent changes: drivers/net/wireless/mediatek/mt76/mt7925/mcu.c c701574c5412 ("wifi: mt76: mt7925: fix invalid array index in ssid assignment during hw scan") b3a431fe2e39 ("wifi: mt76: mt7925: fix off by one in mt7925_mcu_hw_scan()") drivers/net/wireless/mediatek/mt76/mt7996/mac.c 62da647a2b20 ("wifi: mt76: mt7996: Add MLO support to mt7996_tx_check_aggr()") dc66a129adf1 ("wifi: mt76: add a wrapper for wcid access with validation") drivers/net/wireless/mediatek/mt76/mt7996/main.c 3dd6f67c669c ("wifi: mt76: Move RCU section in mt7996_mcu_add_rate_ctrl()") 8989d8e90f5f ("wifi: mt76: mt7996: Do not set wcid.sta to 1 in mt7996_mac_sta_event()") net/mac80211/cfg.c 58fcb1b4287c ("wifi: mac80211: reject VHT opmode for unsupported channel widths") 037dc18ac3fb ("wifi: mac80211: add support for storing station S1G capabilities") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | objtool: Add vpanic() to the noreturn list | Nam Cao | 1 | -0/+1
vpanic() does not return. However, objtool doesn't know this and gets confused: kernel/trace/rv/reactor_panic.o: warning: objtool: rv_panic_reaction(): unexpected end of section .text Add vpanic() to the list of noreturn functions. Cc: John Ogness <john.ogness@linutronix.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/073f826ebec18b2bb59cba88606cd865d8039fd2.1752232374.git.namcao@linutronix.de Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202507110826.2ekbVdWZ-lkp@intel.com/ Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-11 | selftests/bpf: Range analysis test case for JSET | Paul Chaignon | 1 | -0/+18
This patch adds coverage for the warning detected by syzkaller and fixed in the previous patch. Without the previous patch, this test fails with: verifier bug: REG INVARIANTS VIOLATION (false_reg1): range bounds violation u64=[0x0, 0x0] s64=[0x0, 0x0] u32=[0x1, 0x0] s32=[0x0, 0x0] var_off=(0x0, 0x0)(1) Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/c7893be1170fdbcf64e0200c110cdbd360ce7086.1752171365.git.paul.chaignon@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-11 | selftests/bpf: add selftests for bpf_arena_reserve_pages | Emil Tsalapatis | 3 | -0/+207
Add selftests for the new bpf_arena_reserve_pages kfunc. Acked-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250709191312.29840-3-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-11 | iommufd/selftest: Update hw_info coverage for an input data_type | Nicolin Chen | 3 | -23/+46
Test both IOMMU_HW_INFO_TYPE_DEFAULT and IOMMU_HW_INFO_TYPE_SELFTEST, and add a negative test for an unsupported type. Also drop the unused mask in test_cmd_get_hw_capabilities() as checkpatch is complaining. Link: https://patch.msgid.link/r/f01a1e50cd7366f217cbf192ad0b2b79e0eb89f0.1752126748.git.nicolinc@nvidia.com Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-11 | iommufd/selftest: Add coverage for the new mmap interface | Nicolin Chen | 2 | -0/+23
Extend the loopback test to a new mmap page. Link: https://patch.msgid.link/r/b02b1220c955c3cf9ea5dd9fe9349ab1b4f8e20b.1752126748.git.nicolinc@nvidia.com Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-11 | selftests: drv-net: Add bpftool util | Mohsin Bashir | 2 | -1/+5
Add bpf utility to simplify the use of bpftool for XDP tests included in this series. Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20250710184351.63797-2-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | iommufd/selftest: Add coverage for IOMMUFD_CMD_HW_QUEUE_ALLOC | Nicolin Chen | 3 | -0/+96
Some simple tests for IOMMUFD_CMD_HW_QUEUE_ALLOC infrastructure covering the new iommufd_hw_queue_depend/undepend() helpers. Link: https://patch.msgid.link/r/e8a194d187d7ef445f43e4a3c04fb39472050afd.1752126748.git.nicolinc@nvidia.com Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-11 | perf bench futex: Remove support for IMMUTABLE | Sebastian Andrzej Siewior | 9 | -26/+5
It has been decided to remove support for the IMMUTABLE futex. perf bench was one of the early users, for testing purposes. Now that the API has been removed before it could be used in an official release, remove the bits from perf, too. Remove support for IMMUTABLE futex. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250710110011.384614-7-bigeasy@linutronix.de
2025-07-11 | selftests/futex: Remove support for IMMUTABLE | Sebastian Andrzej Siewior | 1 | -49/+22
Testing for the IMMUTABLE part of the futex interface is not needed after the removal of the interface. Remove support for IMMUTABLE from the selftest. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250710110011.384614-6-bigeasy@linutronix.de
2025-07-11 | selftests/futex: Adapt the private hash test to RCU related changes | Sebastian Andrzej Siewior | 1 | -1/+41
The auto scaling on thread creation used to automatically assign the new hash because the private hash was unused and could be replaced right away. This is already racy because if the private hash is in use by a thread then the visible resize will be delayed. With the upcoming change to wait for an RCU grace period before the hash can be assigned, the test will always fail. If the reported number of hash buckets is not updated after an auto scaling event, block on an acquired lock with a timeout. The timeout is the delay to wait for a grace period, and blocking on a locked pthread_mutex_t ensures that glibc calls into the kernel using a futex operation, which will assign the new private hash if available. This will retry every 100ms up to 2 seconds in total. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250710110011.384614-2-bigeasy@linutronix.de
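The retry schedule described above (every 100ms, up to 2 seconds) can be sketched as a bounded polling loop. This is only an illustration: the real selftest blocks on a locked pthread_mutex_t with a timeout so that glibc issues a futex syscall, whereas this sketch substitutes a plain sleep, and the name `wait_until` is hypothetical.

```python
import time


def wait_until(predicate, interval=0.1, timeout=2.0):
    """Poll predicate() every `interval` seconds until it is true or `timeout` elapses.

    Returns True if the predicate became true, False if the deadline passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With the defaults, a caller would pass a predicate that re-reads the reported number of hash buckets and compares it with the expected value.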
2025-07-11 | selftests: net: lib: fix shift count out of range | Hangbin Liu | 1 | -1/+1
I got the following warning when writing other tests: + handle_test_result_pass 'bond 802.3ad' '(lacp_active off)' + local 'test_name=bond 802.3ad' + shift + local 'opt_str=(lacp_active off)' + shift + log_test_result 'bond 802.3ad' '(lacp_active off)' ' OK ' + local 'test_name=bond 802.3ad' + shift + local 'opt_str=(lacp_active off)' + shift + local 'result= OK ' + shift + local retmsg= + shift /net/tools/testing/selftests/net/forwarding/../lib.sh: line 315: shift: shift count out of range This happens because an extra shift is executed even after all arguments have been consumed. Remove the last shift in log_test_result() to avoid this warning. Fixes: a923af1ceee7 ("selftests: forwarding: Convert log_test() to recognize RET values") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20250709091244.88395-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | selftests: Add IPv6 multicast route generation tests for GRE devices. | Guillaume Nault | 1 | -10/+17
The previous patch fixes a bug that prevented the creation of the default IPv6 multicast route (ff00::/8) for some GRE devices. Now let's extend the GRE IPv6 selftests to cover this case. Also, rename check_ipv6_ll_addr() to check_ipv6_device_config() and adapt comments and script output to take into account the fact that we're not limited to link-local address generation. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/65a89583bde3bf866a1922c2e5158e4d72c520e2.1752070620.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | selftests: drv-net: test RSS header field configuration | Jakub Kicinski | 1 | -0/+47
Test reading RXFH fields over IOCTL and netlink. # ./tools/testing/selftests/drivers/net/hw/rss_api.py TAP version 13 1..3 ok 1 rss_api.test_rxfh_indir_ntf ok 2 rss_api.test_rxfh_indir_ctx_ntf ok 3 rss_api.test_rxfh_fields # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0 Link: https://patch.msgid.link/20250708220640.2738464-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11 | tools: ynl: decode enums in auto-ints | Jakub Kicinski | 1 | -0/+2
Use enum decoding on auto-ints. Looks like we only had enum auto-ints for input values until now. Upcoming RSS work will need this to declare an attribute with flags as a uint. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20250708220640.2738464-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10 | selftests: breakpoints: use suspend_stats to reliably check suspend success | Moon Hee Lee | 1 | -10/+31
The step_after_suspend_test verifies that the system successfully suspended and resumed by setting a timerfd and checking whether the timer fully expired. However, this method is unreliable due to timing races. In practice, the system may take time to enter suspend, during which the timer may expire just before or during the transition. As a result, the remaining time after resume may show non-zero nanoseconds, even if suspend/resume completed successfully. This leads to false test failures. Replace the timer-based check with a read from /sys/power/suspend_stats/success. This counter is incremented only after a full suspend/resume cycle, providing a reliable and race-free indicator. Also remove the unused file descriptor for /sys/power/state, which remained after switching to a system() call to trigger suspend [1]. [1] https://lore.kernel.org/all/20240930224025.2858767-1-yifei.l.liu@oracle.com/ Link: https://lore.kernel.org/r/20250626191626.36794-1-moonhee.lee.ca@gmail.com Fixes: c66be905cda2 ("selftests: breakpoints: use remaining time to check if suspend succeed") Signed-off-by: Moon Hee Lee <moonhee.lee.ca@gmail.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
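The counter-based check can be sketched as follows. This is a minimal illustration rather than the selftest's code (which is written in C); the helper names are hypothetical, and the only detail taken from the commit message is the path /sys/power/suspend_stats/success.

```python
SUCCESS_PATH = "/sys/power/suspend_stats/success"  # path named in the commit message


def parse_counter(text):
    """Parse the decimal counter exposed by the sysfs file."""
    return int(text.strip())


def read_counter(path=SUCCESS_PATH):
    """Read the current number of successful suspend/resume cycles."""
    with open(path) as f:
        return parse_counter(f.read())


def suspend_succeeded(before, after):
    """A full suspend/resume cycle happened if the counter advanced."""
    return after > before
```

A caller would read the counter, trigger the suspend, read the counter again, and pass both values to `suspend_succeeded`, so the result no longer depends on how much time remains on a timer.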
2025-07-10 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski | 7 | -2/+100
Cross-merge networking fixes after downstream PR (net-6.16-rc6). No conflicts. Adjacent changes: Documentation/devicetree/bindings/net/allwinner,sun8i-a83t-emac.yaml 0a12c435a1d6 ("dt-bindings: net: sun8i-emac: Add A100 EMAC compatible") b3603c0466a8 ("dt-bindings: net: sun8i-emac: Rename A523 EMAC0 to GMAC0") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-10 | Merge tag 'net-6.16-rc6' of ↵ | Linus Torvalds | 2 | -0/+90
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from Bluetooth. Current release - regressions: - tcp: refine sk_rcvbuf increase for ooo packets - bluetooth: fix attempting to send HCI_Disconnect to BIS handle - rxrpc: fix over large frame size warning - eth: bcmgenet: initialize u64 stats seq counter Previous releases - regressions: - tcp: correct signedness in skb remaining space calculation - sched: abort __tc_modify_qdisc if parent class does not exist - vsock: fix transport_{g2h,h2g} TOCTOU - rxrpc: fix bug due to prealloc collision - tipc: fix use-after-free in tipc_conn_close(). - bluetooth: fix not marking Broadcast Sink BIS as connected - phy: qca808x: fix WoL issue by utilizing at8031_set_wol() - eth: am65-cpsw-nuss: fix skb size by accounting for skb_shared_info Previous releases - always broken: - netlink: fix wraparounds of sk->sk_rmem_alloc. - atm: fix infinite recursive call of clip_push(). - eth: - stmmac: fix interrupt handling for level-triggered mode in DWC_XGMAC2 - rtsn: fix a null pointer dereference in rtsn_probe()" * tag 'net-6.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits) net/sched: sch_qfq: Fix null-deref in agg_dequeue rxrpc: Fix oops due to non-existence of prealloc backlog struct rxrpc: Fix bug due to prealloc collision MAINTAINERS: remove myself as netronome maintainer selftests/net: packetdrill: add tcp_ooo-before-and-after-accept.pkt tcp: refine sk_rcvbuf increase for ooo packets net/sched: Abort __tc_modify_qdisc if parent class does not exist net: ethernet: ti: am65-cpsw-nuss: Fix skb size by accounting for skb_shared_info net: thunderx: avoid direct MTU assignment after WRITE_ONCE() selftests/tc-testing: Create test case for UAF scenario with DRR/NETEM/BLACKHOLE chain atm: clip: Fix NULL pointer dereference in vcc_sendmsg() atm: clip: Fix infinite recursive call of clip_push(). atm: clip: Fix memory leak of struct clip_vcc. 
atm: clip: Fix potential null-ptr-deref in to_atmarpd(). net: phy: smsc: Fix link failure in forced mode with Auto-MDIX net: phy: smsc: Force predictable MDI-X state on LAN87xx net: phy: smsc: Fix Auto-MDIX configuration when disabled by strap net: stmmac: Fix interrupt handling for level-triggered mode in DWC_XGMAC2 rxrpc: Fix over large frame size warning net: airoha: Fix an error handling path in airoha_probe() ...
2025-07-10 | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm | Linus Torvalds | 1 | -0/+1
Pull KVM fixes from Paolo Bonzini: "Many patches, pretty much all of them small, that accumulated while I was on vacation. ARM: - Remove the last leftovers of the ill-fated FPSIMD host state mapping at EL2 stage-1 - Fix unexpected advertisement to the guest of unimplemented S2 base granule sizes - Gracefully fail initialising pKVM if the interrupt controller isn't GICv3 - Also gracefully fail initialising pKVM if the carveout allocation fails - Fix the computing of the minimum MMIO range required for the host on stage-2 fault - Fix the generation of the GICv3 Maintenance Interrupt in nested mode x86: - Reject SEV{-ES} intra-host migration if one or more vCPUs are actively being created, so as not to create a non-SEV{-ES} vCPU in an SEV{-ES} VM - Use a pre-allocated, per-vCPU buffer for handling de-sparsification of vCPU masks in Hyper-V hypercalls; fixes a "stack frame too large" issue - Allow out-of-range/invalid Xen event channel ports when configuring IRQ routing, to avoid dictating a specific ioctl() ordering to userspace - Conditionally reschedule when setting memory attributes to avoid soft lockups when userspace converts huge swaths of memory to/from private - Add back MWAIT as a required feature for the MONITOR/MWAIT selftest - Add a missing field in struct sev_data_snp_launch_start that resulted in the guest-visible workarounds field being filled at the wrong offset - Skip non-canonical address when processing Hyper-V PV TLB flushes to avoid VM-Fail on INVVPID - Advertise supported TDX TDVMCALLs to userspace - Pass SetupEventNotifyInterrupt arguments to userspace - Fix TSC frequency underflow" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: avoid underflow when scaling TSC frequency KVM: arm64: Remove kvm_arch_vcpu_run_map_fp() KVM: arm64: Fix handling of FEAT_GTG for unimplemented granule sizes KVM: arm64: Don't free hyp pages with pKVM on GICv2 KVM: arm64: Fix error path in init_hyp_mode() KVM: arm64: Adjust range correctly 
during host stage-2 faults KVM: arm64: nv: Fix MI line level calculation in vgic_v3_nested_update_mi() KVM: x86/hyper-v: Skip non-canonical addresses during PV TLB flush KVM: SVM: Add missing member in SNP_LAUNCH_START command structure Documentation: KVM: Fix unexpected unindent warnings KVM: selftests: Add back the missing check of MONITOR/MWAIT availability KVM: Allow CPU to reschedule while setting per-page memory attributes KVM: x86/xen: Allow 'out of range' event channel ports in IRQ routing table. KVM: x86/hyper-v: Use preallocated per-vCPU buffer for de-sparsified vCPU masks KVM: SVM: Initialize vmsa_pa in VMCB to INVALID_PAGE if VMSA page is NULL KVM: SVM: Reject SEV{-ES} intra host migration if vCPU creation is in-flight KVM: TDX: Report supported optional TDVMCALLs in TDX capabilities KVM: TDX: Exit to userspace for SetupEventNotifyInterrupt
2025-07-10 | iommufd/selftest: Add coverage for viommu data | Nicolin Chen | 3 | -17/+40
Extend the existing test_cmd/err_viommu_alloc helpers to accept optional user data. And add a TEST_F for a loopback test. Link: https://patch.msgid.link/r/8ceb64d30e9953f29270a7d341032ca439317271.1752126748.git.nicolinc@nvidia.com Reviewed-by: Pranjal Shrivastava <praan@google.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-10 | selftests/hid: sync python tests to hid-tools 0.10 | Benjamin Tissoires | 1 | -0/+19
hid-tools 0.10 fixes one inconvenience introduced by commit 6a9e76f75c1a ("HID: multitouch: Disable touchpad on firmware level while not in use"). This change added a new callback when a hid-multitouch device is opened or closed to put the underlying device into a given operating mode. However, in the test cases, that means that while the single threaded test is run, it opens the device but has to react to the device while the open() is still running. hid-tools now implements a minimal thread to circumvent this. This makes the HID kernel tests in sync with hid-tools 0.10. This has the net effect of running the full HID python testsuite in 6 minutes instead of 1 hour. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Link: https://patch.msgid.link/20250709-wip-fix-ci-v1-3-b7df4c271cf8@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2025-07-10 | selftests/hid: sync the python tests to hid-tools 0.8 | Benjamin Tissoires | 9 | -43/+69
Instead of backporting each commit one by one, let's pull them in bulk and refer to the hid-tools project for a detailed history. The short summary is: - make use of dataclass when possible, to avoid tuples - wacom: remove unused uhdev parameter - various small fixes not worth mentioning Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Link: https://patch.msgid.link/20250709-wip-fix-ci-v1-2-b7df4c271cf8@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2025-07-10 | selftests/hid: run ruff format on the python part | Benjamin Tissoires | 2 | -115/+325
We aim at syncing with the hid-tools repo on gitlab.freedesktop.org/libevdev/hid-tools. One of the commits is this mechanical formatting, so pull it over here so changes are not hidden by those. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Link: https://patch.msgid.link/20250709-wip-fix-ci-v1-1-b7df4c271cf8@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2025-07-10 | KVM: selftests: Add CONFIG_EVENTFD for irqfd selftest | Mark Brown | 1 | -0/+1
In 7e9b231c402a ("KVM: selftests: Add a KVM_IRQFD test to verify uniqueness requirements") we added a test for the newly added irqfd support, but since this feature works with eventfds it won't work unless the kernel has been built with eventfd support. Add CONFIG_EVENTFD to the list of required options for the KVM selftests. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250710-kvm-selftests-eventfd-config-v1-1-78c276e4b80f@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-07-10 | net: xsk: introduce XDP_MAX_TX_SKB_BUDGET setsockopt | Jason Xing | 1 | -0/+1
This patch provides a setsockopt method that lets applications adjust the maximum number of descs handled in one send syscall. It mitigates the situation where the default value (32) is too small and leads to a higher frequency of send syscalls. Considering the diversity and complexity of applications, there is no single ideal value that fits all cases, so keep 32 as the default as before. The patch does the following things: - Add XDP_MAX_TX_SKB_BUDGET socket option. - Set max_tx_budget to 32 by default in the initialization phase as a per-socket granular control. - Set the range of max_tx_budget as [32, xs->tx->nentries]. The idea behind this comes out of real workloads in production. We use a user-level stack with xsk support to accelerate sending packets and minimize triggering syscalls. When the packets are aggregated, it's not hard to hit the upper bound (namely, 32). The moment the user-space stack fetches the -EAGAIN error number passed from sendto(), it will loop to try again until all the expected descs from the tx ring are sent out to the driver. Enlarging the XDP_MAX_TX_SKB_BUDGET value contributes to fewer sendto() calls and higher throughput/PPS. Here is what I did in production, along with some numbers as follows: For one application I saw lately, I suggested using 128 as max_tx_budget because I saw two limitations without changing any default configuration: 1) XDP_MAX_TX_SKB_BUDGET, 2) socket sndbuf which is 212992 decided by net.core.wmem_default. As to XDP_MAX_TX_SKB_BUDGET, I counted how many descs are transmitted to the driver per sendto() call, based on patch [1], and then calculated the probability of hitting the upper bound. Finally I chose 128 as a suitable value because 1) it covers most of the cases, 2) a higher number would not bring evident results. 
After tweaking the parameters, a stable improvement of around 4% in both PPS and throughput was found, and lower resource consumption was observed with strace -c -p xxx: 1) %time decreased by 7.8% 2) the error counter decreased from 18367 to 572 [1]: https://lore.kernel.org/all/20250619093641.70700-1-kerneljasonxing@gmail.com/ Signed-off-by: Jason Xing <kernelxing@tencent.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20250704160138.48677-1-kerneljasonxing@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
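The effect of the budget on syscall count can be illustrated with simple arithmetic. The sketch below is a rough model, not kernel code: it assumes every sendto() call transmits up to the budget and that nothing else (such as the socket sndbuf) limits a call. The function names are hypothetical; the default of 32 and the range [32, tx ring entries] come from the commit message.

```python
import math

DEFAULT_BUDGET = 32  # default value stated in the commit message


def budget_in_range(budget, tx_ring_entries, minimum=DEFAULT_BUDGET):
    """Check a requested budget against the documented range [32, tx ring entries]."""
    return minimum <= budget <= tx_ring_entries


def min_send_calls(pending_descs, budget):
    """Lower bound on sendto() calls needed to hand `pending_descs` to the driver."""
    return math.ceil(pending_descs / budget)
```

Under these assumptions, draining 512 pending descs needs at least 16 calls with a budget of 32 and at least 4 calls with a budget of 128, which matches the direction of the reported drop in error (retry) counts.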
2025-07-10 | selftests: ptrace: add set_syscall_info to .gitignore | Moon Hee Lee | 1 | -0/+1
Add the set_syscall_info test binary to .gitignore to avoid tracking build artifacts in the ptrace selftests directory. Link: https://lkml.kernel.org/r/20250623183405.133434-2-moonhee.lee.ca@gmail.com Signed-off-by: Moon Hee Lee <moonhee.lee.ca@gmail.com> Cc: "Dmitry V. Levin" <ldv@strace.io> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10 | tools/accounting/delaytop: add delaytop to record top-n task delay | Yaxin Wang | 2 | -1/+674
Problem
=======
The "getdelays" can only display the latency of a single task by specifying a PID, but it has the following limitations:
1. single-task perspective: only supports querying the latency (CPU, I/O, memory, etc.) of an individual task via PID and cannot provide a global analysis of high-latency processes across the system.
2. lack of High-Latency process awareness: when the overall system latency is high (e.g., a spike in CPU latency), there is no way to quickly identify the top N processes contributing to the highest latency.
3. poor interactivity: It lacks dynamic sorting and refresh capabilities (similar to top), making it difficult to monitor latency changes in real time.

Solution
========
To address these limitations, we introduce the "delaytop" with the following capabilities:
1. system view: monitors latency metrics (CPU, I/O, memory, IRQ, etc.) for all system processes
2. supports field-based sorting (e.g., default sort by CPU latency in descending order)
3. dynamic interactive interface: focus on specific processes with --pid; limit displayed entries with --processes 20; control monitoring duration with --iterations;

Use case
========
bash# ./delaytop
Top 20 processes (sorted by CPU delay):
  PID  TGID  COMMAND           CPU(ms)   IO(ms) SWAP(ms)  RCL(ms)  THR(ms)  CMP(ms)   WP(ms)  IRQ(ms)
-----------------------------------------------------------------------------------------------------
   26    26  kworker/1:0H         5.55     0.00     0.00     0.00     0.00     0.00     0.00     0.00
   32    32  kworker/2:0H-kb      2.93     0.00     0.00     0.00     0.00     0.00     0.00     0.00
   38    38  kworker/3:0H-ev      2.88     0.00     0.00     0.00     0.00     0.00     0.00     0.00
   84    84  kworker/R-vfio-      1.62     0.00     0.00     0.00     0.00     0.00     0.00     0.00
   24    24  ksoftirqd/1          1.43     0.00     0.00     0.00     0.00     0.00     0.00     0.00
   19    19  idle_inject/0        0.99     0.00     0.00     0.00     0.00     0.00     0.00     0.00
   16    16  rcu_exp_par_gp_      0.87     0.00     0.00     0.00     0.00     0.00     0.00     0.00
   11    11  kworker/0:1          0.87     0.00     0.00     0.00     0.00     0.00     0.00     0.00
   22    22  idle_inject/1        0.80     0.00     0.00     0.00     0.00     0.00     0.00     0.00
    3     3  pool_workqueue_      0.74     0.00     0.00     0.00     0.00     0.00     0.00     0.00
   81    81  scsi_eh_1            0.59     0.00     0.00     0.00     0.00     0.00     0.00     0.00
   30    30  ksoftirqd/2          0.42     0.00     0.00     0.00     0.00     0.00     0.00     0.00
   36    36  ksoftirqd/3          0.37     0.00     0.00     0.00     0.00     0.00     0.00     0.00
    9     9  kworker/0:0-eve      0.36     0.00     0.00     0.00     0.00     0.00     0.00     0.00
    8     8  kworker/R-netns      0.34     0.00     0.00     0.00     0.00     0.00     0.00     0.00
   76    76  kworker/1:1-pm       0.32     0.00     0.00     0.00     0.00     0.00     0.00     0.00
   21    21  cpuhp/1              0.30     0.00     0.00     0.00     0.00     0.00     0.00     0.00
    4     4  kworker/R-rcu_g      0.21     0.00     0.00     0.00     0.00     0.00     0.00     0.00
   12    12  kworker/u16:0-i      0.20     0.00     0.00     0.00     0.00     0.00     0.00     0.00
    1     1  init                 0.18     0.00     0.00     0.00     0.00     0.00     0.08     0.00

Link: https://lkml.kernel.org/r/20250619211843633h05gWrBDMFkEH6xAVm_5y@zte.com.cn
Co-developed-by: Fan Yu <fan.yu9@zte.com.cn>
Signed-off-by: Fan Yu <fan.yu9@zte.com.cn>
Signed-off-by: Yaxin Wang <wang.yaxin@zte.com.cn>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Peilin He <he.peilin@zte.com.cn>
Cc: Qiang Tu <tu.qiang35@zte.com.cn>
Cc: wangyong <wang.yong12@zte.com.cn>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Cc: ye xingchen <ye.xingchen@zte.com.cn>
Cc: Yunkai Zhang <zhang.yunkai@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10 | ksm_tests: skip hugepage test when Transparent Hugepages are disabled | Li Wang | 6 | -1/+40
Some systems (e.g. minimal or real-time kernels) may not enable Transparent Hugepages (THP), causing MADV_HUGEPAGE to return EINVAL. This patch introduces a runtime check using the existing THP sysfs interface and skips the hugepage merging test (`-H`) when THP is not available. To avoid those failures: # ----------------------------- # running ./ksm_tests -H -s 100 # ----------------------------- # ksm_tests: MADV_HUGEPAGE: Invalid argument # [FAIL] not ok 1 ksm_tests -H -s 100 # exit=2 # -------------------- # running ./khugepaged # -------------------- # Reading PMD pagesize failed# [FAIL] not ok 1 khugepaged # exit=1 # -------------------- # running ./soft-dirty # -------------------- # TAP version 13 # 1..15 # ok 1 Test test_simple # ok 2 Test test_vma_reuse dirty bit of allocated page # ok 3 Test test_vma_reuse dirty bit of reused address page # Bail out! Reading PMD pagesize failed# Planned tests != run tests (15 != 3) # # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0 # [FAIL] not ok 1 soft-dirty # exit=1 # SUMMARY: PASS=0 SKIP=0 FAIL=1 # ------------------- # running ./migration # ------------------- # TAP version 13 # 1..3 # # Starting 3 tests from 1 test cases. # # RUN migration.private_anon ... # # OK migration.private_anon # ok 1 migration.private_anon # # RUN migration.shared_anon ... # # OK migration.shared_anon # ok 2 migration.shared_anon # # RUN migration.private_anon_thp ... # # migration.c:196:private_anon_thp:Expected madvise(ptr, TWOMEG, MADV_HUGEPAGE) (-1) == 0 (0) # # private_anon_thp: Test terminated by assertion # # FAIL migration.private_anon_thp # not ok 3 migration.private_anon_thp # # FAILED: 2 / 3 tests passed. # # Totals: pass:2 fail:1 xfail:0 xpass:0 skip:0 error:0 # [FAIL] not ok 1 migration # exit=1 It's true that CONFIG_TRANSPARENT_HUGEPAGE=y is explicitly enabled in tools/testing/selftests/mm/config, so ideally the runtime environment should also support THP. 
However, in practice, we've found that on some systems: - THP is disabled at boot time (transparent_hugepage=never) - Or manually disabled via sysfs - Or unavailable in RT kernels, containers, or minimal CI environments In these cases, the test will fail with EINVAL on madvise(MADV_HUGEPAGE), even though the kernel config is correct. To make the test suite more robust and avoid false negatives, this patch adds a runtime check for /sys/kernel/mm/transparent_hugepage/enabled. If THP is not available, the hugepage test (-H) is skipped with a clear message. Link: https://lkml.kernel.org/r/20250624032748.393836-1-liwang@redhat.com Signed-off-by: Li Wang <liwang@redhat.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
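A runtime check of this kind can be sketched as below. It is an illustration only and may differ from what the patch actually tests: it assumes the sysfs file lists the available modes with the active one in square brackets (for example `always [madvise] never`), and the helper names are hypothetical.

```python
import re

THP_ENABLED_PATH = "/sys/kernel/mm/transparent_hugepage/enabled"  # path from the commit message


def selected_thp_mode(text):
    """Return the bracketed (active) mode from the sysfs text, or None if absent."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def thp_usable(path=THP_ENABLED_PATH):
    """Treat THP as unavailable when the file is missing or the active mode is 'never'."""
    try:
        with open(path) as f:
            mode = selected_thp_mode(f.read())
    except OSError:
        return False
    return mode is not None and mode != "never"
```

A test runner could call `thp_usable()` up front and report a skip, instead of a failure, for cases that depend on madvise(MADV_HUGEPAGE).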
2025-07-10 | selftests/mm: fix UFFDIO_API usage with proper two-step feature negotiation | Li Wang | 1 | -2/+26
The current implementation of test_unmerge_uffd_wp() explicitly sets `uffdio_api.features = UFFD_FEATURE_PAGEFAULT_FLAG_WP` before calling UFFDIO_API. This can cause the ioctl() call to fail with EINVAL on kernels that do not support UFFD-WP, leading the test to fail unnecessarily: # ------------------------------ # running ./ksm_functional_tests # ------------------------------ # TAP version 13 # 1..9 # # [RUN] test_unmerge # ok 1 Pages were unmerged # # [RUN] test_unmerge_zero_pages # ok 2 KSM zero pages were unmerged # # [RUN] test_unmerge_discarded # ok 3 Pages were unmerged # # [RUN] test_unmerge_uffd_wp # not ok 4 UFFDIO_API failed <----- # # [RUN] test_prot_none # ok 5 Pages were unmerged # # [RUN] test_prctl # ok 6 Setting/clearing PR_SET_MEMORY_MERGE works # # [RUN] test_prctl_fork # # No pages got merged # # [RUN] test_prctl_fork_exec # ok 7 PR_SET_MEMORY_MERGE value is inherited # # [RUN] test_prctl_unmerge # ok 8 Pages were unmerged # Bail out! 1 out of 8 tests failed # # Planned tests != run tests (9 != 8) # # Totals: pass:7 fail:1 xfail:0 xpass:0 skip:0 error:0 # [FAIL] This patch improves compatibility and robustness of the UFFD-WP test (test_unmerge_uffd_wp) by correctly implementing the UFFDIO_API two-step handshake as recommended by the userfaultfd(2) man page. Key changes: 1. Use features=0 in the initial UFFDIO_API call to query supported feature bits, rather than immediately requesting WP support. 2. Skip the test gracefully if: - UFFDIO_API fails with EINVAL (e.g. unsupported API version), or - UFFD_FEATURE_PAGEFAULT_FLAG_WP is not advertised by the kernel. 3. Close the initial userfaultfd and create a new one before enabling the required feature, since UFFDIO_API can only be called once per fd. 4. Improve diagnostics by distinguishing between expected and unexpected failures, using strerror() to report errors. 
This ensures the test behaves correctly across a wider range of kernel versions and configurations, while preserving the intended behavior on kernels that support UFFD-WP. [liwang@redhat.com: fail the test if sys_userfaultfd() fails, per David] Link: https://lkml.kernel.org/r/20250625004645.400520-1-liwang@redhat.com Link: https://lkml.kernel.org/r/20250624042411.395285-1-liwang@redhat.com Signed-off-by: Li Wang <liwang@redhat.com> Suggested-by: David Hildenbrand <david@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Li Wang <liwang@redhat.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
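The two-step negotiation described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: the WP_* result codes and the functions uffd_open() and uffd_negotiate_wp() are hypothetical names, and the sketch assumes kernel headers recent enough to define UFFD_FEATURE_PAGEFAULT_FLAG_WP.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <linux/userfaultfd.h>

/* Hypothetical result codes for this sketch. */
enum { WP_READY = 0, WP_SKIP = 1, WP_FAIL = 2 };

static int uffd_open(void)
{
	return (int)syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
}

/* On WP_READY, *out_fd holds a userfaultfd with WP faults enabled. */
static int uffd_negotiate_wp(int *out_fd)
{
	struct uffdio_api api;
	int fd, err;

	assert(out_fd != NULL);
	*out_fd = -1;

	/* Step 1: features = 0 only queries what the kernel supports. */
	fd = uffd_open();
	if (fd < 0)
		return WP_FAIL;		/* the syscall itself failed */
	memset(&api, 0, sizeof(api));
	api.api = UFFD_API;
	if (ioctl(fd, UFFDIO_API, &api) < 0) {
		err = errno;
		close(fd);
		return err == EINVAL ? WP_SKIP : WP_FAIL;
	}
	close(fd);
	if (!(api.features & UFFD_FEATURE_PAGEFAULT_FLAG_WP))
		return WP_SKIP;		/* WP not advertised */

	/* Step 2: UFFDIO_API works once per fd, so use a fresh one. */
	fd = uffd_open();
	if (fd < 0)
		return WP_FAIL;
	memset(&api, 0, sizeof(api));
	api.api = UFFD_API;
	api.features = UFFD_FEATURE_PAGEFAULT_FLAG_WP;
	if (ioctl(fd, UFFDIO_API, &api) < 0) {
		close(fd);
		return WP_FAIL;
	}
	*out_fd = fd;
	return WP_READY;
}
```

Treating a failing userfaultfd syscall as a hard failure rather than a skip follows the bracketed note in the commit message; every other mapping of errors to skip or fail here is a choice made for the sketch.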
2025-07-10selftests/mm: remove duplicate .gitignore entriesMoon Hee Lee1-3/+0
Remove redundant entries in .gitignore confirmed by: $ sort tools/testing/selftests/mm/.gitignore | uniq -d hugetlb_dio pkey_sighandler_tests_32 pkey_sighandler_tests_64 These entries were originally added by [1], and later duplicated by [2]. [1] https://lore.kernel.org/all/20240924185911.117937-1-lorenzo.stoakes@oracle.com/ [2] https://lore.kernel.org/all/20241125064036.413536-1-lizhijian@fujitsu.com/ Link: https://lkml.kernel.org/r/20250626020758.163243-1-moonhee.lee.ca@gmail.com Signed-off-by: Moon Hee Lee <moonhee.lee.ca@gmail.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10selftests/mm: reduce uffd-unit-test poison test to minimumPeter Xu1-6/+14
The test still generates quite a few unwanted MCE error messages in syslog. There was an old proposal to ratelimit the MCE messages from the kernel, but that risks hiding genuinely useful information on production systems. We can at least reduce the test to a minimum so that it does not over-pollute dmesg, while trying not to lose too much of its coverage. [peterx@redhat.com: reduce uffd-unit-test poison test to minimum] Link: https://lkml.kernel.org/r/aF2RSsjuEOtzXcUa@x1.local Link: https://lkml.kernel.org/r/20250620150058.1729489-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Brendan Jackman <jackmanb@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10selftets/damon: add a test for memcg_path leakSeongJae Park2-0/+44
There was a memory leak bug in DAMOS sysfs memcg_path file. Add a selftest to ensure the bug never comes back. Link: https://lkml.kernel.org/r/20250619183608.6647-3-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10mm: remove callers of pfn_t functionalityAlistair Popple3-11/+3
All PFN_* pfn_t flags have been removed. Therefore there is no longer a need for the pfn_t type and all uses can be replaced with normal pfns. Link: https://lkml.kernel.org/r/bbedfa576c9822f8032494efbe43544628698b1f.1750323463.git-series.apopple@nvidia.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Balbir Singh <balbirs@nvidia.com> Cc: Björn Töpel <bjorn@kernel.org> Cc: Björn Töpel <bjorn@rivosinc.com> Cc: Chunyan Zhang <zhang.lyra@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Deepak Gupta <debug@rivosinc.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Inki Dae <m.szyprowski@samsung.com> Cc: John Groves <john@groves.net> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10mm: remove PFN_DEV, PFN_MAP, PFN_SPECIAL, PFN_SG_CHAIN and PFN_SG_LASTAlistair Popple1-4/+0
The PFN_MAP flag is no longer used for anything, so remove it. The PFN_SG_CHAIN and PFN_SG_LAST flags never appear to have been used so also remove them. The last user of PFN_SPECIAL was removed by 653d7825c149 ("dcssblk: mark DAX broken, remove FS_DAX_LIMITED support"). Users of PFN_DEV were removed earlier in this series by "mm: Remove remaining uses of PFN_DEV". Link: https://lkml.kernel.org/r/670b3950d70b4d97b905bb597dadfd3633de4314.1750323463.git-series.apopple@nvidia.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: Balbir Singh <balbirs@nvidia.com> Cc: Björn Töpel <bjorn@kernel.org> Cc: Björn Töpel <bjorn@rivosinc.com> Cc: Chunyan Zhang <zhang.lyra@gmail.com> Cc: Deepak Gupta <debug@rivosinc.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Inki Dae <m.szyprowski@samsung.com> Cc: John Groves <john@groves.net> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10selftests/udmabuf: add a test to pin first before writing to memfdVivek Kasireddy1-1/+19
Unlike the existing tests, this new test will create a memfd (backed by hugetlb) and pin the folios in it (a small subset) before writing/populating it with data. This is a valid use-case that invokes the memfd_alloc_folio() kernel API and is expected to work unless there aren't enough hugetlb folios to satisfy the allocation needs. Link: https://lkml.kernel.org/r/20250618053415.1036185-4-vivek.kasireddy@intel.com Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Steve Sistare <steven.sistare@oracle.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: David Hildenbrand <david@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10mm: update architecture and driver code to use vm_flags_tLorenzo Stoakes1-1/+1
In future we intend to change the vm_flags_t type, so it isn't correct for architecture and driver code to assume it is unsigned long. Correct this assumption across the board. Overall, this patch does not introduce any functional change. Link: https://lkml.kernel.org/r/b6eb1894abc5555ece80bb08af5c022ef780c8bc.1750274467.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Zi Yan <ziy@nvidia.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <kees@kernel.org> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10mm: update core kernel code to use vm_flags_t consistentlyLorenzo Stoakes2-137/+137
The core kernel code is currently very inconsistent in its use of vm_flags_t vs. unsigned long. This prevents us from changing the type of vm_flags_t in the future and is simply not correct, so correct this. While this results in rather a lot of churn, it is a critical pre-requisite for a future planned change to VMA flag type. Additionally, update VMA userland tests to account for the changes. To make review easier and to break things into smaller parts, driver and architecture-specific changes are left for a subsequent commit. The code has been adjusted to cascade the changes across all calling code as far as is needed. We will adjust architecture-specific and driver code in a subsequent patch. Overall, this patch does not introduce any functional change. Link: https://lkml.kernel.org/r/d1588e7bb96d1ea3fe7b9df2c699d5b4592d901d.1750274467.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Kees Cook <kees@kernel.org> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Jan Kara <jack@suse.cz> Acked-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Zi Yan <ziy@nvidia.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Jann Horn <jannh@google.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10mm: change vm_get_page_prot() to accept vm_flags_t argumentLorenzo Stoakes1-1/+1
Patch series "use vm_flags_t consistently". The VMA flags field vma->vm_flags is of type vm_flags_t. Right now this is exactly equivalent to unsigned long, but it should not be assumed to be. Much code that references vma->vm_flags already correctly uses vm_flags_t, but a fairly large chunk of code simply uses unsigned long and assumes that the two are equivalent. This series corrects that and has us use vm_flags_t consistently. This series is motivated by the desire to, in a future series, adjust vm_flags_t to be a u64 regardless of whether the kernel is 32-bit or 64-bit in order to deal with the VMA flag exhaustion issue and avoid all the various problems that arise from it (being unable to use certain features in 32-bit, being unable to add new flags except for 64-bit, etc.) This is therefore a critical first step towards that goal. At any rate, using the correct type is of value regardless. We additionally take the opportunity to refer to VMA flags as vm_flags where possible to make clear what we're referring to. Overall, this series does not introduce any functional change. This patch (of 3): We abstract the type of the VMA flags to vm_flags_t, however in many places it is simply assumed this is unsigned long, which is simply incorrect. At the moment this is simply an incongruity, however in future we plan to change this type and therefore this change is a critical requirement for doing so. Overall, this patch does not introduce any functional change. 
[lorenzo.stoakes@oracle.com: add missing vm_get_page_prot() instance, remove include] Link: https://lkml.kernel.org/r/552f88e1-2df8-4e95-92b8-812f7c8db829@lucifer.local Link: https://lkml.kernel.org/r/cover.1750274467.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/a12769720a2743f235643b158c4f4f0a9911daf0.1750274467.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Zi Yan <ziy@nvidia.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <kees@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
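To illustrate why code must not assume vm_flags_t is unsigned long, the userspace sketch below models the planned situation. Everything in it is an assumption for illustration: vm_flags_t is typedef'd to a 64-bit integer as the series says is intended later, ulong32_t stands in for unsigned long on a 32-bit kernel, and VM_EXAMPLE_HIGH is a made-up flag bit, not a real VMA flag.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t vm_flags_t;	/* assumed future width of the type */
typedef uint32_t ulong32_t;	/* stand-in for unsigned long on 32-bit */

#define VM_EXAMPLE_LOW  ((vm_flags_t)1 << 0)	/* hypothetical low flag */
#define VM_EXAMPLE_HIGH ((vm_flags_t)1 << 40)	/* hypothetical high flag */

/* Correct: the flags travel in their own type, so no bits are lost. */
static int has_flag(vm_flags_t flags, vm_flags_t flag)
{
	return (flags & flag) != 0;
}

/*
 * Incorrect once the type widens: passing the flags through a 32-bit
 * "unsigned long" silently drops every bit above bit 31.
 */
static int has_flag_truncating(ulong32_t flags, vm_flags_t flag)
{
	return (flags & flag) != 0;
}
```

With flags set to VM_EXAMPLE_LOW | VM_EXAMPLE_HIGH, has_flag() still sees the high bit, while has_flag_truncating() only sees the low one after the value has been narrowed to 32 bits.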
2025-07-10tools/testing/radix-tree: test maple tree chaining mas_preallocate() callsLiam R. Howlett1-0/+12
Test making multiple mas_preallocate() calls in a row after adjusting the maple state. This ensures that new calls to mas_preallocate() change the number of allocated nodes. Link: https://lkml.kernel.org/r/20250616184521.3382795-4-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Hailong Liu <hailong.liu@oppo.com> Cc: "Liam R. Howlett" <howlett@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10testing/radix-tree/maple: increase readers and reduce delay for faster machinesLiam R. Howlett1-3/+4
Faster machines may not see the initial or updated value in the race condition. Reduce the delay so that faster machines are less likely to fail testing of the race conditions. Link: https://lkml.kernel.org/r/20250616184521.3382795-2-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <howlett@gmail.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Hailong Liu <hailong.liu@oppo.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10selftest/mm: skip if fallocate() is unsupported in gup_longtermMark Brown1-1/+9
Currently gup_longterm assumes that filesystems support fallocate() and uses it to allocate space in files. However, this is an optional feature; in particular it is not implemented by NFSv3, which is commonly used in CI systems, leading to spurious failures. Check for lack of support and report a skip instead in that case. Link: https://lkml.kernel.org/r/20250613-selftest-mm-gup-longterm-fallocate-nfs-v1-1-758a104c175f@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
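One way to tell "unsupported" apart from a real failure is to look at errno after fallocate() fails. The sketch below assumes that an unsupporting filesystem reports EOPNOTSUPP, as documented in fallocate(2). The ALLOC_* codes and both helper names are hypothetical and are not taken from the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
enum { ALLOC_OK = 0, ALLOC_SKIP = 1, ALLOC_FAIL = 2 };

/* Map an errno value from fallocate() onto a test outcome. */
static int classify_fallocate_errno(int err)
{
	if (err == 0)
		return ALLOC_OK;
	if (err == EOPNOTSUPP)
		return ALLOC_SKIP;	/* filesystem lacks fallocate() */
	return ALLOC_FAIL;		/* anything else is a real error */
}

/* Try to allocate `size` bytes at offset 0 of an open file. */
static int try_fallocate(int fd, off_t size)
{
	assert(fd >= 0);
	if (fallocate(fd, 0, 0, size) == 0)
		return ALLOC_OK;
	return classify_fallocate_errno(errno);
}
```

A caller would report a skip for ALLOC_SKIP and a failure only for ALLOC_FAIL, so a run on NFSv3 no longer counts as a test failure.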
2025-07-10mm/vma: use vmg->target to specify target VMA for new VMA mergeLorenzo Stoakes1-3/+3
In commit 3a75ccba047b ("mm: simplify vma merge structure and expand comments") we introduced the vmg->target field to make the merging of existing VMAs simpler - clarifying precisely which VMA would eventually become the merged VMA once the merge operation was complete. New VMA merging did not get quite the same treatment, retaining the rather confusing convention of storing the target VMA in vmg->middle. This patch corrects this state of affairs, utilising vmg->target for this purpose for both vma_merge_new_range() and also for vma_expand(). We retain the WARN_ON for vmg->middle being specified in vma_merge_new_range() as doing so would make no sense, but add an additional debug assert for setting vmg->target. This patch additionally updates VMA userland testing to account for this change. [lorenzo.stoakes@oracle.com: make comment consistent in vma_expand()] Link: https://lkml.kernel.org/r/c54f45e3-a6ac-4749-93c0-cc9e3080ee37@lucifer.local Link: https://lkml.kernel.org/r/20250613184807.108089-1-lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <kees@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10selftests: mm: add shmem collapse as a default test itemBaolin Wang1-0/+4
Currently, we only test anonymous memory collapse by default. We should also add shmem collapse as a default test item to catch issues that could break the test cases. Link: https://lkml.kernel.org/r/a30b1529b399f2e649b5a05c3d352f41a68faeae.1749779183.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Tested-by: Dev Jain <dev.jain@arm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Cc: Barry Song <baohua@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mariano Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10selftests: khugepaged: fix the shmem collapse failureBaolin Wang1-2/+0
When running the khugepaged selftest for shmem (./khugepaged all:shmem), I encountered the following test failures:

  : Run test: collapse_full (khugepaged:shmem)
  : Collapse multiple fully populated PTE table.... Fail
  : ...
  : Run test: collapse_single_pte_entry (khugepaged:shmem)
  : Collapse PTE table with single PTE entry present.... Fail
  : ...
  : Run test: collapse_full_of_compound (khugepaged:shmem)
  : Allocate huge page... OK
  : Split huge page leaving single PTE page table full of compound pages... OK
  : Collapse PTE table full of compound pages.... Fail

The failure occurs because the test sets MADV_NOHUGEPAGE in the wait_for_scan() function, once khugepaged finishes scanning, to prevent khugepaged from continuing to scan the shmem VMA. Moreover, shmem requires a refault to establish PMD mappings. However, after commit 2b0f922323cc ("mm: don't install PMD mappings when THPs are disabled by the hw/process/vma"), PMD mappings are prevented if the VMA has the MADV_NOHUGEPAGE flag set, so shmem cannot establish PMD mappings during refault.

One way to fix this issue is to move the MADV_NOHUGEPAGE setting after the shmem refault. However, after the shmem refault and the huge page check, the test case unmaps the shmem immediately, so setting MADV_NOHUGEPAGE seems unnecessary. We can therefore simply drop the MADV_NOHUGEPAGE setting, after which all khugepaged test cases pass.
Link: https://lkml.kernel.org/r/d8502fc50d0304c2afd27ced062b1d636b7a872e.1749779183.git.baolin.wang@linux.alibaba.com Fixes: 2b0f922323cc ("mm: don't install PMD mappings when THPs are disabled by the hw/process/vma") Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Tested-by: Dev Jain <dev.jain@arm.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mariano Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10selftests/mm: use generic read_sysfs in thuge-gen testPu Lehui1-28/+10
As the generic read_sysfs is available in vm_utils, let's use it in the thuge-gen test. Link: https://lkml.kernel.org/r/20250611100106.1331197-1-pulehui@huaweicloud.com Signed-off-by: Pu Lehui <pulehui@huawei.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10selftests/mm: check for YAMA ptrace_scope configuraiton before modifying itMark Brown1-1/+3
When running the memfd_secret test, run_vmtests.sh unconditionally tries to configure the YAMA LSM's ptrace_scope setting, leading to an error if YAMA is not in the running kernel: # ./run_vmtests.sh: line 432: /proc/sys/kernel/yama/ptrace_scope: No such file or directory # # ---------------------- # # running ./memfd_secret # # ---------------------- Check that this file is present before trying to write to it. The indentation here is a bit odd, and it doesn't seem great that we configure but don't restore ptrace_scope. Link: https://lkml.kernel.org/r/20250610-selftest-mm-enable-yama-v1-1-0097b6713116@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: David Hildenbrand <david@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10selftests/mm: add messages about test errors to the cow testsMark Brown1-8/+20
It is not sufficiently clear what the individual tests in the cow test program are checking so add messages for the failure cases. Link: https://lkml.kernel.org/r/20250610-selftest-mm-cow-tweaks-v1-4-43cd7457500f@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Suggested-by: David Hildenbrand <david@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10selftests/mm: don't compare return values to in cowMark Brown1-3/+3
Tweak the coding style for checking for non-zero return values. While we're at it also remove a now redundant oring of the madvise() return code. Link: https://lkml.kernel.org/r/20250610-selftest-mm-cow-tweaks-v1-3-43cd7457500f@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Suggested-by: David Hildenbrand <david@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10selftests/mm: convert some cow error reports to ksft_perror()Mark Brown1-3/+3
This prints the errno and a string decode of it. Link: https://lkml.kernel.org/r/20250610-selftest-mm-cow-tweaks-v1-2-43cd7457500f@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: David Hildenbrand <david@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
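To show what "the errno and a string decode of it" can look like, the sketch below formats a message, the strerror() text, and the numeric errno into a buffer. format_perror() is a hypothetical stand-in, not the kselftest helper, and the exact output layout is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a perror-style test helper: writes
 * "<msg>: <strerror text> (<errno>)" into buf. The layout is assumed.
 * Returns the snprintf() result.
 */
static int format_perror(char *buf, size_t len, const char *msg, int err)
{
	assert(buf != NULL && msg != NULL);
	return snprintf(buf, len, "%s: %s (%d)", msg, strerror(err), err);
}
```

Taking the error number as a parameter, instead of reading errno inside the helper, keeps the sketch easy to test. A real helper would normally read errno directly at the call site.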