kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2025-12-02	selftests: drv-net: Fix tolerance calculation in devlink_rate_tc_bw.py	Carolina Jubran	1	-44/+30
	Currently, tolerance is computed against the TC’s expected percentage, making TC3 (20%) validation overly strict and TC4 (80%) overly loose. Update BandwidthValidator to take a dict of shares and compute bounds relative to the overall total, so that all shares are validated consistently. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-7-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-02	selftests: drv-net: Fix and clarify TC bandwidth split in devlink_rate_tc_bw.py	Carolina Jubran	1	-13/+13
	Correct the documented bandwidth distribution between TC3 and TC4 from 80/20 to 20/80. Update test descriptions and printed messages to consistently reflect the intended split. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-6-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-02	selftests: drv-net: Set shell=True for sysfs writes in devlink_rate_tc_bw.py	Carolina Jubran	1	-2/+2
	Commit 7c32f7a2d3db ("selftests: net: py: don't default to shell=True") changed the cmd() helper to avoid spawning a shell unless explicitly requested. The devlink_rate_tc_bw test enables SR-IOV by writing to the sriov_numvfs sysfs attribute using redirection. Without shell=True the redirection is not interpreted and the VF device never appears, causing the test to fail. Fix by explicitly passing shell=True in the two places that update sriov_numvfs. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-5-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-02	selftests: drv-net: Use Iperf3Runner in devlink_rate_tc_bw.py	Carolina Jubran	1	-41/+29
	Replace the inline iperf3 subprocess and JSON parsing with Iperf3Runner. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-4-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-02	selftests: drv-net: introduce Iperf3Runner for measurement use cases	Carolina Jubran	3	-12/+82
	GenerateTraffic was added to spin up long-running iperf3 load, mainly to drive high PPS background traffic. It was never meant to provide stable throughput numbers, and trying to repurpose it for measurement does not make sense. Introduce Iperf3Runner to allow tests to split out server/client configuration, control start/stop, and collect JSON output for analysis. This makes it possible to measure bandwidth directly when validating egress shaping. GenerateTraffic stays as the background load generator, reusing the common iperf3 helpers under the hood. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-3-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-02	selftests: drv-net: Add devlink_rate_tc_bw.py to TEST_PROGS	Carolina Jubran	1	-0/+1
	This makes devlink_rate_tc_bw.py present in the Makefile under the same directory. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-2-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01	selftests: netconsole: remove log noise due to socat exit	Andre Carvalho	1	-1/+1
	This removes some noise that can be distracting while looking at selftests by redirecting socat stderr to /dev/null. Before this commit, netcons_basic would output: Running with target mode: basic (ipv6) 2025/11/29 12:08:03 socat[259] W exiting on signal 15 2025/11/29 12:08:03 socat[271] W exiting on signal 15 basic : ipv6 : Test passed Running with target mode: basic (ipv4) 2025/11/29 12:08:05 socat[329] W exiting on signal 15 2025/11/29 12:08:05 socat[322] W exiting on signal 15 basic : ipv4 : Test passed Running with target mode: extended (ipv6) 2025/11/29 12:08:08 socat[386] W exiting on signal 15 2025/11/29 12:08:08 socat[386] W exiting on signal 15 2025/11/29 12:08:08 socat[380] W exiting on signal 15 extended : ipv6 : Test passed Running with target mode: extended (ipv4) 2025/11/29 12:08:10 socat[440] W exiting on signal 15 2025/11/29 12:08:10 socat[435] W exiting on signal 15 2025/11/29 12:08:10 socat[435] W exiting on signal 15 extended : ipv4 : Test passed After these changes, output looks like: Running with target mode: basic (ipv6) basic : ipv6 : Test passed Running with target mode: basic (ipv4) basic : ipv4 : Test passed Running with target mode: extended (ipv6) extended : ipv6 : Test passed Running with target mode: extended (ipv4) extended : ipv4 : Test passed Signed-off-by: Andre Carvalho <asantostc@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251129-netcons-socat-noise-v1-1-605a0cea8fca@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01	selftests: net: add a hint about MACAddressPolicy=persistent	Jakub Kicinski	1	-1/+1
	New NIPA installation had been reporting a few flaky tests. arp_ndisc_evict_nocarrier is most flaky of them all. I suspect that the flakiness is due to udev swapping the MAC addresses on the interfaces. Extend the message in arp_ndisc_evict_nocarrier to hint at this potential issue. Having the neigh get fail right after ping is rather unusual, unless udev changes the MAC addr causing a flush in the meantime. Link: https://patch.msgid.link/20251127194556.2409574-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01	selftests: net: py: handle interrupt during cleanup	Jakub Kicinski	1	-2/+16
	Following up on the old discussion [1]. Let the BaseExceptions out of defer()'ed cleanup. And handle it in the main loop. This allows us to exit the tests if user hit Ctrl-C during defer(). Link: https://lore.kernel.org/20251119063228.3adfd743@kernel.org # [1] Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20251128004846.2602687-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01	ynl: samples: Fix spelling mistake "failedq" -> "failed"	Colin Ian King	1	-1/+1
	There is a spelling mistake in an error message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://patch.msgid.link/20251128173802.318520-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01	Merge tag 'vfs-6.19-rc1.coredump' of ↵	Linus Torvalds	9	-1663/+2851
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull pidfd and coredump updates from Christian Brauner: "Features: - Expose coredump signal via pidfd Expose the signal that caused the coredump through the pidfd interface. The recent changes to rework coredump handling to rely on unix sockets are in the process of being used in systemd. The previous systemd coredump container interface requires the coredump file descriptor and basic information including the signal number to be sent to the container. This means the signal number needs to be available before sending the coredump to the container. - Add supported_mask field to pidfd Add a new supported_mask field to struct pidfd_info that indicates which information fields are supported by the running kernel. This allows userspace to detect feature availability without relying on error codes or kernel version checks. Cleanups: - Drop struct pidfs_exit_info and prepare to drop exit_info pointer, simplifying the internal publication mechanism for exit and coredump information retrievable via the pidfd ioctl - Use guard() for task_lock in pidfs - Reduce wait_pidfd lock scope - Add missing PIDFD_INFO_SIZE_VER1 constant - Add missing BUILD_BUG_ON() assert on struct pidfd_info Fixes: - Fix PIDFD_INFO_COREDUMP handling Selftests: - Split out coredump socket tests and common helpers into separate files for better organization - Fix userspace coredump client detection issues - Handle edge-triggered epoll correctly - Ignore ENOSPC errors in tests - Add debug logging to coredump socket tests, socket protocol tests, and test helpers - Add tests for PIDFD_INFO_COREDUMP_SIGNAL - Add tests for supported_mask field - Update pidfd header for selftests" * tag 'vfs-6.19-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits) pidfs: reduce wait_pidfd lock scope selftests/coredump: add second PIDFD_INFO_COREDUMP_SIGNAL test selftests/coredump: add first PIDFD_INFO_COREDUMP_SIGNAL test selftests/coredump: ignore ENOSPC errors selftests/coredump: add debug logging to coredump socket protocol tests selftests/coredump: add debug logging to coredump socket tests selftests/coredump: add debug logging to test helpers selftests/coredump: handle edge-triggered epoll correctly selftests/coredump: fix userspace coredump client detection selftests/coredump: fix userspace client detection selftests/coredump: split out coredump socket tests selftests/coredump: split out common helpers selftests/pidfd: add second supported_mask test selftests/pidfd: add first supported_mask test selftests/pidfd: update pidfd header pidfs: expose coredump signal pidfs: drop struct pidfs_exit_info pidfs: prepare to drop exit_info pointer pidfd: add a new supported_mask field pidfs: add missing BUILD_BUG_ON() assert on struct pidfd_info ...
2025-12-01	Merge tag 'namespace-6.19-rc1' of ↵	Linus Torvalds	15	-58/+8344
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull namespace updates from Christian Brauner: "This contains substantial namespace infrastructure changes including a new system call, active reference counting, and extensive header cleanups. The branch depends on the shared kbuild branch for -fms-extensions support. Features: - listns() system call Add a new listns() system call that allows userspace to iterate through namespaces in the system. This provides a programmatic interface to discover and inspect namespaces, addressing longstanding limitations: Currently, there is no direct way for userspace to enumerate namespaces. Applications must resort to scanning /proc//ns/ across all processes, which is: - Inefficient - requires iterating over all processes - Incomplete - misses namespaces not attached to any running process but kept alive by file descriptors, bind mounts, or parent references - Permission-heavy - requires access to /proc for many processes - No ordering or ownership information - No filtering per namespace type The listns() system call solves these problems: ssize_t listns(const struct ns_id_req req, u64 ns_ids, size_t nr_ns_ids, unsigned int flags); struct ns_id_req { __u32 size; __u32 spare; __u64 ns_id; struct / listns / { __u32 ns_type; __u32 spare2; __u64 user_ns_id; }; }; Features include: - Pagination support for large namespace sets - Filtering by namespace type (MNT_NS, NET_NS, USER_NS, etc.) - Filtering by owning user namespace - Permission checks respecting namespace isolation - Active Reference Counting Introduce an active reference count that tracks namespace visibility to userspace. A namespace is visible in the following cases: - The namespace is in use by a task - The namespace is persisted through a VFS object (namespace file descriptor or bind-mount) - The namespace is a hierarchical type and is the parent of child namespaces The active reference count does not regulate lifetime (that's still done by the normal reference count) - it only regulates visibility to namespace file handles and listns(). This prevents resurrection of namespaces that are pinned only for internal kernel reasons (e.g., user namespaces held by file->f_cred, lazy TLB references on idle CPUs, etc.) which should not be accessible via (1)-(3). - Unified Namespace Tree Introduce a unified tree structure for all namespaces with: - Fixed IDs assigned to initial namespaces - Lookup based solely on inode number - Maintained list of owned namespaces per user namespace - Simplified rbtree comparison helpers Cleanups - Header Reorganization: - Move namespace types into separate header (ns_common_types.h) - Decouple nstree from ns_common header - Move nstree types into separate header - Switch to new ns_tree_{node,root} structures with helper functions - Use guards for ns_tree_lock - Initial Namespace Reference Count Optimization - Make all reference counts on initial namespaces a nop to avoid pointless cacheline ping-pong for namespaces that can never go away - Drop custom reference count initialization for initial namespaces - Add NS_COMMON_INIT() macro and use it for all namespaces - pid: rely on common reference count behavior - Miscellaneous Cleanups - Rename exit_task_namespaces() to exit_nsproxy_namespaces() - Rename is_initial_namespace() and make argument const - Use boolean to indicate anonymous mount namespace - Simplify owner list iteration in nstree - nsfs: raise SB_I_NODEV, SB_I_NOEXEC, and DCACHE_DONTCACHE explicitly - nsfs: use inode_just_drop() - pidfs: raise DCACHE_DONTCACHE explicitly - pidfs: simplify PIDFD_GET__NAMESPACE ioctls - libfs: allow to specify s_d_flags - cgroup: add cgroup namespace to tree after owner is set - nsproxy: fix free_nsproxy() and simplify create_new_namespaces() Fixes: - setns(pidfd, ...) race condition Fix a subtle race when using pidfds with setns(). When the target task exits after prepare_nsset() but before commit_nsset(), the namespace's active reference count might have been dropped. If setns() then installs the namespaces, it would bump the active reference count from zero without taking the required reference on the owner namespace, leading to underflow when later decremented. The fix resurrects the ownership chain if necessary - if the caller succeeded in grabbing passive references, the setns() should succeed even if the target task exits or gets reaped. - Return EFAULT on put_user() error instead of success - Make sure references are dropped outside of RCU lock (some namespaces like mount namespace sleep when putting the last reference) - Don't skip active reference count initialization for network namespace - Add asserts for active refcount underflow - Add asserts for initial namespace reference counts (both passive and active) - ipc: enable is_ns_init_id() assertions - Fix kernel-doc comments for internal nstree functions - Selftests - 15 active reference count tests - 9 listns() functionality tests - 7 listns() permission tests - 12 inactive namespace resurrection tests - 3 threaded active reference count tests - commit_creds() active reference tests - Pagination and stress tests - EFAULT handling test - nsid tests fixes" tag 'namespace-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (103 commits) pidfs: simplify PIDFD_GET_<type>_NAMESPACE ioctls nstree: fix kernel-doc comments for internal functions nsproxy: fix free_nsproxy() and simplify create_new_namespaces() selftests/namespaces: fix nsid tests ns: drop custom reference count initialization for initial namespaces pid: rely on common reference count behavior ns: add asserts for initial namespace active reference counts ns: add asserts for initial namespace reference counts ns: make all reference counts on initial namespace a nop ipc: enable is_ns_init_id() assertions fs: use boolean to indicate anonymous mount namespace ns: rename is_initial_namespace() ns: make is_initial_namespace() argument const nstree: use guards for ns_tree_lock nstree: simplify owner list iteration nstree: switch to new structures nstree: add helper to operate on struct ns_tree_{node,root} nstree: move nstree types into separate header nstree: decouple from ns_common header ns: move namespace types into separate header ...
2025-12-01	Merge branch 'for-linus' into for-next	Takashi Iwai	65	-155/+1868
	Pull remaining 6.18-devel changes. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-12-01	objtool: Fix segfault on unknown alternatives	Ingo Molnar	1	-0/+3
	So 'objtool --link -d vmlinux.o' gets surprised by this endbr64+endbr64 pattern in ___bpf_prog_run(): ___bpf_prog_run: 1e7680: ___bpf_prog_run+0x0 push %r12 1e7682: ___bpf_prog_run+0x2 mov %rdi,%r12 1e7685: ___bpf_prog_run+0x5 push %rbp 1e7686: ___bpf_prog_run+0x6 xor %ebp,%ebp 1e7688: ___bpf_prog_run+0x8 push %rbx 1e7689: ___bpf_prog_run+0x9 mov %rsi,%rbx 1e768c: ___bpf_prog_run+0xc movzbl (%rbx),%esi 1e768f: ___bpf_prog_run+0xf movzbl %sil,%edx 1e7693: ___bpf_prog_run+0x13 mov %esi,%eax 1e7695: ___bpf_prog_run+0x15 mov 0x0(,%rdx,8),%rdx 1e769d: ___bpf_prog_run+0x1d jmp 0x1e76a2 <__x86_indirect_thunk_rdx> 1e76a2: ___bpf_prog_run+0x22 endbr64 1e76a6: ___bpf_prog_run+0x26 endbr64 1e76aa: ___bpf_prog_run+0x2a mov 0x4(%rbx),%edx And crashes due to blindly dereferencing alt->insn->alt_group. Bail out on NULL ->alt_group, which produces this warning and continues with the disassembly, instead of a segfault: .git/O/vmlinux.o: warning: objtool: <alternative.1e769d>: failed to disassemble alternative Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-12-01	Merge branch 'kvm-arm64/nv-xnx-haf' into kvmarm/next	Oliver Upton	4	-0/+178
	* kvm-arm64/nv-xnx-haf: (22 commits) : Support for FEAT_XNX and FEAT_HAF in nested : : Add support for a couple of MMU-related features that weren't : implemented by KVM's software page table walk: : : - FEAT_XNX: Allows the hypervisor to describe execute permissions : separately for EL0 and EL1 : : - FEAT_HAF: Hardware update of the Access Flag, which in the context of : nested means software walkers must also set the Access Flag. : : The series also adds some basic support for testing KVM's emulation of : the AT instruction, including the implementation detail that AT sets the : Access Flag in KVM. KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2 KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX} KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected" KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot() KVM: arm64: Add endian casting to kvm_swap_s[12]_desc() KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n KVM: arm64: selftests: Add test for AT emulation KVM: arm64: nv: Expose hardware access flag management to NV guests KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW KVM: arm64: Implement HW access flag management in stage-1 SW PTW KVM: arm64: Propagate PTW errors up to AT emulation KVM: arm64: Add helper for swapping guest descriptor KVM: arm64: nv: Use pgtable definitions in stage-2 walk KVM: arm64: Handle endianness in read helper for emulated PTW KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW KVM: arm64: Call helper for reading descriptors directly KVM: arm64: nv: Advertise support for FEAT_XNX KVM: arm64: Teach ptdump about FEAT_XNX permissions KVM: arm64: nv: Forward FEAT_XNX permissions to the shadow stage-2 ... Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-12-01	Merge branch 'kvm-arm64/vgic-lr-overflow' into kvmarm/next	Oliver Upton	5	-22/+288
	* kvm-arm64/vgic-lr-overflow: (50 commits) : Support for VGIC LR overflows, courtesy of Marc Zyngier : : Address deficiencies in KVM's GIC emulation when a vCPU has more active : IRQs than can be represented in the VGIC list registers. Sort the AP : list to prioritize inactive and pending IRQs, potentially spilling : active IRQs outside of the LRs. : : Handle deactivation of IRQs outside of the LRs for both EOImode=0/1, : which involves special consideration for SPIs being deactivated from a : different vCPU than the one that acked it. KVM: arm64: Convert ICH_HCR_EL2_TDIR cap to EARLY_LOCAL_CPU_FEATURE KVM: arm64: selftests: vgic_irq: Add timer deactivation test KVM: arm64: selftests: vgic_irq: Add Group-0 enable test KVM: arm64: selftests: vgic_irq: Add asymmetric SPI deaectivation test KVM: arm64: selftests: vgic_irq: Perform EOImode==1 deactivation in ack order KVM: arm64: selftests: vgic_irq: Remove LR-bound limitation KVM: arm64: selftests: vgic_irq: Exclude timer-controlled interrupts KVM: arm64: selftests: vgic_irq: Change configuration before enabling interrupt KVM: arm64: selftests: vgic_irq: Fix GUEST_ASSERT_IAR_EMPTY() helper KVM: arm64: selftests: gic_v3: Disable Group-0 interrupts by default KVM: arm64: selftests: gic_v3: Add irq group setting helper KVM: arm64: GICv2: Always trap GICV_DIR register KVM: arm64: GICv2: Handle deactivation via GICV_DIR traps KVM: arm64: GICv2: Handle LR overflow when EOImode==0 KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En KVM: arm64: GICv3: nv: Plug L1 LR sync into deactivation primitive KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI emulation KVM: arm64: GICv3: Avoid broadcast kick on CPUs lacking TDIR KVM: arm64: GICv3: Handle in-LR deactivation when possible KVM: arm64: GICv3: Add SPI tracking to handle asymmetric deactivation ... Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-12-01	Merge branch 'kvm-arm64/sea-user' into kvmarm/next	Oliver Upton	4	-0/+335
	* kvm-arm64/sea-user: : Userspace handling of SEAs, courtesy of Jiaqi Yan : : Add support for processing external aborts in userspace in situations : where the host has failed to do so, allowing the VMM to potentially : reinject an external abort into the VM. Documentation: kvm: new UAPI for handling SEA KVM: selftests: Test for KVM_EXIT_ARM_SEA KVM: arm64: VM exit to userspace to handle SEA Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-12-01	KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected"	Colin Ian King	1	-1/+1
	There is a spelling mistake in a TEST_FAIL message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://msgid.link/20251128175124.319094-1-colin.i.king@gmail.com Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-12-01	KVM: arm64: selftests: Add test for AT emulation	Oliver Upton	4	-0/+178
	Add a basic test for AT emulation in the EL2&0 and EL1&0 translation regimes. Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://msgid.link/20251124190158.177318-16-oupton@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-12-01	Merge branch 'rcu/misc' into next	Frederic Weisbecker	3	-16/+157
	- In order to prepare the layout for nohz_full work deferral to user exit, the context tracking state must shrink the counter of transitions to/from RCU not watching. The only possible hazard is to trigger wrap-around more easily, delaying a bit grace periods when that happens. This should be a rare event though. Yet add debugging and torture code to test that assumption. - Fix memory leak on locktorture module - Annotate accesses in rculist_nulls.h to prevent from KCSAN warnings. On recent discussions, we also concluded that all those WRITE_ONCE() and READ_ONCE() on list APIs deserve appropriate comments. Something to be expected for the next cycle. - Provide a script to apply several configs to several commits with torture. - Allow torture to reuse a build directory in order to save needless rebuild time. - Various cleanups.
2025-11-29	perf trace: Skip internal syscall arguments	Namhyung Kim	1	-0/+21
	Recent changes in the linux-next kernel will add new field for syscalls to have contents in the userspace like below. # cat /sys/kernel/tracing/events/syscalls/sys_enter_write/format name: sys_enter_write ID: 758 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:int __syscall_nr; offset:8; size:4; signed:1; field:unsigned int fd; offset:16; size:8; signed:0; field:const char * buf; offset:24; size:8; signed:0; field:size_t count; offset:32; size:8; signed:0; field:__data_loc char[] __buf_val; offset:40; size:4; signed:0; print fmt: "fd: 0x%08lx, buf: 0x%08lx (%s), count: 0x%08lx", ((unsigned long)(REC->fd)), ((unsigned long)(REC->buf)), __print_dynamic_array(__buf_val, 1), ((unsigned long)(REC->count)) We have a different way to handle those arguments and this change confuses perf trace then make some tests failing. Fix it by skipping the new fields that have "__data_loc char[]" type. Maybe we can switch to this instead of the BPF augmentation later. Reviewed-by: Howard Chu <howardchu95@gmail.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Howard Chu <howardchu95@gmail.com> Reported-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-11-29	selftests/mm/uffd: initialize char variable to Null	Ankit Khushwaha	2	-5/+5
	In "uffd-stress.c" & "uffd-unit-tests.c". address of char variable having garbage value (uninitialized) is passed to 'write' syscall triggers warning. uffd-stress.c:246:39: warning: variable 'c' is uninitialized when passed as a const pointer argument here [-Wuninitialized-const-pointer] uffd-unit-tests.c:581:31: warning: variable 'c' is uninitialized when passed as a const pointer argument here [-Wuninitialized-const-pointer] so the fix is to assign char variable to '\0' to prevent writing of garbage value. Link: https://lkml.kernel.org/r/20251126160830.52124-1-ankitkhushwaha.linux@gmail.com Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Bill Wendling <morbo@google.com> Cc: Justin Stitt <justinstitt@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-29	mm: introduce VMA flags bitmap type	Lorenzo Stoakes	1	-30/+120
	It is useful to transition to using a bitmap for VMA flags so we can avoid running out of flags, especially for 32-bit kernels which are constrained to 32 flags, necessitating some features to be limited to 64-bit kernels only. By doing so, we remove any constraint on the number of VMA flags moving forwards no matter the platform and can decide in future to extend beyond 64 if required. We start by declaring an opaque types, vma_flags_t (which resembles mm_struct flags of type mm_flags_t), setting it to precisely the same size as vm_flags_t, and place it in union with vm_flags in the VMA declaration. We additionally update struct vm_area_desc equivalently placing the new opaque type in union with vm_flags. This change therefore does not impact the size of struct vm_area_struct or struct vm_area_desc. In order for the change to be iterative and to avoid impacting performance, we designate VM_xxx declared bitmap flag values as those which must exist in the first system word of the VMA flags bitmap. We therefore declare vma_flags_clear_all(), vma_flags_overwrite_word(), vma_flags_overwrite_word(), vma_flags_overwrite_word_once(), vma_flags_set_word() and vma_flags_clear_word() in order to allow us to update the existing vm_flags_*() functions to utilise these helpers. This is a stepping stone towards converting users to the VMA flags bitmap and behaves precisely as before. By doing this, we can eliminate the existing private vma->__vm_flags field in the vma->vm_flags union and replace it with the newly introduced opaque type vma_flags, which we call flags so we refer to the new bitmap field as vma->flags. We update vma_flag_[test, set]_atomic() to account for the change also. We adapt vm_flags_reset_once() to only clear those bits above the first system word providing write-once semantics to the first system word (which it is presumed the caller requires - and in all current use cases this is so). As we currently only specify that the VMA flags bitmap size is equal to BITS_PER_LONG number of bits, this is a noop, but is defensive in preparation for a future change that increases this. We additionally update the VMA userland test declarations to implement the same changes there. Finally, we update the rust code to reference vma->vm_flags on update rather than vma->__vm_flags which has been removed. This is safe for now, albeit it is implicitly performing a const cast. Once we introduce flag helpers we can improve this more. No functional change intended. Link: https://lkml.kernel.org/r/bab179d7b153ac12f221b7d65caac2759282cfe9.1764064557.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Alice Ryhl <aliceryhl@google.com> [rust] Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Ben Segall <bsegall@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chris Li <chrisl@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Gary Guo <gary@garyguo.net> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Kairui Song <kasong@tencent.com> Cc: Kees Cook <kees@kernel.org> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Nico Pache <npache@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Wei Xu <weixugc@google.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-29	tools/testing/vma: eliminate dependency on vma->__vm_flags	Lorenzo Stoakes	1	-10/+10
	The userland VMA test code relied on an internal implementation detail - the existence of vma->__vm_flags to directly access VMA flags. There is no need to do so when we have the vm_flags_*() helper functions available. This is ugly, but also a subsequent commit will eliminate this field altogether so this will shortly become broken. This patch has us utilise the helper functions instead. Link: https://lkml.kernel.org/r/6275c53a6bb20743edcbe92d3e130183b47d18d0.1764064557.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Alice Ryhl <aliceryhl@google.com> [rust] Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Ben Segall <bsegall@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chris Li <chrisl@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Gary Guo <gary@garyguo.net> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Kairui Song <kasong@tencent.com> Cc: Kees Cook <kees@kernel.org> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Nico Pache <npache@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Wei Xu <weixugc@google.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-29	mm: declare VMA flags by bit	Lorenzo Stoakes	1	-45/+259
	Patch series "initial work on making VMA flags a bitmap", v3. We are in the rather silly situation that we are running out of VMA flags as they are currently limited to a system word in size. This leads to absurd situations where we limit features to 64-bit architectures only because we simply do not have the ability to add a flag for 32-bit ones. This is very constraining and leads to hacks or, in the worst case, simply an inability to implement features we want for entirely arbitrary reasons. This also of course gives us something of a Y2K type situation in mm where we might eventually exhaust all of the VMA flags even on 64-bit systems. This series lays the groundwork for getting away from this limitation by establishing VMA flags as a bitmap whose size we can increase in future beyond 64 bits if required. This is necessarily a highly iterative process given the extensive use of VMA flags throughout the kernel, so we start by performing basic steps. Firstly, we declare VMA flags by bit number rather than by value, retaining the VM_xxx fields but in terms of these newly introduced VMA_xxx_BIT fields. While we are here, we use sparse annotations to ensure that, when dealing with VMA bit number parameters, we cannot be passed values which are not declared as such - providing some useful type safety. We then introduce an opaque VMA flag type, much like the opaque mm_struct flag type introduced in commit bb6525f2f8c4 ("mm: add bitmap mm->flags field"), which we establish in union with vma->vm_flags (but still set at system word size meaning there is no functional or data type size change). We update the vm_flags_xxx() helpers to use this new bitmap, introducing sensible helpers to do so. This series lays the foundation for further work to expand the use of bitmap VMA flags and eventually eliminate these arbitrary restrictions. This patch (of 4): In order to lay the groundwork for VMA flags being a bitmap rather than a system word in size, we need to be able to consistently refer to VMA flags by bit number rather than value. Take this opportunity to do so in an enum which we which is additionally useful for tooling to extract metadata from. This additionally makes it very clear which bits are being used for what at a glance. We use the VMA_ prefix for the bit values as it is logical to do so since these reference VMAs. We consistently suffix with _BIT to make it clear what the values refer to. We declare bit values even when the flags that use them would not be enabled by config options as this is simply clearer and clearly defines what bit numbers are used for what, at no additional cost. We declare a sparse-bitwise type vma_flag_t which ensures that users can't pass around invalid VMA flags by accident and prepares for future work towards VMA flags being a bitmap where we want to ensure bit values are type safe. To make life easier, we declare some macro helpers - DECLARE_VMA_BIT() allows us to avoid duplication in the enum bit number declarations (and maintaining the sparse __bitwise attribute), and INIT_VM_FLAG() is used to assist with declaration of flags. Unfortunately we can't declare both in the enum, as we run into issue with logic in the kernel requiring that flags are preprocessor definitions, and additionally we cannot have a macro which declares another macro so we must define each flag macro directly. Additionally, update the VMA userland testing vma_internal.h header to include these changes. We also have to fix the parameters to the vma_flag_*_atomic() functions since VMA_MAYBE_GUARD_BIT is now of type vma_flag_t and sparse will complain otherwise. We have to update some rather silly if-deffery found in mm/task_mmu.c which would otherwise break. Finally, we update the rust binding helper as now it cannot auto-detect the flags at all. Link: https://lkml.kernel.org/r/cover.1764064556.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/3a35e5a0bcfa00e84af24cbafc0653e74deda64a.1764064556.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Alice Ryhl <aliceryhl@google.com> [rust] Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Ben Segall <bsegall@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chris Li <chrisl@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Gary Guo <gary@garyguo.net> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Kairui Song <kasong@tencent.com> Cc: Kees Cook <kees@kernel.org> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Nico Pache <npache@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Wei Xu <weixugc@google.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-29	selftests/bpf: do not hardcode target rate in test_tc_edt BPF program	Alexis Lothoré (eBPF Foundation)	2	-3/+4
	test_tc_edt currently defines the target rate in both the userspace and BPF parts. This value could be defined once in the userspace part if we make it able to configure the BPF program before starting the test. Add a target_rate variable in the BPF part, and make the userspace part set it to the desired rate before attaching the shaping program. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-4-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-29	selftests/bpf: remove test_tc_edt.sh	Alexis Lothoré (eBPF Foundation)	2	-102/+0
	Now that test_tc_edt has been integrated in test_progs, remove the legacy shell script. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-3-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-29	selftests/bpf: integrate test_tc_edt into test_progs	Alexis Lothoré (eBPF Foundation)	2	-1/+145
	test_tc_edt.sh uses a pair of veth and a BPF program attached to the TX veth to shape the traffic to 5MBps. It then checks that the amount of received bytes (at interface level), compared to the TX duration, indeed matches 5Mbps. Convert this test script to the test_progs framework: - keep the double veth setup, isolated in two veths - run a small tcp server, and connect client to server - push a pre-configured amount of bytes, and measure how much time has been needed to push those - ensure that this rate is in a 2% error margin around the target rate This two percent value, while being tight, is hopefully large enough to not make the test too flaky in CI, while also turning it into a small example of BPF-based shaping. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-2-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-29	selftests/bpf: rename test_tc_edt.bpf.c section to expose program type	Alexis Lothoré (eBPF Foundation)	2	-2/+3
	The test_tc_edt BPF program uses a custom section name, which works fine when manually loading it with tc, but prevents it from being loaded with libbpf. Update the program section name to "tc" to be able to manipulate it with a libbpf-based C test. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-1-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-29	selftests/bpf: Add success stats to rqspinlock stress test	Kumar Kartikeya Dwivedi	1	-12/+43
	Add stats to observe the success and failure rate of lock acquisition attempts in various contexts. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251128232802.1031906-7-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-29	docs: makefile: move rustdoc check to the build wrapper	Mauro Carvalho Chehab	1	-9/+32
	The makefile logic to detect if rust is enabled is not working the way it was expected: instead of using the current setup for CONFIG_RUST, it uses a cached version from a previous build. The root cause is that the current logic inside docs/Makefile uses a cached version of CONFIG_RUST, from the last time a non documentation target was executed. That's perfectly fine for Sphinx build, as it doesn't need to read or depend on any CONFIG_*. So, instead of relying at the cache, move the logic to the wrapper script and let it check the current content of .config, to verify if CONFIG_RUST was selected. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <c06b1834ef02099735c13ee1109fa2a2b9e47795.1763722971.git.mchehab+huawei@kernel.org>
2025-11-29	docs: kdoc: various fixes for grammar, spelling, punctuation	Randy Dunlap	5	-26/+26
	Correct grammar, spelling, and punctuation in comments, strings, print messages, logs. Change two instances of two spaces between words to just one space. codespell was used to find misspelled words. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20251124041011.3030571-1-rdunlap@infradead.org>
2025-11-29	docs: kdoc_parser: use '@' for Excess enum value	Randy Dunlap	1	-1/+1
	kdoc is looking for "@value" here, so use that kind of string in the warning message. The "%value" can be confusing. This changes: Warning: drivers/net/wireless/mediatek/mt76/testmode.h:92 Excess enum value '%MT76_TM_ATTR_TX_PENDING' description in 'mt76_testmode_attr' to this: Warning: drivers/net/wireless/mediatek/mt76/testmode.h:92 Excess enum value '@MT76_TM_ATTR_TX_PENDING' description in 'mt76_testmode_attr' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20251126061752.3497106-1-rdunlap@infradead.org>
2025-11-29	docs: kdoc_parser: add data/function attributes to ignore	Randy Dunlap	1	-0/+3
	Recognize and ignore __rcu (in struct members), __private (in struct members), and __always_unused (in function parameters) to prevent kernel-doc warnings: Warning: include/linux/rethook.h:38 struct member 'void (__rcu handler' not described in 'rethook' Warning: include/linux/hrtimer_types.h:47 Invalid param: enum hrtimer_restart (__private function)(struct hrtimer *) Warning: security/ipe/hooks.c:81 function parameter '__always_unused' not described in 'ipe_mmap_file' Warning: security/ipe/hooks.c:109 function parameter '__always_unused' not described in 'ipe_file_mprotect' There are more of these (in compiler_types.h, compiler_attributes.h) that can be added as needed. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20251127063117.150384-1-rdunlap@infradead.org>
2025-11-29	selftests: bonding: add delay before each xvlan_over_bond connectivity check	Hangbin Liu	1	-0/+1
	Jakub reported increased flakiness in bond_macvlan_ipvlan.sh on regular kernel, while the tests consistently pass on a debug kernel. This suggests a timing-sensitive issue. To mitigate this, introduce a short sleep before each xvlan_over_bond connectivity check. The delay helps ensure neighbor and route cache have fully converged before verifying connectivity. The sleep interval is kept minimal since check_connection() is invoked nearly 100 times during the test. Fixes: 246af950b940 ("selftests: bonding: add macvlan over bond testing") Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20251114082014.750edfad@kernel.org Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20251127143310.47740-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-29	Merge tag 'nf-next-25-11-28' of ↵	Jakub Kicinski	1	-14/+112
	git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following batch contains Netfilter updates for net-next: 0) Add sanity check for maximum encapsulations in bridge vlan, reported by the new AI robot. 1) Move the flowtable path discovery code to its own file, the nft_flow_offload.c mixes the nf_tables evaluation with the path discovery logic, just split this in two for clarity. 2) Consolidate flowtable xmit path by using dev_queue_xmit() and the real device behind the layer 2 vlan/pppoe device. This allows to inline encapsulation. After this update, hw_ifidx can be removed since both ifidx and hw_ifidx now point to the same device. 3) Support for IPIP encapsulation in the flowtable, extend selftest to cover for this new layer 3 offload, from Lorenzo Bianconi. 4) Push down the skb into the conncount API to fix duplicates in the conncount list for packets with non-confirmed conntrack entries, this is due to an optimization introduced in d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC"). From Fernando Fernandez Mancera. 5) In conncount, disable BH when performing garbage collection to consolidate existing behaviour in the conncount API, also from Fernando. 6) A matching packet with a confirmed conntrack invokes GC if conncount reaches the limit in an attempt to release slots. This allows the existing extensions to be used for real conntrack counting, not just limiting new connections, from Fernando. 7) Support for updating ct count objects in nf_tables, from Fernando. 8) Extend nft_flowtables.sh selftest to send IPv6 TCP traffic, from Lorenzo Bianconi. 9) Fixes for UAPI kernel-doc documentation, from Randy Dunlap. * tag 'nf-next-25-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_tables: improve UAPI kernel-doc comments netfilter: ip6t_srh: fix UAPI kernel-doc comments format selftests: netfilter: nft_flowtable.sh: Add the capability to send IPv6 TCP traffic netfilter: nft_connlimit: add support to object update operation netfilter: nft_connlimit: update the count if add was skipped netfilter: nf_conncount: make nf_conncount_gc_list() to disable BH netfilter: nf_conncount: rework API to use sk_buff directly selftests: netfilter: nft_flowtable.sh: Add IPIP flowtable selftest netfilter: flowtable: Add IPIP tx sw acceleration netfilter: flowtable: Add IPIP rx sw acceleration netfilter: flowtable: use tuple address to calculate next hop netfilter: flowtable: remove hw_ifidx netfilter: flowtable: inline pppoe encapsulation in xmit path netfilter: flowtable: inline vlan encapsulation in xmit path netfilter: flowtable: consolidate xmit path netfilter: flowtable: move path discovery infrastructure to its own file netfilter: flowtable: check for maximum number of encapsulations in bridge vlan ==================== Link: https://patch.msgid.link/20251128002345.29378-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-29	tools: ynl: add a lint makefile target	Donald Hunter	1	-1/+3
	Add a lint target to run yamllint on the YNL specs. make -C tools/net/ynl lint make: Entering directory '/home/donaldh/net-next/tools/net/ynl' yamllint ../../../Documentation/netlink/specs/*.yaml ../../../Documentation/netlink/specs/ethtool.yaml 1272:21 warning truthy value should be one of [false, true] (truthy) make: Leaving directory '/home/donaldh/net-next/tools/net/ynl' Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20251127123502.89142-3-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-29	tools: ynl: add schema checking	Donald Hunter	2	-6/+35
	Add a --validate flag to pyynl for explicit schema check with error reporting and add a schema_check make target to check all YNL specs. make -C tools/net/ynl schema_check make: Entering directory '/home/donaldh/net-next/tools/net/ynl' ok 1 binder.yaml schema validation not ok 2 conntrack.yaml schema validation 'labels mask' does not match '^[0-9a-z-]+$' Failed validating 'pattern' in schema['properties']['attribute-sets']['items']['properties']['attributes']['items']['properties']['name']: {'type': 'string', 'pattern': '^[0-9a-z-]+$'} On instance['attribute-sets'][14]['attributes'][22]['name']: 'labels mask' ok 3 devlink.yaml schema validation [...] Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20251127123502.89142-2-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-29	bpf: Remove runqslower tool	Hoyeon Lee	9	-413/+4
	runqslower was added in commit 9c01546d26d2 "tools/bpf: Add runqslower tool to tools/bpf" as a BCC port to showcase early BPF CO-RE + libbpf workflows. runqslower continues to live in BCC (libbpf-tools), so there is no need to keep building and maintaining it. Drop tools/bpf/runqslower and remove all build hooks in tools/bpf and selftests accordingly. Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Link: https://lore.kernel.org/r/20251126093821.373291-1-hoyeon.lee@suse.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-29	selftests/bpf: Remove usage of lsm/file_alloc_security in selftest	Amery Hung	3	-7/+7
	file_alloc_security hook is disabled. Use other LSM hooks in selftests instead. Signed-off-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20251126202927.2584874-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-29	bpf: force BPF_F_RDONLY_PROG on insn array creation	Anton Protopopov	1	-1/+1
	The original implementation added a hack to check_mem_access() to prevent programs from writing into insn arrays. To get rid of this hack, enforce BPF_F_RDONLY_PROG on map creation. Also fix the corresponding selftest, as the error message changes with this patch. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20251128063224.1305482-2-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-28	vfio: selftests: Add vfio_pci_device_init_perf_test	David Matlack	2	-0/+171
	Add a new VFIO selftest for measuring the time it takes to run vfio_pci_device_init() in parallel for one or more devices. This test serves as manual regression test for the performance improvement of commit e908f58b6beb ("vfio/pci: Separate SR-IOV VF dev_set"). For example, when running this test with 64 VFs under the same PF: Before: $ ./vfio_pci_device_init_perf_test -r vfio_pci_device_init_perf_test.iommufd.init 0000:1a:00.0 0000:1a:00.1 ... ... Wall time: 6.653234463s Min init time (per device): 0.101215344s Max init time (per device): 6.652755941s Avg init time (per device): 3.377609608s After: $ ./vfio_pci_device_init_perf_test -r vfio_pci_device_init_perf_test.iommufd.init 0000:1a:00.0 0000:1a:00.1 ... ... Wall time: 0.122978332s Min init time (per device): 0.108121915s Max init time (per device): 0.122762761s Avg init time (per device): 0.113816748s This test does not make any assertions about performance, since any such assertion is likely to be flaky due to system differences and random noise. However this test can be fed into automation to detect regressions, and can be used by developers in the future to measure performance optimizations. Suggested-by: Aaron Lewis <aaronlewis@google.com> Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-19-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Eliminate INVALID_IOVA	David Matlack	4	-10/+13
	Eliminate INVALID_IOVA as there are platforms where UINT64_MAX is a valid iova. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-18-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Split libvfio.h into separate header files	David Matlack	6	-334/+381
	Split out the contents of libvfio.h into separate header files, but keep libvfio.h as the top-level include that all tests can use. Put all new header files into a libvfio/ subdirectory to avoid future name conflicts in include paths when libvfio is used by other selftests like KVM. No functional change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-17-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Move vfio_selftests_*() helpers into libvfio.c	David Matlack	3	-71/+79
	Move the vfio_selftests_*() helpers into their own file libvfio.c. These helpers have nothing to do with struct vfio_pci_device, so they don't make sense in vfio_pci_device.c. No functional change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-16-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Rename vfio_util.h to libvfio.h	David Matlack	11	-13/+13
	Rename vfio_util.h to libvfio.h to match the name of libvfio.mk. No functional change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-15-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Stop passing device for IOMMU operations	David Matlack	4	-65/+22
	Drop the struct vfio_pci_device wrappers for IOMMU map/unmap functions and require tests to directly call iommu_map(), iommu_unmap(), etc. This results in more concise code, and also makes it clear the map operations are happening on a struct iommu, not necessarily on a specific device, especially when multi-device tests are introduced. Do the same for iova_allocator_init() as that function only needs the struct iommu, not struct vfio_pci_device. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-14-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Move IOVA allocator into iova_allocator.c	David Matlack	3	-71/+95
	Move the IOVA allocator into its own file, to provide better separation between the allocator and the struct vfio_pci_device helper code. The allocator could go into iommu.c, but it is standalone enough that a separate file seems cleaner. This also continues the trend of having a .c for every major object in VFIO selftests (vfio_pci_device.c, vfio_pci_driver.c, iommu.c, and now iova_allocator.c). No functional change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-13-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Move IOMMU library code into iommu.c	David Matlack	4	-453/+527
	Move all the IOMMU related library code into their own file iommu.c. This provides a better separation between the vfio_pci_device helper code and the iommu code. No function change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-12-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Rename struct vfio_dma_region to dma_region	David Matlack	4	-21/+21
	Rename struct vfio_dma_region to dma_region. This is in preparation for separating the VFIO PCI device library code from the IOMMU library code. This name change also better reflects the fact that DMA mappings can be managed by either VFIO or IOMMUFD. i.e. the "vfio_" prefix is misleading. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-11-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>