kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message	Author	Files	Lines
2 days	selftests: net: Fix checksums in xdp_native	Nimrod Oren	1	-25/+30
	[ Upstream commit dfc077043351a81887d1e4c9ac244e9243f3cbf2 ] Data adjustment cases failed with "Data exchange failed" when using IPv4 because the program did not update the IP and UDP checksums in the IPv4 branch. The issue was masked when both IPv4 and IPv6 were configured, since the test harness prefers IPv6. While here, generalize csum_fold_helper() to fold twice so it works for any 32-bit input. Fixes: 0b65cfcef9c5 ("selftests: drv-net: Test tail-adjustment support") Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20260520153928.3371765-1-noren@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
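The "fold twice" remark can be illustrated with a minimal sketch. This is not the selftest's csum_fold_helper(); the name csum_fold32 is hypothetical, and the code only shows why one fold is insufficient for an arbitrary 32-bit accumulator.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the selftest's code): fold a 32-bit
 * one's-complement accumulator into a 16-bit checksum.
 */
uint16_t csum_fold32(uint32_t sum)
{
	/* First fold: the result can be as large as 0x1fffe, i.e. 17 bits. */
	sum = (sum & 0xffff) + (sum >> 16);
	/* Second fold: absorbs the carry left by the first one. */
	sum = (sum & 0xffff) + (sum >> 16);
	/* The checksum is the one's complement of the folded sum. */
	return (uint16_t)~sum;
}
```

With the input 0xffffffff, the first fold yields 0x1fffe, which no longer fits in 16 bits; the second fold brings it down to 0xffff before the final complement.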
2 days	selftests: ublk: cap nthreads to kernel's actual nr_hw_queues	Ming Lei	1	-0/+11
	[ Upstream commit 87d0740b7c4cc847be1b6f307ab6d8547cb1a726 ] dev->nthreads is derived from the user-requested queue count before the ADD command, but the kernel may reduce nr_hw_queues (capped to nr_cpu_ids). When the VM has fewer CPUs than requested queues, the daemon creates more handler threads than there are kernel queues. In non-batch mode, the extra threads access uninitialized queues (q_depth=0), submit zero io_uring SQEs, and block forever in io_cqring_wait. In batch mode, the extra threads cause similar hangs during device removal. In both cases, the stuck threads prevent the daemon from closing the char device, holding the last ublk_device reference and causing ublk_ctrl_del_dev() to hang in wait_event_interruptible(). Fix by capping dev->nthreads to the kernel-returned nr_hw_queues after the ADD command completes. per_io_tasks mode is excluded because threads interleave across all queues, so nthreads > nr_hw_queues is valid. Fixes: abe54c160346 ("selftests: ublk: kublk: decouple ublk_queues from ublk server threads") Signed-off-by: Ming Lei <tom.leiming@gmail.com> Link: https://patch.msgid.link/20260513101941.1373998-1-tom.leiming@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	selftests/mm: run_vmtests.sh: fix destructive tests invocation	Luiz Capitulino	1	-1/+1
	commit 3432cbb291aabf85f8af4b9d1ec37179168ff999 upstream. Destructive tests should be invoked with -d command-line option, but this won't work today since 'd' is missing in getopts command-line. This commit fixes it. Link: https://lore.kernel.org/214fd9e4-5398-4c26-859e-c982c2e277c3@redhat.com Fixes: f16ff3b692ad ("selftests/mm: run_vmtests.sh: add missing tests") Signed-off-by: Luiz Capitulino <luizcap@redhat.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: SeongJae Park <sj@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Liam R. Howlett <liam@infradead.org> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	mm/memory: fix spurious warning when unmapping device-private/exclusive pages	Alistair Popple	1	-0/+50
	commit be3f38d05cc5a7c3f13e51994c5dd043ab604d28 upstream. Device private and exclusive entries are only supported for anonymous folios. This condition is tested in __migrate_device_pages() and make_device_exclusive() using folio_test_anon(). However the unmap path tests this assumption using vma_is_anonymous(). This is wrong because whilst anonymous VMAs can only contain folios where folio_test_anon() is true the opposite relation does not hold. A folio for which folio_test_anon() is true does not imply vma_is_anonymous() is true. Such a condition can occur if for example a folio is part of a private filebacked mapping. In this case vma_is_anonymous() is false as the mapping is filebacked, but folio_test_anon() may be true, thus permitting devices to migrate the folio to device private memory. This can lead to the following spurious warnings during process teardown: [ 772.737706] ------------[ cut here ]------------ [ 772.739201] WARNING: mm/memory.c:1754 at unmap_page_range.cold+0x26/0x18a, CPU#17: hmm-tests/2041 [ 772.742050] Modules linked in: test_hmm nvidia_uvm(O) nvidia(O) [ 772.743959] CPU: 17 UID: 0 PID: 2041 Comm: hmm-tests Tainted: G W O 7.0.0+ #387 PREEMPT(full) [ 772.747104] Tainted: [W]=WARN, [O]=OOT_MODULE [ 772.748509] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014 [ 772.752117] RIP: 0010:unmap_page_range.cold+0x26/0x18a [ 772.753780] Code: 7e fe ff ff 48 89 4c 24 78 4c 89 44 24 38 e8 f2 ff b1 00 48 8b 4c 24 78 4c 8b 44 24 38 48 8b 44 24 18 48 83 78 48 00 74 04 90 <0f> 0b 90 48 89 ca b8 ff ff 37 00 48 c1 ea 03 48 c1 e0 2a 80 3c 02 [ 772.759602] RSP: 0018:ffff888112607550 EFLAGS: 00010286 [ 772.761310] RAX: ffff88811bbf4dc0 RBX: dffffc0000000000 RCX: ffffea03e9bfffd8 [ 772.763583] RDX: 1ffff1102377e9c1 RSI: 0000000000000008 RDI: ffff88811bbf4e08 [ 772.765914] RBP: 0000000000000006 R08: ffff8881059f7448 R09: ffffed10224c0e68 [ 772.768184] R10: ffff888112607347 R11: 0000000000000001 R12: 0000000000000001 [ 772.770461] R13: ffffea03e9bfffc0 R14: ffff888112607908 R15: ffffea03e9bfffc0 [ 772.772782] FS: 00007f327caa2780(0000) GS:ffff888427b7d000(0000) knlGS:0000000000000000 [ 772.775328] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 772.777187] CR2: 00007f327ca89000 CR3: 00000001994d5000 CR4: 00000000000006f0 [ 772.779135] Call Trace: [ 772.779792] <TASK> [ 772.780317] ? dmirror_interval_invalidate+0x1a3/0x290 [test_hmm] [ 772.781873] ? vm_normal_page_pud+0x2b0/0x2b0 [ 772.782992] ? __rwlock_init+0x150/0x150 [ 772.784006] ? lock_release+0x216/0x2b0 [ 772.785008] ? __mmu_notifier_invalidate_range_start+0x505/0x6e0 [ 772.786522] ? lock_release+0x216/0x2b0 [ 772.787498] ? unmap_single_vma+0xb6/0x210 [ 772.788573] unmap_vmas+0x27d/0x520 [ 772.789506] ? unmap_single_vma+0x210/0x210 [ 772.790607] ? mas_update_gap.part.0+0x620/0x620 [ 772.791834] unmap_region+0x19e/0x350 [ 772.792769] ? remove_vma+0x130/0x130 [ 772.793684] ? mas_alloc_nodes+0x1f2/0x300 [ 772.794730] vms_complete_munmap_vmas+0x8c1/0xe20 [ 772.795926] ? unmap_region+0x350/0x350 [ 772.796917] do_vmi_align_munmap+0x36a/0x4e0 [ 772.798018] ? lock_release+0x216/0x2b0 [ 772.799024] ? vma_shrink+0x620/0x620 [ 772.799983] do_vmi_munmap+0x150/0x2c0 [ 772.800939] __vm_munmap+0x161/0x2c0 [ 772.801872] ? expand_downwards+0xd60/0xd60 [ 772.802948] ? clockevents_program_event+0x1ef/0x540 [ 772.804217] ? lock_release+0x216/0x2b0 [ 772.805158] __x64_sys_munmap+0x59/0x80 [ 772.805776] do_syscall_64+0xfc/0x670 [ 772.806336] ? irqentry_exit+0xda/0x580 [ 772.806976] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 772.807772] RIP: 0033:0x7f327cbb2717 [ 772.808323] Code: 73 01 c3 48 8b 0d f9 76 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c9 76 0d 00 f7 d8 64 89 01 48 [ 772.811337] RSP: 002b:00007ffde7f57d38 EFLAGS: 00000202 ORIG_RAX: 000000000000000b [ 772.812564] RAX: ffffffffffffffda RBX: 00007f327cc9c000 RCX: 00007f327cbb2717 [ 772.813733] RDX: 0000000000000000 RSI: 0000000000400000 RDI: 00007f327c289000 [ 772.814867] RBP: 0000000000421360 R08: 000000000000001a R09: 0000000000000000 [ 772.815991] R10: 0000000000000003 R11: 0000000000000202 R12: 00007ffde7f57d74 [ 772.817121] R13: 00007f327c689010 R14: 0000000000100000 R15: 00007f327c289000 [ 772.818272] </TASK> [ 772.818614] irq event stamp: 0 [ 772.819159] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [ 772.820174] hardirqs last disabled at (0): [<ffffffff82a57ab3>] copy_process+0x19f3/0x6440 [ 772.821511] softirqs last enabled at (0): [<ffffffff82a57b00>] copy_process+0x1a40/0x6440 [ 772.822869] softirqs last disabled at (0): [<0000000000000000>] 0x0 [ 772.823871] ---[ end trace 0000000000000000 ]--- Fix this by using the same check for folio_test_anon() in zap_nonpresent_ptes(). Also add a hmm-test case for this. Link: https://lore.kernel.org/20260501065116.2057242-1-apopple@nvidia.com Fixes: 999dad824c39 ("mm/shmem: persist uffd-wp bit across zapping for file-backed") Signed-off-by: Alistair Popple <apopple@nvidia.com> Reported-by: Arsen Arsenović <aarsenovic@baylibre.com> Reviewed-by: Balbir Singh <balbirs@nvidia.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam R. Howlett <liam@infradead.org> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 days	selftests/bpf: Remove test_access_variable_array	Venkat Rao Bagalkote	2	-35/+0
	commit aacee214d57636fa1f63007c65f333b5ea75a7a0 upstream. test_access_variable_array relied on accessing struct sched_domain::span to validate variable-length array handling via BTF. Recent scheduler refactoring removed or hid this field, causing the test to fail to build. Given that this test depends on internal scheduler structures that are subject to refactoring, and equivalent variable-length array coverage already exists via bpf_testmod-based tests, remove test_access_variable_array entirely. Link: https://lore.kernel.org/all/177434340048.1647592.8586759362906719839.tip-bot2@tip-bot2/ Signed-off-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Tested-by: Naveen Kumar Thummalapenta <naveen66@linux.ibm.com> Link: https://lore.kernel.org/r/20260410105404.91126-1-venkat88@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 days	rtla: Fix parse_cpu_set() bug introduced by strtoi()	Costa Shulyupin	1	-6/+4
	[ Upstream commit 6ea8a206108fe8b5940c2797afc54ae9f5a7bbdd ] The patch 'Replace atoi() with a robust strtoi()' introduced a bug in parse_cpu_set(), which relies on partial parsing of the input string. The function parses CPU specifications like '0-3,5' by incrementing a pointer through the string. strtoi() rejects strings with trailing characters, causing parse_cpu_set() to fail on any CPU list with multiple entries. Restore the original use of atoi() in parse_cpu_set(). Fixes: 7e9dfccf8f11 ("rtla: Replace atoi() with a robust strtoi()") Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20260112192642.212848-2-costa.shul@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
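To see why a CPU-list parser depends on prefix parsing, consider the sketch below. It is not rtla's parse_cpu_set(); the function name, the 64-CPU bitmask, and the error convention are assumptions made for illustration. The key point is that atoi() stops at the first non-digit ('-' or ','), which is exactly what a walk over "0-3,5" needs, whereas a parser that rejects trailing characters would fail on the very first token.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical sketch: parse a list such as "0-3,5" into a bitmask
 * covering CPUs 0..63. Returns 0 on success, -1 on malformed input.
 * Note: atoi() has no overflow detection, so very long digit runs
 * are not handled safely here.
 */
int parse_cpu_list(const char *s, unsigned long long *mask)
{
	const char *p = s;

	*mask = 0;
	while (*p) {
		int first, last, cpu;

		if (!isdigit((unsigned char)*p))
			return -1;
		first = atoi(p);		/* reads "0" out of "0-3,5" */
		while (isdigit((unsigned char)*p))
			p++;			/* step past the digits just read */

		last = first;
		if (*p == '-') {
			p++;
			if (!isdigit((unsigned char)*p))
				return -1;
			last = atoi(p);		/* end of the range */
			while (isdigit((unsigned char)*p))
				p++;
		}
		if (first > last || last > 63)
			return -1;
		for (cpu = first; cpu <= last; cpu++)
			*mask |= 1ULL << cpu;

		if (*p == ',') {
			p++;
			if (!*p)
				return -1;	/* trailing comma */
		} else if (*p) {
			return -1;		/* unexpected character */
		}
	}
	return 0;
}
```

For "0-3,5" the sketch sets bits 0, 1, 2, 3 and 5, giving the mask 0x2f.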
11 days	kselftest/arm64: Include <asm/ptrace.h> for user_gcs definition	Leo Yan	2	-6/+1
	[ Upstream commit bb7235e226888607e6aac1288062fcb1ac105589 ] kselftest includes kernel uAPI headers with option: -isystem $(top_srcdir)/usr/include Include <asm/ptrace.h> in libc-gcs.c for the definition of struct user_gcs from the uAPI headers, and remove the redundant definition in gcs-util.h. This fixes a compilation error on systems where the toolchain defines NT_ARM_GCS. Fixes: a505a52b4e29 ("kselftest/arm64: Add a GCS test program built with the system libc") Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	tools/power turbostat: Fix unrecognized option '-P'	David Arcari	1	-1/+1
	[ Upstream commit ce012c966b518c53475ba9a4e979242d7322d819 ] The '-P' short option (shorthand for --no-perf) is not present in the optstring of the second call to getopt_long_only(). This results in the "unrecognized option" error when the tool reaches the main parsing loop. Add 'P' to the second getopt_long_only() call to ensure it is consistently recognized. Fixes: a0e86c90b83c ("tools/power turbostat: Add --no-perf option") Signed-off-by: David Arcari <darcari@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	tools/power turbostat: Fix and document --header_iterations	Len Brown	2	-12/+12
	[ Upstream commit 96718ad296af4a6d984b3a09276b165ab6a3b0c8 ] The "header_iterations" option is commonly used to de-clutter the screen of redundant header label rows in an interactive session: Eg. every 10 rows: $ sudo turbostat --header_iterations 10 -S -q -i 1 But --header_iterations was missing from turbostat.8 Also turbostat help advertised the "-N" short option that did not actually work: $ turbostat --help -N, --header_iterations num print header every num iterations Repair "-N" Document "--header_iterations" on turbostat.8 Signed-off-by: Len Brown <len.brown@intel.com> Stable-dep-of: ce012c966b51 ("tools/power turbostat: Fix unrecognized option '-P'") Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	tools/power turbostat: Use strtoul() for iteration parsing	Kaushlendra Kumar	1	-6/+8
	[ Upstream commit 8e5c0cc326f2e95a71bb6e6063e65caa60c8f951 ] Replace strtod() with strtoul() and check errno for -n/-N options, since num_iterations and header_iterations are unsigned long counters. Reject zero and conversion errors; negative inputs wrap to large positive values per standard unsigned semantics. Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Stable-dep-of: ce012c966b51 ("tools/power turbostat: Fix unrecognized option '-P'") Signed-off-by: Sasha Levin <sashal@kernel.org>
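The parsing rule described above can be sketched as follows. This is an illustration rather than turbostat's code: the function name, the base argument of 0, and the additional "no digits consumed" check are assumptions. It shows the errno test, the rejection of zero, and the fact that strtoul() accepts a leading minus sign and returns the negated value as an unsigned number.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical sketch: parse an iteration count.
 * Returns 0 and stores the value on success, -1 on error or zero.
 */
int parse_iterations(const char *arg, unsigned long *out)
{
	char *end;
	unsigned long val;

	errno = 0;			/* strtoul() only sets errno on failure */
	val = strtoul(arg, &end, 0);
	if (errno != 0)			/* e.g. ERANGE on overflow */
		return -1;
	if (end == arg)			/* no digits were consumed */
		return -1;
	if (val == 0)			/* a zero count is rejected */
		return -1;

	*out = val;
	return 0;
}
```

Under this sketch "10" yields 10, "0" and "abc" are rejected, and "-1" is accepted as ULONG_MAX, matching the wrap-around behaviour the commit message mentions.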
11 days	tools/power turbostat.8: Document the "--force" option	Len Brown	1	-3/+7
	[ Upstream commit 785953cf6e63aa5a9fcdfa9577b1411e0281c4bc ] Starting in turbostat v2025.01.14, turbostat refused to run on unsupported hardware, pointing to "RUN THE LATEST VERSION" on turbostat(8). At that time, turbostat supported and advertised the "--force" parameter to run anyway (with unsupported results). Also document "--force" on turbostat.8. Signed-off-by: Len Brown <len.brown@intel.com> Stable-dep-of: ce012c966b51 ("tools/power turbostat: Fix unrecognized option '-P'") Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	tcp: send a challenge ACK on SEG.ACK > SND.NXT	Jiayuan Chen	1	-1/+3
	[ Upstream commit 42726ec644cbdde0035c3e0417fee8ed9547e120 ] RFC 5961 Section 5.2 validates an incoming segment's ACK value against the range [SND.UNA - MAX.SND.WND, SND.NXT] and states: "All incoming segments whose ACK value doesn't satisfy the above condition MUST be discarded and an ACK sent back." Commit 354e4aa391ed ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation") opted Linux into this mitigation and implements the challenge ACK on the lower side (SEG.ACK < SND.UNA - MAX.SND.WND), but the symmetric upper side (SEG.ACK > SND.NXT) still takes the pre-RFC-5961 path and silently returns SKB_DROP_REASON_TCP_ACK_UNSENT_DATA, even though RFC 793 Section 3.9 (now RFC 9293 Section 3.10.7.4) has always required: "If the ACK acknowledges something not yet sent (SEG.ACK > SND.NXT) then send an ACK, drop the segment, and return." Complete the mitigation by sending a challenge ACK on that branch, reusing the existing tcp_send_challenge_ack() path which already enforces the per-socket RFC 5961 Section 7 rate limit via __tcp_oow_rate_limited(). FLAG_NO_CHALLENGE_ACK is honoured for symmetry with the lower-edge case. Update the existing tcp_ts_recent_invalid_ack.pkt selftest, which drives this exact path, to consume the new challenge ACK. Fixes: 354e4aa391ed ("tcp: RFC 5961 5.2 Blind Data Injection Attack Mitigation") Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260422123605.320000-2-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
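The acceptable-ACK range quoted from RFC 5961 can be expressed as a small check using wrap-safe 32-bit sequence comparisons. This is a sketch of the RFC rule only, not the kernel's tcp_ack(); the function and enum names are hypothetical, and rate limiting and FLAG_NO_CHALLENGE_ACK handling are left out.

```c
#include <stdbool.h>
#include <stdint.h>

enum ack_verdict { ACK_ACCEPT, ACK_CHALLENGE };

/* Wrap-safe comparison: true if sequence number a precedes b. */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * Hypothetical sketch of the RFC 5961 Section 5.2 range check:
 * accept only SND.UNA - MAX.SND.WND <= SEG.ACK <= SND.NXT.
 */
enum ack_verdict check_ack(uint32_t seg_ack, uint32_t snd_una,
			   uint32_t snd_nxt, uint32_t max_snd_wnd)
{
	/* Upper edge: the ACK covers data that was never sent. */
	if (seq_before(snd_nxt, seg_ack))
		return ACK_CHALLENGE;
	/* Lower edge: the ACK is older than the largest window allows. */
	if (seq_before(seg_ack, snd_una - max_snd_wnd))
		return ACK_CHALLENGE;
	return ACK_ACCEPT;
}
```

Both out-of-range edges lead to the same verdict in this sketch, which mirrors the symmetry the commit restores.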
11 days	perf util: Kill die() prototype, dead for a long time	Arnaldo Carvalho de Melo	1	-1/+0
	[ Upstream commit e5cce1b9c82fbd48e2f1f7a25a9fad8ee228176f ] In fef2a735167a827a ("perf tools: Kill die()") the die() function was removed, but not the prototype in util.h, now when building with LIBPERL=1, during a 'make -C tools/perf build-test' routine test, it is failing as perl likes die() calls and then this clashes with this remnant, remove it. Fixes: fef2a735167a827a ("perf tools: Kill die()") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	perf maps: Fix copy_from that can break sorted by name order	Ian Rogers	1	-10/+3
	[ Upstream commit f552b132e4d5248715828e7e5c2bf7889bf05b2e ] When a parent is copied into a child, the name array is populated in address order, not name order. Make sure the name array isn't flagged as sorted. Fixes: 659ad3492b91 ("perf maps: Switch from rbtree to lazily sorted array for addresses") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	perf maps: Fix fixup_overlap_and_insert that can break sorted by name order	Ian Rogers	1	-0/+2
	[ Upstream commit c4f3ff3289380437d26177e8f2fe4b7507816ee3 ] When an entry in the address array is replaced, the corresponding name entry is replaced. The entries names may sort differently and so it is important that the sorted by name property be cleared on the maps. Fixes: 0d11fab32714 ("perf maps: Fixup maps_by_name when modifying maps_by_address") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	perf cgroup: Update metric leader in evlist__expand_cgroup	Ian Rogers	1	-7/+23
	[ Upstream commit c9ef786c0970991578397043f1c819229e2b7197 ] When the evlist is expanded the metric leader wasn't being updated. As the original evsel is deleted this creates a use-after-free in stat-shadow's prepare_metric. This was detected running the "perf stat --bpf-counters --for-each-cgroup test" with sanitizers. The change itself puts the copied evsel into the priv field (known unused because of evsel__clone use) and then in a second pass over the list updates the copied values using the priv pointer. Fixes: d1c5a0e86a4e ("perf stat: Add --for-each-cgroup option") Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Sun Jian <sun.jian.kdev@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	perf expr: Return -EINVAL for syntax error in expr__find_ids()	Leo Yan	1	-1/+2
	[ Upstream commit 3a61fd866ef9aaa1d3158b460f852b74a2df07f4 ] expr__find_ids() propagates the parser return value directly. For syntax errors, the parser can return a positive value, but callers treat it as success, e.g., for below case on Arm64 platform: metric expr 100 * (STALL_SLOT_BACKEND / (CPU_CYCLES * #slots) - BR_MIS_PRED * 3 / CPU_CYCLES) for backend_bound parsing metric: 100 * (STALL_SLOT_BACKEND / (CPU_CYCLES * #slots) - BR_MIS_PRED * 3 / CPU_CYCLES) Failure to read '#slots' literal: #slots = nan syntax error Convert positive parser returns in expr__find_ids() to -EINVAL, as a result, the error value will be respected by callers. Before: perf stat -C 5 Failure to read '#slots'Failure to read '#slots'Failure to read '#slots'Failure to read '#slots'Segmentation fault After: perf stat -C 5 Failure to read '#slots'Cannot find metric or group `Default' Fixes: ded80bda8bc9 ("perf expr: Migrate expr ids table to a hashmap") Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	perf tools: Fix module symbol resolution for non-zero .text sh_addr	Chuck Lever	1	-2/+6
	[ Upstream commit 9a82bfde4775b7a87cd1a7e791f46f83ae442848 ] When perf resolves symbols from kernel module ELF files (ET_REL), it converts symbol addresses to file offsets so that sample IPs can be matched to the correct symbol. The conversion adjusts each symbol's st_value: sym->st_value -= shdr->sh_addr - shdr->sh_offset; For vmlinux (ET_EXEC), st_value is a virtual address and sh_addr is the section's virtual base, so subtracting sh_addr and adding sh_offset correctly yields a file offset. For kernel modules (ET_REL), st_value is a section-relative offset. The module loader ignores sh_addr entirely and places symbols at module_base + st_value. Converting to file offset requires only adding sh_offset; subtracting sh_addr introduces an error equal to sh_addr bytes. When .text has sh_addr == 0 -- the historical norm for simple modules -- both formulas produce the same result and the bug is latent. As modules gain more metadata sections before .text (.note, .static_call.text, etc.), the linker assigns .text a non-zero sh_addr, exposing the defect. For example, nfsd.ko on this kernel has sh_addr=0xa80, kvm-intel.ko has sh_addr=0x1e90. The effect is that all .text symbols in affected modules shift by sh_addr bytes relative to sample IPs, causing perf report to attribute samples to incorrect, nearby symbols. This was observed as 13% of LLC-load-miss samples misattributed to nfsd_file_get_dio_attrs when the actual hot function was nfsd_cache_lookup, approximately 0xa80 bytes away in the symbol table. Use the existing dso__rel() flag (already set for ET_REL modules) to select the correct adjustment: add sh_offset for ET_REL, subtract (sh_addr - sh_offset) for ET_EXEC/ET_DYN. 
Fixes: 0131c4ec794a ("perf tools: Make it possible to read object code from kernel modules") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
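The two adjustments described in this commit message can be compared side by side in a short sketch. The function name and the boolean parameter are hypothetical stand-ins (the commit itself keys off dso__rel()); only the arithmetic is taken from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical sketch: convert a symbol's st_value to a file offset.
 * For ET_REL objects st_value is section-relative, so only sh_offset
 * is added. For ET_EXEC/ET_DYN st_value is a virtual address, so
 * (sh_addr - sh_offset) is subtracted.
 */
uint64_t sym_file_offset(uint64_t st_value, uint64_t sh_addr,
			 uint64_t sh_offset, bool is_rel)
{
	if (is_rel)
		return st_value + sh_offset;
	return st_value - (sh_addr - sh_offset);
}
```

Taking the commit's nfsd.ko figure of sh_addr=0xa80 with an assumed sh_offset of 0x40, a symbol at section offset 0x100 maps to file offset 0x140; applying the ET_EXEC formula to the same symbol would be off by exactly 0xa80 bytes.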
11 days	perf stat: Fix opt->value type for parse_cache_level	Ian Rogers	1	-20/+23
	[ Upstream commit 44311ae84ad9177fb311aee856027861c22f17b2 ] Commit f5803651b4a4 ("perf stat: Choose the most disaggregate command line option") changed aggregation option handling for `perf stat` but not `perf stat report` leading to parse_cache_level being passed a struct in the `perf stat` case but erroneously an aggr_mode enum value for `perf stat report`. Change the `perf stat report` aggregation handling to use the same opt_aggr_mode as `perf stat`. Also, just pass the boolean for consistency with other boolean argument handling. Fixes: f5803651b4a4 ("perf stat: Choose the most disaggregate command line option") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	perf lock: Fix option value type in parse_max_stack	Ian Rogers	1	-1/+1
	[ Upstream commit cfaade34b52aa1ec553044255702c4b31b57c005 ] The value is a void* and the address of an int, max_stack_depth, is set up in the perf lock options. The parse_max_stack function treats the int* as a long, make this more correct by declaring the value to be an int. Fixes: 0a277b622670 ("perf lock contention: Check --max-stack option") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	perf: tools: cs-etm: Fix print issue for Coresight debug in ETE/TRBE trace	Mike Leach	1	-38/+13
	[ Upstream commit 6c478e7b3eba3f387a2d6c749e3e3ee0f8ad1c53 ] Building perf with CORESIGHT=1 and the optional CSTRACE_RAW=1 enables additional debug printing of raw trace data when using command:- perf report --dump. This raw trace prints the CoreSight formatted trace frames, which may be used to investigate suspected issues with trace quality / corruption / decode. These frames are not present in ETE + TRBE trace. This fix removes the unnecessary call to print these frames. This fix also rationalises implementation - original code had helper function that unnecessarily repeated initialisation calls that had already been made. Due to an additional fault with the OpenCSD library, this call when ETE/TRBE are being decoded will cause a segfault in perf. This fix also prevents that problem for perf using older (<= 1.8.0 version) OpenCSD libraries. Fixes: 68ffe3902898 ("perf tools: Add decoder mechanic to support dumping trace data") Reported-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Mike Leach <mike.leach@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	perf branch: Avoid incrementing NULL	Ian Rogers	1	-0/+3
	[ Upstream commit c969a9d7bbf46f983c4a48566b3b2f7340b02296 ] If the entry is NULL the value is meaningless, so return NULL early to avoid an increment of NULL. This was happening in calls from has_stitched_lbr when running the "perf record LBR tests". The return value isn't used in that case, so returning NULL has no effect. Fixes: 42bbabed09ce ("perf tools: Add hw_idx in struct branch_stack") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	perf trace: Avoid an ERR_PTR in syscall_stats	Ian Rogers	1	-5/+7
	[ Upstream commit d05073adda0f047e9b2115a2932bcb2797eab238 ] hashmap__new may return an ERR_PTR and previously this would be assigned to syscall_stats meaning all use of syscall_stats needs to test for NULL (uninitialized) or an ERR_PTR. Given the only reason hashmap__new can fail is ENOMEM, just use NULL to indicate the allocation failure and avoid the code having to test for NULL and IS_ERR. Fixes: 96f202eab813 (perf trace: Fix IS_ERR() vs NULL check bug) Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
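The difference between the two failure conventions can be shown with a userspace sketch of the ERR_PTR idiom. Everything here is illustrative: err_ptr(), is_err(), table_new() and table_new_or_null() are hypothetical names, with table_new() standing in for an allocator that, like hashmap__new(), reports failure as an encoded error pointer. The wrapper normalises that to NULL so callers need a single check.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer (kernel-style idiom). */
void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer lies in the top MAX_ERRNO addresses. */
int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocator that reports failure as an error pointer. */
void *table_new(size_t size, int simulate_failure)
{
	void *p = simulate_failure ? NULL : malloc(size);

	return p ? p : err_ptr(-ENOMEM);
}

/* Wrapper: collapse the error pointer to NULL for simpler callers. */
void *table_new_or_null(size_t size, int simulate_failure)
{
	void *p = table_new(size, simulate_failure);

	return is_err(p) ? NULL : p;
}
```

Note that NULL itself is not an error pointer under this encoding, which is why storing the raw return value forces callers to test for both states.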
11 days	perf trace: Fix IS_ERR() vs NULL check bug	wangguangju	1	-1/+1
	[ Upstream commit 96f202eab8133f94479b14a32902c636e9bdf6af ] The alloc_syscall_stats() function always returns an error pointer (ERR_PTR) on failure. So replace NULL check with IS_ERR() check after calling delete_syscall_stats() function. Fixes: ef2da619b132c6f74 ("perf trace: Convert syscall_stats to hashmap") Signed-off-by: wangguangju <wangguangju@hygon.cn> Reviewed-by: Howard Chu <howardchu95@gmail.com> Acked-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	libbpf: Prevent double close and leak of btf objects	Jiri Olsa	1	-10/+11
	[ Upstream commit 380044c40b1636a72fd8f188b5806be6ae564279 ] Sashiko found a possible double close of the btf object fd [1], which happens when strdup in load_module_btfs fails, at which point obj->btf_module_cnt is already incremented. The error path closes the btf fd and so does the later cleanup code in the bpf_object_post_load_cleanup function. Also, a libbpf_ensure_mem failure leaves the btf object unassigned and it is leaked. Replace the err_out label with break to make the error path less confusing, as suggested by Alan. Increment obj->btf_module_cnt only if there is no failure and release the btf object in the error path. Fixes: 91abb4a6d79d ("libbpf: Support attachment of BPF tracing programs to kernel modules") [1] https://sashiko.dev/#/patchset/20260324081846.2334094-1-jolsa%40kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20260416100034.1610852-1-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	bpf: allow UTF-8 literals in bpf_bprintf_prepare()	Yihan Ding	1	-1/+2
	[ Upstream commit b960430ea8862ef37ce53c8bf74a8dc79d3f2404 ] bpf_bprintf_prepare() only needs ASCII parsing for conversion specifiers. Plain text can safely carry bytes >= 0x80, so allow UTF-8 literals outside '%' sequences while keeping ASCII control bytes rejected and format specifiers ASCII-only. This keeps existing parsing rules for format directives unchanged, while allowing helpers such as bpf_trace_printk() to emit UTF-8 literal text. Update test_snprintf_negative() in the same commit so selftests keep matching the new plain-text vs format-specifier split during bisection. Fixes: 48cac3f4a96d ("bpf: Implement formatted output helpers with bstr_printf") Signed-off-by: Yihan Ding <dingyihan@uniontech.com> Acked-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/20260416120142.1420646-2-dingyihan@uniontech.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	rtla/utils: Fix resource leak in set_comm_sched_attr()	Wander Lairson Costa	1	-5/+6
	[ Upstream commit 5b6dc659ad792c72b3ff1be8039ae2945e030928 ] The set_comm_sched_attr() function opens the /proc directory via opendir() but fails to call closedir() on its successful exit path. If the function iterates through all processes without error, it returns 0 directly, leaking the DIR stream pointer. Fix this by refactoring the function to use a single exit path. A retval variable is introduced to track the success or failure status. All exit points now jump to a unified out label that calls closedir() before the function returns, ensuring the resource is always freed. Fixes: dada03db9bb19 ("rtla: Remove procps-ng dependency") Signed-off-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20260309195040.1019085-18-wander@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	rtla: Replace atoi() with a robust strtoi()	Wander Lairson Costa	3	-8/+41
	[ Upstream commit 7e9dfccf8f11c26208211457c4597a466135b56a ] The atoi() function does not perform error checking, which can lead to undefined behavior when parsing invalid or out-of-range strings. This can cause issues when parsing user-provided numerical inputs, such as signal numbers, PIDs, or CPU lists. To address this, introduce a new strtoi() helper function that safely converts a string to an integer. This function validates the input and checks for overflows, returning a negative value on failure. Replace all calls to atoi() with the new strtoi() function and add proper error handling to make the parsing more robust and prevent potential issues. Signed-off-by: Wander Lairson Costa <wander@redhat.com> Link: https://lore.kernel.org/r/20260106133655.249887-5-wander@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Stable-dep-of: 5b6dc659ad79 ("rtla/utils: Fix resource leak in set_comm_sched_attr()") Signed-off-by: Sasha Levin <sashal@kernel.org>
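A strict string-to-int helper of the kind described could look like the sketch below. The actual signature of rtla's strtoi() is not shown in this log, so the name strict_strtoi, the out-parameter, and the 0/-1 return convention are assumptions. The sketch rejects empty input, trailing characters, and values outside the int range.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical sketch: convert the whole string to an int.
 * Returns 0 and stores the value on success, -1 on any failure.
 */
int strict_strtoi(const char *s, int *res)
{
	char *end;
	long val;

	if (*s == '\0')
		return -1;		/* empty input */

	errno = 0;
	val = strtol(s, &end, 10);
	if (errno != 0)
		return -1;		/* out of range for long */
	if (end == s || *end != '\0')
		return -1;		/* no digits, or trailing characters */
	if (val < INT_MIN || val > INT_MAX)
		return -1;		/* does not fit in int */

	*res = (int)val;
	return 0;
}
```

The trailing-character check is what makes such a helper unsuitable for callers that deliberately parse only a numeric prefix, for example a range like "0-3".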
11 days	rtla: Fix -C/--cgroup interface	Ivan Pravdin	6	-76/+55
	[ Upstream commit 7b71f3a6986c93defbb72bb6c143e04122720cb1 ] Currently, user can only specify cgroup to the tracer's thread the following ways: `-C[cgroup]` `-C[=cgroup]` `--cgroup[=cgroup]` If user tries to specify cgroup as `-C [cgroup]` or `--cgroup [cgroup]`, the parser silently fails and rtla's cgroup is used for the tracer threads. To make interface more user-friendly, allow user to specify cgroup in the aforementioned way, i.e. `-C [cgroup]` and `--cgroup [cgroup]`. Refactor identical logic between -t/--trace and -C/--cgroup into a common function. Change documentation to reflect this user interface change. Fixes: a957cbc02531 ("rtla: Add -C cgroup support") Signed-off-by: Ivan Pravdin <ipravdin.official@gmail.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Link: https://lore.kernel.org/r/16132f1565cf5142b5fbd179975be370b529ced7.1762186418.git.ipravdin.official@gmail.com [ use capital letter in subject, as required by tracing subsystem ] Signed-off-by: Tomas Glozar <tglozar@redhat.com> Stable-dep-of: 5b6dc659ad79 ("rtla/utils: Fix resource leak in set_comm_sched_attr()") Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	ktest: Run POST_KTEST hooks on failure and cancellation	Ricardo B. Marlière	1	-5/+22
	[ Upstream commit bc6e165a452da909cef0efbc286e6695624db372 ] PRE_KTEST can be useful for setting up the environment and POST_KTEST for tearing it down; however, POST_KTEST only runs on the normal end-of-run path. It is skipped when ktest exits through dodie() or cancel_test(), so final cleanup hooks never run on those paths. Factor the final hook execution into run_post_ktest(), call it from the normal exit path and from the early exit paths, and guard it so the hook runs at most once. Cc: John Hawley <warthog9@eaglescrag.net> Cc: Andrea Righi <arighi@nvidia.com> Cc: Marcos Paulo de Souza <mpdesouza@suse.com> Cc: Matthieu Baerts <matttbe@kernel.org> Cc: Fernando Fernandez Mancera <fmancera@suse.de> Cc: Pedro Falcato <pfalcato@suse.de> Link: https://patch.msgid.link/20260307-ktest-fixes-v1-8-565d412f4925@suse.com Fixes: 921ed4c7208e ("ktest: Add PRE/POST_KTEST and TEST options") Signed-off-by: Ricardo B. Marlière <rbm@suse.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	ktest: Honor empty per-test option overrides	Ricardo B. Marlière	1	-2/+4
	[ Upstream commit a2de57a3c8192dcd67cccaff6c341b93748d799b ] A per-test override can clear an inherited default option by assigning an empty value, but __set_test_option() still used option_defined() to decide whether a per-test key existed. That turned an empty per-test assignment back into "fall back to the default", so tests still could not clear inherited settings. For example: DEFAULTS (...) LOG_FILE = /tmp/ktest-empty-override.log CLEAR_LOG = 1 ADD_CONFIG = /tmp/.config TEST_START TEST_TYPE = build BUILD_TYPE = nobuild ADD_CONFIG = This would run the test with ADD_CONFIG[1] = /tmp/.config Fix by checking whether the per-test key exists before falling back. If it does exist but is empty, treat it as unset for that test and stop the fallback chain there. Cc: John Hawley <warthog9@eaglescrag.net> Cc: Andrea Righi <arighi@nvidia.com> Cc: Marcos Paulo de Souza <mpdesouza@suse.com> Cc: Matthieu Baerts <matttbe@kernel.org> Cc: Fernando Fernandez Mancera <fmancera@suse.de> Cc: Pedro Falcato <pfalcato@suse.de> Link: https://patch.msgid.link/20260307-ktest-fixes-v1-4-565d412f4925@suse.com Fixes: 22c37a9ac49d ("ktest: Allow tests to undefine default options") Signed-off-by: Ricardo B. Marlière <rbm@suse.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	ktest: Avoid undef warning when WARNINGS_FILE is unset	Ricardo B. Marlière	1	-1/+1
	[ Upstream commit 057854f8a595160656fe77ed7bf0d2403724b915 ] check_buildlog() probes $warnings_file with -f even when WARNINGS_FILE is not configured. Perl warns about the uninitialized value and adds noise to the test log, which can hide the output we actually care about. Check that WARNINGS_FILE is defined before testing whether the file exists. Cc: John Hawley <warthog9@eaglescrag.net> Cc: Andrea Righi <arighi@nvidia.com> Cc: Marcos Paulo de Souza <mpdesouza@suse.com> Cc: Matthieu Baerts <matttbe@kernel.org> Cc: Fernando Fernandez Mancera <fmancera@suse.de> Cc: Pedro Falcato <pfalcato@suse.de> Link: https://patch.msgid.link/20260307-ktest-fixes-v1-1-565d412f4925@suse.com Fixes: 4283b169abfb ("ktest: Add make_warnings_file and process full warnings") Signed-off-by: Ricardo B. Marlière <rbm@suse.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	selftest: memcg: skip memcg_sock test if address family not supported	Waiman Long	1	-1/+10
	[ Upstream commit 2d028f3e4bbbfd448928a8d3d2814b0b04c214f4 ] The test_memcg_sock test in memcontrol.c sets up an IPv6 socket and sends data over it to consume memory and verify that the memory.stat.sock and memory.current values are close. On systems where IPv6 isn't enabled or isn't configured to support SOCK_STREAM, the test_memcg_sock test always fails. When the socket() call fails, there is no way to test the memory consumption and verify the above claim. I believe it is better to just skip the test in this case instead of reporting a test failure hinting that there may be something wrong with the memcg code. Link: https://lkml.kernel.org/r/20260311200526.885899-1-longman@redhat.com Fixes: 5f8f019380b8 ("selftests: cgroup/memcontrol: add basic test for socket accounting") Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Michal Koutný <mkoutny@suse.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shuah Khan <shuah@kernel.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
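The skip-instead-of-fail idea can be sketched as below. This is only an illustration under stated assumptions: the enum, both function names, and the choice of `EAFNOSUPPORT` as the sole "skip" errno are hypothetical; the commit message only says the test is skipped when the address family is not supported.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical outcome codes for this sketch. */
enum probe_result { PROBE_OK, PROBE_SKIP, PROBE_FAIL };

/*
 * Map a socket() errno to a test outcome. Treating only EAFNOSUPPORT
 * as "skip" is an assumption made for illustration.
 */
static enum probe_result classify_socket_errno(int err)
{
	return err == EAFNOSUPPORT ? PROBE_SKIP : PROBE_FAIL;
}

/* Try to open a socket and report whether the test can proceed. */
static enum probe_result probe_socket(int family, int type)
{
	int fd = socket(family, type, 0);

	if (fd >= 0) {
		close(fd);
		return PROBE_OK;
	}
	return classify_socket_errno(errno);
}
```

A test would call `probe_socket(AF_INET6, SOCK_STREAM)` first and report a skip rather than a failure when the result is `PROBE_SKIP`, so a missing IPv6 stack is not mistaken for a memcg accounting bug.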
11 days	kho: make debugfs interface optional	Pasha Tatashin	1	-0/+1
	[ Upstream commit 03d3963464a43654703938a66503cd686c5fc54e ] Patch series "liveupdate: Rework KHO for in-kernel users", v9. This series refactors the KHO framework to better support in-kernel users like the upcoming LUO. The current design, which relies on a notifier chain and debugfs for control, is too restrictive for direct programmatic use. The core of this rework is the removal of the notifier chain in favor of a direct registration API. This decouples clients from the shutdown-time finalization sequence, allowing them to manage their preserved state more flexibly and at any time. In support of this new model, this series also: - Makes the debugfs interface optional. - Introduces APIs to unpreserve memory and fixes a bug in the abort path where client state was being incorrectly discarded. Note that this is an interim step, as a more comprehensive fix is planned as part of the stateless KHO work [1]. - Moves all KHO code into a new kernel/liveupdate/ directory to consolidate live update components. This patch (of 9): Currently, KHO is controlled via debugfs interface, but once LUO is introduced, it can control KHO, and the debug interface becomes optional. Add a separate config CONFIG_KEXEC_HANDOVER_DEBUGFS that enables the debugfs interface, and allows to inspect the tree. Move all debugfs related code to a new file to keep the .c files clear of ifdefs. 
Link: https://lkml.kernel.org/r/20251101142325.1326536-1-pasha.tatashin@soleen.com Link: https://lkml.kernel.org/r/20251101142325.1326536-2-pasha.tatashin@soleen.com Link: https://lore.kernel.org/all/20251020100306.2709352-1-jasonmiu@google.com [1] Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Changyuan Lyu <changyuanl@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Simon Horman <horms@kernel.org> Cc: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: 019fc3687237 ("kho: fix KASAN support for restored vmalloc regions") Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	selftests/mm: skip migration tests if NUMA is unavailable	AnishMulay	1	-1/+2
	[ Upstream commit 54218f10dfbe88c8e41c744fd45a756cde60b8c4 ] Currently, the migration test asserts that numa_available() returns 0. On systems where NUMA is not available (returning -1), such as certain ARM64 configurations or single-node systems, this assertion fails and crashes the test. Update the test to check the return value of numa_available(). If it is less than 0, skip the test gracefully instead of failing. This aligns the behavior with other MM selftests (like rmap) that skip when NUMA support is missing. Link: https://lkml.kernel.org/r/20260218163941.13499-1-anishm7030@gmail.com Fixes: 0c2d08728470 ("mm: add selftests for migration entries") Signed-off-by: AnishMulay <anishm7030@gmail.com> Reviewed-by: SeongJae Park <sj@kernel.org> Reviewed-by: Dev Jain <dev.jain@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Tested-by: Sayali Patil <sayalip@linux.ibm.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	selftests/sched_ext: Add missing error check for exit__load()	David Carlier	1	-1/+1
	[ Upstream commit 1d02346fec8d13b05e54296ddc6ae29b7e1067df ] exit__load(skel) was called without checking its return value. Every other test in the suite wraps the load call with SCX_FAIL_IF(). Add the missing check to be consistent with the rest of the test suite. Fixes: a5db7817af78 ("sched_ext: Add selftests") Signed-off-by: David Carlier <devnexen@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	selftests/futex: Fix incorrect result reporting of futex_requeue test item	Yuwen Chen	1	-41/+8
	[ Upstream commit d317e2ef9dcf673c9f37cda784284af7c6812757 ] When using the TEST_HARNESS_MAIN macro definition to declare the main function, it is required to use the EXPECT() and ASSERT() macros in conjunction and not ksft_test_result_(). Otherwise, even if a test item fails, the test will still return a success result because ksft_test_result_() does not affect the test harness state. Convert the code to use EXPECT/ASSERT() variants, which ensures that the overall test result is fail if one of the EXPECT()s fails. [ tglx: Massaged change log to explain _why_ ksft_test_result*() is the wrong choice ] Fixes: f341a20f6d7e ("selftests/futex: Refactor futex_requeue with kselftest_harness.h") Signed-off-by: Yuwen Chen <ywen.chen@foxmail.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/tencent_51851B741CC4B5EC9C22AFF70BA82BB60805@qq.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	selftests/bpf: Fix reg_bounds to match new tnum-based refinement	Paul Chaignon	1	-0/+35
	[ Upstream commit 2fefa9c81a25534464911447d51ddb44b04a8e5b ] Commit efc11a667878 ("bpf: Improve bounds when tnum has a single possible value") improved the bounds refinement to detect when the tnum and u64 range overlap in a single value (and the bounds can thus be set to that value). Eduard then noticed that it broke the slow-mode reg_bounds selftests because they don't have an equivalent logic and are therefore unable to refine the bounds as much as the verifier. The following test case illustrates this. ACTUAL TRUE1: scalar(u64=0xffffffff00000000,u32=0,s64=0xffffffff00000000,s32=0) EXPECTED TRUE1: scalar(u64=[0xfffffffe00000001; 0xffffffff00000000],u32=0,s64=[0xfffffffe00000001; 0xffffffff00000000],s32=0) [...] #323/1007 reg_bounds_gen_consts_s64_s32/(s64)[0xfffffffe00000001; 0xffffffff00000000] (s32)<op> S64_MIN:FAIL with the verifier logs: [...] 19: w0 = w6 ; R0=scalar(smin=0,smax=umax=0xffffffff, var_off=(0x0; 0xffffffff)) R6=scalar(smin=0xfffffffe00000001,smax=0xffffffff00000000, umin=0xfffffffe00000001,umax=0xffffffff00000000, var_off=(0xfffffffe00000000; 0x1ffffffff)) 20: w0 = w7 ; R0=0 R7=0x8000000000000000 21: if w6 == w7 goto pc+3 [...] from 21 to 25: [...] 25: w0 = w6 ; R0=0 R6=0xffffffff00000000 ; ^ ; unexpected refined value 26: w0 = w7 ; R0=0 R7=0x8000000000000000 27: exit When w6 == w7 is true, the verifier can deduce that the R6's tnum is equal to (0xfffffffe00000000; 0x100000000) and then use that information to refine the bounds: the tnum only overlap with the u64 range in 0xffffffff00000000. The reg_bounds selftest doesn't know about tnums and therefore fails to perform the same refinement. This issue happens when the tnum carries information that cannot be represented in the ranges, as otherwise the selftest could reach the same refined value using just the ranges. The tnum thus needs to represent non-contiguous values (ex., R6's tnum above, after the condition). 
The only way this can happen in the reg_bounds selftest is at the boundary between the 32 and 64bit ranges. We therefore only need to handle that case. This patch fixes the selftest refinement logic by checking if the u32 and u64 ranges overlap in a single value. If so, the ranges can be set to that value. We need to handle two cases: either they overlap in umin64... u64 values matching u32 range: xxx xxx xxx xxx |--------------------------------------| u64 range: 0 xxxxx UMAX64 or in umax64: u64 values matching u32 range: xxx xxx xxx xxx |--------------------------------------| u64 range: 0 xxxxx UMAX64 To detect the first case, we decrease umax64 to the maximum value that matches the u32 range. If that happens to be umin64, then umin64 is the only overlap. We proceed similarly for the second case, increasing umin64 to the minimum value that matches the u32 range. Note this is similar to how the verifier handles the general case using tnum, but we don't need to care about a single-value overlap in the middle of the range. That case is not possible when comparing two ranges. This patch also adds two test cases reproducing this bug as part of the normal test runs (without SLOW_TESTS=1). Fixes: efc11a667878 ("bpf: Improve bounds when tnum has a single possible value") Reported-by: Eduard Zingerman <eddyz87@gmail.com> Closes: https://lore.kernel.org/bpf/4e6dd64a162b3cab3635706ae6abfdd0be4db5db.camel@gmail.com/ Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/ada9UuSQi2SE2IfB@mail.gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
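The two-case check described above can be sketched as follows. This is not the selftest's code: the struct and function names are hypothetical, and the sketch assumes a non-wrapping u32 range (min <= max). It lowers umax64 to the largest value whose low 32 bits fall in the u32 range, raises umin64 to the smallest such value, and collapses the range when either result lands on the opposite bound.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical range types for this sketch. */
struct range64 { uint64_t min, max; };
struct range32 { uint32_t min, max; };	/* assumes min <= max */

/* Largest v <= x whose low 32 bits lie in r; false if none exists. */
static bool max_matching(uint64_t x, struct range32 r, uint64_t *out)
{
	uint64_t hi = x >> 32;
	uint32_t lo = (uint32_t)x;

	if (lo >= r.min && lo <= r.max)
		*out = x;
	else if (lo > r.max)
		*out = (hi << 32) | r.max;
	else if (hi == 0)
		return false;			/* nothing below x matches */
	else
		*out = ((hi - 1) << 32) | r.max;	/* previous 2^32 block */
	return true;
}

/* Smallest v >= x whose low 32 bits lie in r; false if none exists. */
static bool min_matching(uint64_t x, struct range32 r, uint64_t *out)
{
	uint64_t hi = x >> 32;
	uint32_t lo = (uint32_t)x;

	if (lo >= r.min && lo <= r.max)
		*out = x;
	else if (lo < r.min)
		*out = (hi << 32) | r.min;
	else if (hi == UINT32_MAX)
		return false;			/* nothing above x matches */
	else
		*out = ((hi + 1) << 32) | r.min;	/* next 2^32 block */
	return true;
}

/* Collapse r64 to a single value if it overlaps r32 only at one bound. */
static bool refine_single_overlap(struct range64 *r64, struct range32 r32)
{
	uint64_t v;

	if (max_matching(r64->max, r32, &v) && v == r64->min) {
		r64->max = r64->min;		/* only overlap is umin64 */
		return true;
	}
	if (min_matching(r64->min, r32, &v) && v == r64->max) {
		r64->min = r64->max;		/* only overlap is umax64 */
		return true;
	}
	return false;
}
```

With the failing case quoted in the commit, u64=[0xfffffffe00000001; 0xffffffff00000000] and u32=0, raising umin64 to the next value with zero low bits gives 0xffffffff00000000, which equals umax64, so the range collapses to that single value.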
11 days	selftests: netfilter: nft_tproxy.sh: adjust to socat changes	Florian Westphal	1	-7/+7
	[ Upstream commit 61119542663cac70898aef532eb57ee41ea9b477 ] Like e65d8b6f3092 ("selftests: drv-net: adjust to socat changes") we need to add shut-none for this test too. The extra 0-packet can trigger a second (unexpected) reply from the server. Fixes: 7e37e0eacd22 ("selftests: netfilter: nft_tproxy.sh: add tcp tests") Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20260408152432.24b8ad0d@kernel.org/ Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://patch.msgid.link/20260409224506.27072-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	selftests/bpf: fix __jited_unpriv tag name	Eduard Zingerman	1	-1/+1
	[ Upstream commit cdd54fe98c00549264a92613af6bb0e9a5fd0d1c ] __jited_unpriv was using "test_jited=" as its tag name, same as the priv variant __jited. Fix by using "test_jited_unpriv=". Fixes: 7d743e4c759c ("selftests/bpf: __jited test tag to check disassembly after jit") Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev> Reviewed-by: Puranjay Mohan <puranjay@kernel.org> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260410-selftests-global-tags-ordering-v2-1-c566ec9781bf@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	bpf: Relax scalar id equivalence for state pruning	Puranjay Mohan	1	-3/+5
	[ Upstream commit b0388bafa4949bd30af7b3be5ee415f2a25ac014 ] Scalar register IDs are used by the verifier to track relationships between registers and enable bounds propagation across those relationships. Once an ID becomes singular (i.e. only a single register/stack slot carries it), it can no longer contribute to bounds propagation and effectively becomes stale. The previous commit makes the verifier clear such ids before caching the state. When comparing the current and cached states for pruning, these stale IDs can cause technically equivalent states to be considered different and thus prevent pruning. For example, in the selftest added in the next commit, two registers - r6 and r7 are not linked to any other registers and get cached with id=0, in the current state, they are both linked to each other with id=A. Before this commit, check_scalar_ids would give temporary ids to r6 and r7 (say tid1 and tid2) and then check_ids() would map tid1->A, and when it would see tid2->A, it would not consider these state equivalent. Relax scalar ID equivalence by treating rold->id == 0 as "independent": if the old state did not rely on any ID relationships for a register, then any ID/linking present in the current state only adds constraints and is always safe to accept for pruning. Implement this by returning true immediately in check_scalar_ids() when old_id == 0. Maintain correctness for the opposite direction (old_id != 0 && cur_id == 0) by still allocating a temporary ID for cur_id == 0. This avoids incorrectly allowing multiple independent current registers (id==0) to satisfy a single linked old ID during mapping. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260203165102.2302462-5-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Stable-dep-of: 2f2ec8e7730e ("bpf: Enforce regsafe base id consistency for BPF_ADD_CONST scalars") Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	bpf: Support negative offsets, BPF_SUB, and alu32 for linked register tracking	Puranjay Mohan	1	-1/+1
	[ Upstream commit 7a433e519364c3c19643e5c857f4fbfaebec441c ] Previously, the verifier only tracked positive constant deltas between linked registers using BPF_ADD. This limitation meant patterns like: r1 = r0; r1 += -4; if r1 s>= 0 goto l0_%=; // r1 >= 0 implies r0 >= 4 // verifier couldn't propagate bounds back to r0 if r0 != 0 goto l0_%=; r0 /= 0; // Verifier thinks this is reachable l0_%=: Similar limitation exists for 32-bit registers. With this change, the verifier can now track negative deltas in reg->off enabling bound propagation for the above pattern. For alu32, we make sure the destination register has the upper 32 bits as 0s before creating the link. BPF_ADD_CONST is split into BPF_ADD_CONST64 and BPF_ADD_CONST32, the latter is used in case of alu32 and sync_linked_regs uses this to zext the result if known_reg has this flag. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260204151741.2678118-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Stable-dep-of: d7f14173c0d5 ("bpf: Fix linked reg delta tracking when src_reg == dst_reg") Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	bpf: reject negative CO-RE accessor indices in bpf_core_parse_spec()	Weiming Shi	1	-0/+2
	[ Upstream commit 1c22483a2c4bbf747787f328392ca3e68619c4dc ] CO-RE accessor strings are colon-separated indices that describe a path from a root BTF type to a target field, e.g. "0:1:2" walks through nested struct members. bpf_core_parse_spec() parses each component with sscanf("%d"), so negative values like -1 are silently accepted. The subsequent bounds checks (access_idx >= btf_vlen(t)) only guard the upper bound and always pass for negative values because C integer promotion converts the __u16 btf_vlen result to int, making the comparison (int)(-1) >= (int)(N) false for any positive N. When -1 reaches btf_member_bit_offset() it gets cast to u32 0xffffffff, producing an out-of-bounds read far past the members array. A crafted BPF program with a negative CO-RE accessor on any struct that exists in vmlinux BTF (e.g. task_struct) crashes the kernel deterministically during BPF_PROG_LOAD on any system with CONFIG_DEBUG_INFO_BTF=y (default on major distributions). The bug is reachable with CAP_BPF: BUG: unable to handle page fault for address: ffffed11818b6626 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page Oops: Oops: 0000 [#1] SMP KASAN NOPTI CPU: 0 UID: 0 PID: 85 Comm: poc Not tainted 7.0.0-rc6 #18 PREEMPT(full) RIP: 0010:bpf_core_parse_spec (tools/lib/bpf/relo_core.c:354) RAX: 00000000ffffffff Call Trace: <TASK> bpf_core_calc_relo_insn (tools/lib/bpf/relo_core.c:1321) bpf_core_apply (kernel/bpf/btf.c:9507) check_core_relo (kernel/bpf/verifier.c:19475) bpf_check (kernel/bpf/verifier.c:26031) bpf_prog_load (kernel/bpf/syscall.c:3089) __sys_bpf (kernel/bpf/syscall.c:6228) </TASK> CO-RE accessor indices are inherently non-negative (struct member index, array element index, or enumerator index), so reject them immediately after parsing. 
Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/20260404161221.961828-2-bestswngs@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
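The parsing problem and the fix can be illustrated with a small standalone parser. This is a simplified sketch, not the libbpf function: `parse_accessor_sketch`, its parameters, and its return convention are hypothetical. It parses each colon-separated component with sscanf("%d") as the commit describes and rejects negative indices right after parsing.

```c
#include <stdio.h>

/*
 * Hypothetical, simplified accessor-string parser.
 * Parses "0:1:2" into idx[] and returns the number of components,
 * or -1 on malformed input, negative indices, or too many components.
 */
static int parse_accessor_sketch(const char *spec, int *idx, int max_cnt)
{
	int n = 0;

	while (*spec) {
		int access_idx, parsed_len;

		if (*spec == ':')
			spec++;
		/* "%d" happily accepts "-1"; "%n" records consumed chars. */
		if (sscanf(spec, "%d%n", &access_idx, &parsed_len) != 1)
			return -1;
		/*
		 * Reject negatives here: a later check of the form
		 * access_idx >= vlen compares as int after promotion of
		 * a 16-bit vlen, so -1 would pass an upper-bound check.
		 */
		if (access_idx < 0)
			return -1;
		if (n == max_cnt)
			return -1;
		idx[n++] = access_idx;
		spec += parsed_len;
	}
	return n;
}
```

Rejecting the value at parse time means no later consumer ever sees a negative index, regardless of how its own bounds check is written.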
11 days	selftests/powerpc: Suppress -Wmaybe-uninitialized with GCC 15	Amit Machhiwal	1	-1/+1
	[ Upstream commit 6e65886fceb23605eff952d6b1975737b4c4b154 ] GCC 15 reports the below false positive '-Wmaybe-uninitialized' warning in vphn_unpack_associativity() when building the powerpc selftests. # make -C tools/testing/selftests TARGETS="powerpc" [...] CC test-vphn In file included from test-vphn.c:3: In function ‘vphn_unpack_associativity’, inlined from ‘test_one’ at test-vphn.c:371:2, inlined from ‘test_vphn’ at test-vphn.c:399:9: test-vphn.c:10:33: error: ‘be_packed’ may be used uninitialized [-Werror=maybe-uninitialized] 10 | #define be16_to_cpup(x) bswap_16(x) | ^~~~~~~~ vphn.c:42:27: note: in expansion of macro ‘be16_to_cpup’ 42 | u16 new = be16_to_cpup(field++); | ^~~~~~~~~~~~ In file included from test-vphn.c:19: vphn.c: In function ‘test_vphn’: vphn.c:27:16: note: ‘be_packed’ declared here 27 | __be64 be_packed[VPHN_REGISTER_COUNT]; | ^~~~~~~~~ cc1: all warnings being treated as errors When vphn_unpack_associativity() is called from hcall_vphn() in the kernel, the error is not seen while building vphn.c during kernel compilation. This is because the top-level Makefile always includes the '-fno-strict-aliasing' flag. The issue here is that GCC 15 emits '-Wmaybe-uninitialized' due to type punning between __be64[] and __be16 when accessing the buffer via be16_to_cpup(). The underlying object is fully initialized, but GCC 15 fails to track the aliasing due to the strict aliasing violation here. See [1] and [2]. This results in a false positive warning which is promoted to an error under '-Werror'. This problem is not seen when the compilation is performed with GCC 13 and 14. An issue [1] has also been created on GCC bugzilla. The selftest compiles fine with '-fno-strict-aliasing'. Since this GCC flag is used to compile vphn.c in the kernel too, the same flag should be used to build vphn tests when compiling vphn.c in the selftest as well. Fix this by including '-fno-strict-aliasing' during vphn.c compilation in the selftest. 
This keeps the build working while limiting the scope of the suppression to building vphn tests. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124427 [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99768 Fixes: 58dae82843f5 ("selftests/powerpc: Add test for VPHN") Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Amit Machhiwal <amachhiw@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20260313165426.43259-1-amachhiw@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	tools/nolibc/printf: Move snprintf length check to callback	David Laight	1	-27/+67
	[ Upstream commit 4045e7b19bbf7338452cda11e64cfe7ae3361964 ] Move output truncation to the snprintf() callback. This simplifies the main code and fixes truncation of padded fields. Add a zero length callback to 'finalise' the buffer rather than doing it in snprintf() itself. Fixes: e90ce42e81381 ("tools/nolibc: implement width padding in printf()") Signed-off-by: David Laight <david.laight.linux@gmail.com> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260302101815.3043-3-david.laight.linux@gmail.com [Thomas: clean up Fixes trailer] Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
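The idea of doing truncation inside the output callback, with a zero-length call to finalise the buffer, can be sketched as below. This is not the nolibc code: the state struct, the callback name, and its signature are assumptions made for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical snprintf() state: destination, its size, bytes requested. */
struct sn_state {
	char *buf;
	size_t size;
	size_t total;	/* length the full output would have had */
};

/*
 * Hypothetical output callback. Copies what fits, always counts the
 * full length, and writes the terminating NUL on a zero-length call.
 */
static int sn_cb(void *arg, const char *data, size_t len)
{
	struct sn_state *s = arg;
	size_t room = s->size > s->total ? s->size - s->total : 0;

	if (len == 0) {
		/* Finalise: terminate at the end of data or of the buffer. */
		if (s->size)
			s->buf[s->total < s->size ? s->total : s->size - 1] = '\0';
		return 0;
	}

	if (room > 1) {
		/* Keep one byte in reserve for the NUL terminator. */
		size_t n = len < room - 1 ? len : room - 1;

		memcpy(s->buf + s->total, data, n);
	}
	s->total += len;
	return 0;
}
```

Because every chunk, including padding, goes through the same callback, the formatting loop never needs its own length check; it only emits data and lets the callback decide how much of it is stored.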
11 days	tools/nolibc/printf: Change variables 'c' to 'ch' and 'tmpbuf[]' to 'outbuf[]'	David Laight	1	-19/+19
	[ Upstream commit f675ae28fcdf7db93a8c1a6964f062725b1e06a0 ] Changing 'c' makes the code slightly easier to read because the variable stands out from the single character literals (especially 'c'). Change tmpbuf[] to outbuf[] because 'out' points into it. The following patches pretty much rewrite the function so the churn is limited. Signed-off-by: David Laight <david.laight.linux@gmail.com> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260223101735.2922-7-david.laight.linux@gmail.com Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Stable-dep-of: 4045e7b19bbf ("tools/nolibc/printf: Move snprintf length check to callback") Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	tools/nolibc: implement %m if errno is not defined	Benjamin Berg	1	-2/+4
	[ Upstream commit fbd1b7f6b322a63b21ebbf00c732a17bb8bdb5d4 ] For improved compatibility, print %m as "unknown error" when nolibc is compiled using NOLIBC_IGNORE_ERRNO. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Stable-dep-of: 4045e7b19bbf ("tools/nolibc/printf: Move snprintf length check to callback") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-05-14	x86/CPU/AMD: Prevent improper isolation of shared resources in Zen2's op cache	Prathyushi Nangia	1	-1/+2
	commit c21b90f77687075115d989e53a8ec5e2bb427ab1 upstream. Make sure resources are not improperly shared in the op cache and cause instruction corruption this way. Signed-off-by: Prathyushi Nangia <prathyushi.nangia@amd.com> Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-05-14	selftests: mptcp: pm: restrict 'unknown' check to pm_nl_ctl	Matthieu Baerts (NGI0)	1	-3/+7
	commit 53705ddfa18408f8e1f064331b6387509fa19f7f upstream. When pm_netlink.sh is executed with '-i', 'ip mptcp' is used instead of 'pm_nl_ctl'. IPRoute2 doesn't support the 'unknown' flag, which has only been added to 'pm_nl_ctl' for this specific check: to ensure that the kernel ignores such unsupported flags. There is no reason to add this flag to 'ip mptcp', so this check should be skipped when 'ip mptcp' is used. Fixes: 0cef6fcac24d ("selftests: mptcp: ip_mptcp option for more scripts") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-11-fca8091060a4@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-05-14	selftests: mptcp: check output: catch cmd errors	Matthieu Baerts (NGI0)	2	-10/+16
	commit 65db7b27b90e2ea8d4966935aa9a50b6a60c31ac upstream. Using '${?}' inside the if-statement to check the returned value from the command that was evaluated as part of the if-statement is not correct: here, '${?}' will be linked to the previous instruction, not the one that is expected here (${cmd}). Instead, simply mark the error, except if an error is expected. If that's the case, 1 can be passed as the 4th argument of this helper. Three checks from pm_netlink.sh expect an error. While at it, improve the error message when the command unexpectedly fails or succeeds. Note that we could expect a specific returned value, but the checks currently expecting an error can be used with 'ip mptcp' or 'pm_nl_ctl', and these two tools don't return the same error code. Fixes: 2d0c1d27ea4e ("selftests: mptcp: add mptcp_lib_check_output helper") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-10-fca8091060a4@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>