Age | Commit message | Author | Files | Lines
2026-04-02 | HSI: omap_ssi_port: remove set but unused variables | Rosen Penev | 1 | -5/+2
W=1 build warns that these are set and unused. eg: error: variable ‘mode’ set but not used [-Werror=unused-but-set-variable] Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://patch.msgid.link/20260401215618.11251-1-rosenp@gmail.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2026-04-02 | Merge branch 'libbpf-clarify-raw-address-single-kprobe-attach-behavior' | Andrii Nakryiko | 3 | -9/+116
Hoyeon Lee says:

====================
libbpf: clarify raw-address single kprobe attach behavior

Today libbpf documents single-kprobe attach through func_name, with an optional offset. For the PMU-based path, func_name = NULL with an absolute address in offset already works as well, but that is not described in the API.

This patchset clarifies this behavior. First commit fixes kprobe and uprobe attach error handling to use direct error codes. Next adds kprobe API comments for the raw-address form and rejects it explicitly for legacy tracefs/debugfs kprobes. Last adds PERF and LINK selftests for the raw-address form, and checks that LEGACY rejects it.

---
Changes in v7:
- Change selftest line wrapping and assertions

Changes in v6:
- Split the kprobe/uprobe direct error-code fix into a separate patch

Changes in v5:
- Add kprobe API docs, use -EOPNOTSUPP, and switch selftests to LIBBPF_OPTS

Changes in v4:
- Inline raw-address error formatting and remove the probe_target buffer

Changes in v3:
- Drop bpf_kprobe_opts.addr and reuse offset when func_name is NULL
- Make legacy tracefs/debugfs kprobes reject the raw-address form
- Update selftests to cover PERF/LINK raw-address attach and LEGACY reject

Changes in v2:
- Fix line wrapping and indentation
====================

Link: https://patch.msgid.link/20260401143116.185049-1-hoyeon.lee@suse.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2026-04-02 | selftests/bpf: Add test for raw-address single kprobe attach | Hoyeon Lee | 1 | -0/+80
Currently, attach_probe covers manual single-kprobe attaches by func_name, but not the raw-address form that the PMU-based single-kprobe path can accept. This commit adds PERF and LINK raw-address coverage. It resolves SYS_NANOSLEEP_KPROBE_NAME through kallsyms, passes the absolute address in bpf_kprobe_opts.offset with func_name = NULL, and verifies that kprobe and kretprobe are still triggered. It also verifies that LEGACY rejects the same form. Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20260401143116.185049-4-hoyeon.lee@suse.com
2026-04-02 | libbpf: Clarify raw-address single kprobe attach behavior | Hoyeon Lee | 2 | -7/+34
bpf_program__attach_kprobe_opts() documents single-kprobe attach through func_name, with an optional offset. For the PMU-based path, func_name = NULL with an absolute address in offset already works as well, but that is not described in the API. This commit clarifies this existing non-legacy behavior. For PMU-based attach, callers can use func_name = NULL with an absolute address in offset as the raw-address form. For legacy tracefs/debugfs kprobes, reject this form explicitly. Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20260401143116.185049-3-hoyeon.lee@suse.com
2026-04-02 | libbpf: Use direct error codes for kprobe/uprobe attach | Hoyeon Lee | 1 | -2/+2
perf_event_open_probe() and perf_event_{k,u}probe_open_legacy() helpers are returning negative error codes directly on failure. This commit changes bpf_program__attach_{k,u}probe_opts() to use those return values directly instead of re-reading possibly changed errno. Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20260401143116.185049-2-hoyeon.lee@suse.com
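The difference between re-reading errno and propagating a helper's return value can be shown with a small userspace sketch. It is not libbpf code: `open_probe()`, `log_failure()`, and both `attach_*` functions are hypothetical names, and the clobbering of errno is simulated.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: fails and reports the reason as a negative errno
 * value in its return code, as the helpers above are described to do. */
int open_probe(void)
{
	errno = ENOENT;
	return -errno;
}

/* Hypothetical logging/cleanup step that overwrites errno as a side effect. */
void log_failure(void)
{
	errno = EINVAL;
}

/* Fragile: re-reads errno after other code has had a chance to change it. */
int attach_via_errno(void)
{
	int fd = open_probe();

	if (fd < 0) {
		log_failure();
		return -errno;	/* may no longer be the original error */
	}
	return fd;
}

/* Robust: propagates the helper's own return value. */
int attach_via_retval(void)
{
	int fd = open_probe();

	if (fd < 0)
		log_failure();
	return fd;
}
```

In this sketch the errno-based variant reports the error left behind by the logging step, while the return-value variant reports the original failure.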
2026-04-02 | accel: ethosu: Add hardware dependency hint | Jean Delvare | 1 | -0/+1
The Ethos-U NPU is only available on ARM systems, so add a hardware dependency hint to prevent this driver from being needlessly included in kernels built for other architectures. Signed-off-by: Jean Delvare <jdelvare@suse.de> Link: https://patch.msgid.link/20260401122323.6127a77c@endymion Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2026-04-02 | libbpf: Fix BTF handling in bpf_program__clone() | Mykyta Yatsenko | 2 | -17/+44
Align bpf_program__clone() with bpf_object_load_prog() by gating BTF func/line info on FEAT_BTF_FUNC kernel support, and resolve caller-provided prog_btf_fd before checking obj->btf so that callers with their own BTF can use clone() even when the object has no BTF loaded. While at it, treat func_info and line_info fields as atomic groups to prevent mismatches between pointer and count from different sources. Move bpf_program__clone() to libbpf 1.8. Fixes: 970bd2dced35 ("libbpf: Introduce bpf_program__clone()") Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20260401151640.356419-1-mykyta.yatsenko5@gmail.com
2026-04-02 | perf test: Skip perf data type profiling tests for s390 | Thomas Richter | 1 | -0/+4
Test case 'perf data type profiling tests' fails on s390 with this error: # ./perf mem record -- ./perf test -w code_with_type failed: no PMU supports the memory events # echo $? 255 # because s390 does not support memory events at all. According to the man page, perf annotate --code-with-type works only with memory instructions. As command 'perf mem record ...' is not supported on s390, skip this test for s390. Output before: # ./perf test 'perf data type profiling tests' 77: perf data type profiling tests : FAILED! Output after: # ./perf test 'perf data type profiling tests' 77: perf data type profiling tests : Skip Fixes: f60a5c22967b8 ("perf tests: Test annotate with data type profiling and rust") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Suggested-by: Namhyung Kim <namhyung@kernel.org> Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2026-04-02 | perf tools: prevent null dsos from being added | Anubhav Shelat | 1 | -0/+3
When sorting the dso array we sometimes get a crash due to null comparisons in comparator functions. So prevent __dsos__add from adding null to the dso array to avoid out-of-memory related errors. Signed-off-by: Anubhav Shelat <ashelat@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2026-04-02 | perf test: Fix ratio_to_prev event parsing test | Thomas Falcon | 1 | -21/+28
test__ratio_to_prev() assumed the first event in a group is the leader, which is not the case when the event is expanded into two event groups on hybrid PMUs with auto counter reload support. Instead, iterate over the event group generated for each core PMU. Also update the "wrong leader" test to check that the subordinate event has the correct leader instead of checking that it is not the group leader. Finally, do not exit immediately if a PMU without auto counter reload support is found. Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Fixes: 56be0fe5f62c ("perf record: Add auto counter reload parse and regression tests") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2026-04-02 | perf tools: Fix module symbol resolution for non-zero .text sh_addr | Chuck Lever | 1 | -2/+6
When perf resolves symbols from kernel module ELF files (ET_REL), it converts symbol addresses to file offsets so that sample IPs can be matched to the correct symbol. The conversion adjusts each symbol's st_value: sym->st_value -= shdr->sh_addr - shdr->sh_offset; For vmlinux (ET_EXEC), st_value is a virtual address and sh_addr is the section's virtual base, so subtracting sh_addr and adding sh_offset correctly yields a file offset. For kernel modules (ET_REL), st_value is a section-relative offset. The module loader ignores sh_addr entirely and places symbols at module_base + st_value. Converting to file offset requires only adding sh_offset; subtracting sh_addr introduces an error equal to sh_addr bytes. When .text has sh_addr == 0 -- the historical norm for simple modules -- both formulas produce the same result and the bug is latent. As modules gain more metadata sections before .text (.note, .static_call.text, etc.), the linker assigns .text a non-zero sh_addr, exposing the defect. For example, nfsd.ko on this kernel has sh_addr=0xa80, kvm-intel.ko has sh_addr=0x1e90. The effect is that all .text symbols in affected modules shift by sh_addr bytes relative to sample IPs, causing perf report to attribute samples to incorrect, nearby symbols. This was observed as 13% of LLC-load-miss samples misattributed to nfsd_file_get_dio_attrs when the actual hot function was nfsd_cache_lookup, approximately 0xa80 bytes away in the symbol table. Use the existing dso__rel() flag (already set for ET_REL modules) to select the correct adjustment: add sh_offset for ET_REL, subtract (sh_addr - sh_offset) for ET_EXEC/ET_DYN. Fixes: 0131c4ec794a ("perf tools: Make it possible to read object code from kernel modules") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2026-04-02 | perf trace: Skip unnecessary synthesis for summary-only mode | Namhyung Kim | 1 | -1/+5
It needs to synthesize task info for the comm name. The mmap information is only needed for callchain symbolization, which is not used by the summary mode. Also, the total and cgroup summary modes don't require the task info. Let's skip the processing if possible. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2026-04-02 | perf stat: Fix crash on arm64 | Breno Leitao | 1 | -9/+17
Perf stat is crashing on arm64 hosts with the following issue: # make -C tools/perf DEBUG=1 # perf stat sleep 1 perf: util/evsel.c:2034: get_group_fd: Assertion `!(!leader->core.fd)' failed. [1] 1220794 IOT instruction (core dumped) ./perf stat The sorting function introduced by commit a745c0831c15c ("perf stat: Sort default events/metrics") compares events based on their individual properties. This can cause events from different groups to be interleaved, resulting in group members appearing before their leaders in the sorted evlist. When the iterator opens events in list order, a group member may be processed before its leader has been opened. For example, CPU_CYCLES (idx=32) with leader STALL_SLOT_BACKEND (idx=37) could be sorted before its leader, causing the crash when CPU_CYCLES tries to get its group fd from the not-yet-opened leader. Fix this by comparing events based on their leader's attributes instead of their own attributes when the events are in different groups. This ensures all members of a group share the same sort key as their leader, keeping groups together and guaranteeing leaders are opened before their members. Fixes: a745c0831c15c ("perf stat: Sort default events/metrics") Reported-by: Denis Yaroshevskiy <dyaroshev@meta.com> Tested-by: Dmitry Ilvokhin <d@ilvokhin.com> Tested-by: Ian Rogers <irogers@google.com> Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
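The idea of sorting through the leader can be sketched with qsort() in userspace. This is not perf's comparator: `struct event` is a hypothetical simplification of an evsel, `sort_key` stands in for whatever attributes the real sort uses, and placing the leader first inside its group is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified event; not perf's struct evsel. */
struct event {
	int sort_key;			/* attribute the sort orders by */
	int idx;			/* event index */
	const struct event *leader;	/* group leader; itself for a leader */
};

/* Compare through the leader so all members share their leader's key. */
int cmp_by_leader(const void *a, const void *b)
{
	const struct event *ea = *(const struct event *const *)a;
	const struct event *eb = *(const struct event *const *)b;

	assert(ea->leader && eb->leader);
	if (ea->leader != eb->leader) {
		/* Different groups: order by the leaders' attributes. */
		if (ea->leader->sort_key != eb->leader->sort_key)
			return ea->leader->sort_key < eb->leader->sort_key ? -1 : 1;
		return ea->leader->idx < eb->leader->idx ? -1 : 1;
	}
	/* Same group: the leader goes first, members follow by index. */
	if (ea == ea->leader)
		return ea == eb ? 0 : -1;
	if (eb == eb->leader)
		return 1;
	return (ea->idx > eb->idx) - (ea->idx < eb->idx);
}
```

Sorting an array of `const struct event *` with this comparator keeps a member such as the idx=32 event behind its idx=37 leader even when the member's own key would sort earlier.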
2026-04-02 | arm64: mm: Remove pmd_sect() and pud_sect() | Ryan Roberts | 2 | -16/+21
The semantics of pXd_leaf() are very similar to pXd_sect(). The only difference is that pXd_sect() only considers it a section if PTE_VALID is set, whereas pXd_leaf() permits both "valid" and "present-invalid" types. Using pXd_sect() has caused issues now that large leaf entries can be present-invalid since commit a166563e7ec37 ("arm64: mm: support large block mapping when rodata=full"), so let's just remove the API and standardize on pXd_leaf(). There are a few callsites of the form pXd_leaf(READ_ONCE(*pXdp)). This was previously fine for the pXd_sect() macro because it only evaluated its argument once. But pXd_leaf() evaluates its argument multiple times. So let's avoid unintended side effects by reimplementing pXd_leaf() as an inline function. Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
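The multiple-evaluation hazard mentioned above is a general C property and can be reproduced outside the kernel. In the sketch below the bit values, `LEAF_MACRO`, `leaf_inline()`, and `read_entry()` are all hypothetical; they are not arm64's page-table encoding, and `read_entry()` merely counts reads where the kernel would use READ_ONCE().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit layout for illustration only. */
#define ENTRY_PRESENT 0x1u
#define ENTRY_TABLE   0x2u

/* Macro form: 'e' appears twice, so its side effects can run twice. */
#define LEAF_MACRO(e) (((e) & ENTRY_PRESENT) && !((e) & ENTRY_TABLE))

/* Inline-function form: the argument is evaluated once at the call site. */
static inline bool leaf_inline(unsigned int e)
{
	return (e & ENTRY_PRESENT) && !(e & ENTRY_TABLE);
}

int reads;	/* number of times the entry has been read */

/* Stand-in for READ_ONCE(*p) that records each read. */
unsigned int read_entry(const unsigned int *p)
{
	assert(p != NULL);
	reads++;
	return *p;
}
```

For a present leaf entry, `LEAF_MACRO(read_entry(&e))` performs two reads and `leaf_inline(read_entry(&e))` performs one. With a concurrently changing entry, the two reads in the macro form could observe different values.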
2026-04-02 | arm64: mm: Handle invalid large leaf mappings correctly | Ryan Roberts | 5 | -59/+48
It has been possible for a long time to mark ptes in the linear map as invalid. This is done for secretmem, kfence, realm dma memory un/share, and others, by simply clearing the PTE_VALID bit. But until commit a166563e7ec37 ("arm64: mm: support large block mapping when rodata=full") large leaf mappings were never made invalid in this way. It turns out various parts of the code base are not equipped to handle invalid large leaf mappings (in the way they are currently encoded) and I've observed a kernel panic while booting a realm guest on a BBML2_NOABORT system as a result: [ 15.432706] software IO TLB: Memory encryption is active and system is using DMA bounce buffers [ 15.476896] Unable to handle kernel paging request at virtual address ffff000019600000 [ 15.513762] Mem abort info: [ 15.527245] ESR = 0x0000000096000046 [ 15.548553] EC = 0x25: DABT (current EL), IL = 32 bits [ 15.572146] SET = 0, FnV = 0 [ 15.592141] EA = 0, S1PTW = 0 [ 15.612694] FSC = 0x06: level 2 translation fault [ 15.640644] Data abort info: [ 15.661983] ISV = 0, ISS = 0x00000046, ISS2 = 0x00000000 [ 15.694875] CM = 0, WnR = 1, TnD = 0, TagAccess = 0 [ 15.723740] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 15.755776] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000081f3f000 [ 15.800410] [ffff000019600000] pgd=0000000000000000, p4d=180000009ffff403, pud=180000009fffe403, pmd=00e8000199600704 [ 15.855046] Internal error: Oops: 0000000096000046 [#1] SMP [ 15.886394] Modules linked in: [ 15.900029] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 7.0.0-rc4-dirty #4 PREEMPT [ 15.935258] Hardware name: linux,dummy-virt (DT) [ 15.955612] pstate: 21400005 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 15.986009] pc : __pi_memcpy_generic+0x128/0x22c [ 16.006163] lr : swiotlb_bounce+0xf4/0x158 [ 16.024145] sp : ffff80008000b8f0 [ 16.038896] x29: ffff80008000b8f0 x28: 0000000000000000 x27: 0000000000000000 [ 16.069953] x26: ffffb3976d261ba8 x25: 0000000000000000 x24: ffff000019600000 [ 16.100876] 
x23: 0000000000000001 x22: ffff0000043430d0 x21: 0000000000007ff0 [ 16.131946] x20: 0000000084570010 x19: 0000000000000000 x18: ffff00001ffe3fcc [ 16.163073] x17: 0000000000000000 x16: 00000000003fffff x15: 646e612065766974 [ 16.194131] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 [ 16.225059] x11: 0000000000000000 x10: 0000000000000010 x9 : 0000000000000018 [ 16.256113] x8 : 0000000000000018 x7 : 0000000000000000 x6 : 0000000000000000 [ 16.287203] x5 : ffff000019607ff0 x4 : ffff000004578000 x3 : ffff000019600000 [ 16.318145] x2 : 0000000000007ff0 x1 : ffff000004570010 x0 : ffff000019600000 [ 16.349071] Call trace: [ 16.360143] __pi_memcpy_generic+0x128/0x22c (P) [ 16.380310] swiotlb_tbl_map_single+0x154/0x2b4 [ 16.400282] swiotlb_map+0x5c/0x228 [ 16.415984] dma_map_phys+0x244/0x2b8 [ 16.432199] dma_map_page_attrs+0x44/0x58 [ 16.449782] virtqueue_map_page_attrs+0x38/0x44 [ 16.469596] virtqueue_map_single_attrs+0xc0/0x130 [ 16.490509] virtnet_rq_alloc.isra.0+0xa4/0x1fc [ 16.510355] try_fill_recv+0x2a4/0x584 [ 16.526989] virtnet_open+0xd4/0x238 [ 16.542775] __dev_open+0x110/0x24c [ 16.558280] __dev_change_flags+0x194/0x20c [ 16.576879] netif_change_flags+0x24/0x6c [ 16.594489] dev_change_flags+0x48/0x7c [ 16.611462] ip_auto_config+0x258/0x1114 [ 16.628727] do_one_initcall+0x80/0x1c8 [ 16.645590] kernel_init_freeable+0x208/0x2f0 [ 16.664917] kernel_init+0x24/0x1e0 [ 16.680295] ret_from_fork+0x10/0x20 [ 16.696369] Code: 927cec03 cb0e0021 8b0e0042 a9411c26 (a900340c) [ 16.723106] ---[ end trace 0000000000000000 ]--- [ 16.752866] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b [ 16.792556] Kernel Offset: 0x3396ea200000 from 0xffff800080000000 [ 16.818966] PHYS_OFFSET: 0xfff1000080000000 [ 16.837237] CPU features: 0x0000000,00060005,13e38581,957e772f [ 16.862904] Memory Limit: none [ 16.876526] ---[ end Kernel panic - not syncing: Attempted to kill init! 
exitcode=0x0000000b ]--- This panic occurs because the swiotlb memory was previously shared to the host (__set_memory_enc_dec()), which involves transitioning the (large) leaf mappings to invalid, sharing to the host, then marking the mappings valid again. But pageattr_p[mu]d_entry() would only update the entry if it is a section mapping, since otherwise it concluded it must be a table entry so shouldn't be modified. But p[mu]d_sect() only returns true if the entry is valid. So the result was that the large leaf entry was made invalid in the first pass then ignored in the second pass. It remains invalid until the above code tries to access it and blows up. The simple fix would be to update pageattr_pmd_entry() to use !pmd_table() instead of pmd_sect(). That would solve this problem. But the ptdump code also suffers from a similar issue. It checks pmd_leaf() and doesn't call into the arch-specific note_page() machinery if it returns false. As a result of this, ptdump wasn't even able to show the invalid large leaf mappings; it looked like they were valid which made this super fun to debug. the ptdump code is core-mm and pmd_table() is arm64-specific so we can't use the same trick to solve that. But we already support the concept of "present-invalid" for user space entries. And even better, pmd_leaf() will return true for a leaf mapping that is marked present-invalid. So let's just use that encoding for present-invalid kernel mappings too. Then we can use pmd_leaf() where we previously used pmd_sect() and everything is magically fixed. Additionally, from inspection kernel_page_present() was broken in a similar way, so I'm also updating that to use pmd_leaf(). The transitional page tables component was also similarly broken; it creates a copy of the kernel page tables, making RO leaf mappings RW in the process. It also makes invalid (but-not-none) pte mappings valid. But it was not doing this for large leaf mappings. 
This could have resulted in crashes at kexec- or hibernate-time. This code is fixed to flip "present-invalid" mappings back to "present-valid" at all levels. Finally, I have hardened split_pmd()/split_pud() so that if it is passed a "present-invalid" leaf, it will maintain that property in the split leaves, since I wasn't able to convince myself that it would only ever be called for "present-valid" leaves. Fixes: a166563e7ec3 ("arm64: mm: support large block mapping when rodata=full") Cc: stable@vger.kernel.org Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-04-02 | arm64: mm: Fix rodata=full block mapping support for realm guests | Ryan Roberts | 3 | -14/+42
Commit a166563e7ec37 ("arm64: mm: support large block mapping when rodata=full") enabled the linear map to be mapped by block/cont while still allowing granular permission changes on BBML2_NOABORT systems by lazily splitting the live mappings. This mechanism was intended to be usable by realm guests since they need to dynamically share dma buffers with the host by "decrypting" them - which for Arm CCA, means marking them as shared in the page tables. However, it turns out that the mechanism was failing for realm guests because realms need to share their dma buffers (via __set_memory_enc_dec()) much earlier during boot than split_kernel_leaf_mapping() was able to handle. The report linked below showed that GIC's ITS was one such user. But during the investigation I found other callsites that could not meet the split_kernel_leaf_mapping() constraints. The problem is that we block map the linear map based on the boot CPU supporting BBML2_NOABORT, then check that all the other CPUs support it too when finalizing the caps. If they don't, then we stop_machine() and split to ptes. For safety, split_kernel_leaf_mapping() previously wouldn't permit splitting until after the caps were finalized. That ensured that if any secondary cpus were running that didn't support BBML2_NOABORT, we wouldn't risk breaking them. I've fixed this problem by reducing the black-out window where we refuse to split; there are now 2 windows. The first is from T0 until the page allocator is initialized. Splitting allocates memory from the page allocator so it must be in use. The second covers the period between starting to online the secondary cpus until the system caps are finalized (this is a very small window). All of the problematic callers are calling __set_memory_enc_dec() before the secondary cpus come online, so this solves the problem. However, one of these callers, swiotlb_update_mem_attributes(), was trying to split before the page allocator was initialized. 
So I have moved this call from arch_mm_preinit() to mem_init(), which solves the ordering issue. I've added warnings and return an error if any attempt is made to split in the black-out windows. Note there are other issues which prevent booting all the way to user space, which will be fixed in subsequent patches. Reported-by: Jinjiang Tu <tujinjiang@huawei.com> Closes: https://lore.kernel.org/all/0b2a4ae5-fc51-4d77-b177-b2e9db74f11d@huawei.com/ Fixes: a166563e7ec3 ("arm64: mm: support large block mapping when rodata=full") Cc: stable@vger.kernel.org Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-04-02 | eventpoll: defer struct eventpoll free to RCU grace period | Nicholas Carlini | 1 | -1/+5
In certain situations, ep_free() in eventpoll.c will kfree the epi->ep eventpoll struct while it is still being used by another concurrent thread. Defer the kfree() to an RCU callback to prevent UAF. Fixes: f2e467a48287 ("eventpoll: Fix semi-unbounded recursion") Signed-off-by: Nicholas Carlini <nicholas@carlini.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-04-02 | accel/ivpu: Trigger recovery on TDR with OS scheduling | Karol Wachowski | 1 | -0/+6
With OS scheduling mode the driver cannot determine which context caused the timeout, so context abort cannot be used. Instead of queuing context_abort_work, directly trigger full device recovery when a job timeout (TDR) occurs in OS scheduling mode. Fixes: ade00a6c903f ("accel/ivpu: Perform engine reset instead of device recovery on TDR") Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://patch.msgid.link/20260402125526.845210-1-karol.wachowski@linux.intel.com
2026-04-02 | sched_ext: Fix is_bpf_migration_disabled() false negative on non-PREEMPT_RCU | Changwoo Min | 1 | -12/+19
Since commit 8e4f0b1ebcf2 ("bpf: use rcu_read_lock_dont_migrate() for trampoline.c"), the BPF prolog (__bpf_prog_enter) calls migrate_disable() only when CONFIG_PREEMPT_RCU is enabled, via rcu_read_lock_dont_migrate(). Without CONFIG_PREEMPT_RCU, the prolog never touches migration_disabled, so migration_disabled == 1 always means the task is truly migration-disabled regardless of whether it is the current task. The old unconditional p == current check was a false negative in this case, potentially allowing a migration-disabled task to be dispatched to a remote CPU and triggering scx_error in task_can_run_on_remote_rq(). Only apply the p == current disambiguation when CONFIG_PREEMPT_RCU is enabled, where the ambiguity with the BPF prolog still exists. Fixes: 8e4f0b1ebcf2 ("bpf: use rcu_read_lock_dont_migrate() for trampoline.c") Cc: stable@vger.kernel.org # v6.18+ Link: https://lore.kernel.org/lkml/20250821090609.42508-8-dongml2@chinatelecom.cn/ Signed-off-by: Changwoo Min <changwoo@igalia.com> Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
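The decision described above can be modelled as a pure function. This is a userspace sketch, not the kernel function: `migration_disabled_model()` and its parameters are hypothetical stand-ins for the task's migration_disabled counter, the p == current comparison, and CONFIG_PREEMPT_RCU. Treating a counter above 1 as always disabled is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * migration_disabled: the task's counter value
 * is_current:         stands in for p == current
 * preempt_rcu:        stands in for CONFIG_PREEMPT_RCU being enabled
 */
bool migration_disabled_model(unsigned int migration_disabled,
			      bool is_current, bool preempt_rcu)
{
	if (migration_disabled == 1) {
		/*
		 * With PREEMPT_RCU the BPF prolog may itself account for
		 * the single count on the current task, so the value is
		 * ambiguous for the current task only.
		 */
		if (preempt_rcu)
			return !is_current;
		/* Without it the prolog never touches the counter. */
		return true;
	}
	/* Assumption: anything above 1 is a genuine disable. */
	return migration_disabled > 1;
}
```

The case fixed by the commit corresponds to `migration_disabled_model(1, true, false)`, which now reports the task as migration-disabled instead of giving a false negative.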
2026-04-02 | drm/amd/display: Wire up dcn10_dio_construct() for all pre-DCN401 generations | Ionut Nechita | 17 | -0/+699
Description: - Commit b82f0759346617b2 ("drm/amd/display: Migrate DIO registers access from hwseq to dio component") moved DIO_MEM_PWR_CTRL register access behind the new dio abstraction layer but only created the dio object for DCN 4.01. On all other generations (DCN 10/20/21/201/30/301/302/303/31/314/315/316/32/321/35/351/36), the dio pointer is NULL, causing the register write to be silently skipped. This results in AFMT HDMI memory not being powered on during init_hw, which can cause HDMI audio failures and display issues on affected hardware including Renoir/Cezanne (DCN 2.1) APUs that use dcn10_init_hw. Call dcn10_dio_construct() in each older DCN generation's resource.c to create the dio object, following the same pattern as DCN 4.01. This ensures the dio pointer is non-NULL and the mem_pwr_ctrl callback works through the dio abstraction for all DCN generations. Fixes: b82f07593466 ("drm/amd/display: Migrate DIO registers access from hwseq to dio component.") Reviewed-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: Ionut Nechita <ionut_n2001@yahoo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-04-02 | sched_ext: Fix missing warning in scx_set_task_state() default case | Samuele Mariotti | 1 | -1/+2
In scx_set_task_state(), the default case was setting the warn flag, but then returning immediately. This is problematic because the only purpose of the warn flag is to trigger WARN_ONCE, but the early return prevented it from ever firing, leaving invalid task states undetected and untraced. To fix this, a WARN_ONCE call is now added directly in the default case. The fix addresses two aspects: - Guarantees the invalid task states are properly logged and traced. - Provides a distinct warning message ("sched_ext: Invalid task state") specifically for states outside the defined scx_task_state enum values, making it easier to distinguish from other transition warnings. This ensures proper detection and reporting of invalid states. Signed-off-by: Samuele Mariotti <smariotti@disroot.org> Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
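The bug pattern, a flag set just before an early return that skips the code consuming it, can be reproduced in a few lines. The sketch below is a userspace model with hypothetical states, transitions, and function names; a counter takes the place of WARN_ONCE().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states; not the kernel's scx_task_state values. */
enum task_state { ST_NONE, ST_INIT, ST_READY };

int warnings;	/* stands in for WARN_ONCE(): counts warnings raised */

/* Buggy shape: default sets the flag, then returns before it is checked. */
void set_state_buggy(int *cur, int next)
{
	bool warn = false;

	assert(cur != NULL);
	switch (next) {
	case ST_NONE:
		break;
	case ST_INIT:
		warn = (*cur != ST_NONE);	/* invalid transition */
		break;
	case ST_READY:
		warn = (*cur != ST_INIT);
		break;
	default:
		warn = true;
		return;				/* flag is dropped here */
	}
	if (warn)
		warnings++;
	*cur = next;
}

/* Fixed shape: warn directly in the default case before returning. */
void set_state_fixed(int *cur, int next)
{
	bool warn = false;

	assert(cur != NULL);
	switch (next) {
	case ST_NONE:
		break;
	case ST_INIT:
		warn = (*cur != ST_NONE);
		break;
	case ST_READY:
		warn = (*cur != ST_INIT);
		break;
	default:
		warnings++;			/* distinct "invalid state" warning */
		return;
	}
	if (warn)
		warnings++;
	*cur = next;
}
```

Passing an out-of-range value such as 42 leaves the state unchanged in both versions, but only the fixed version records a warning.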
2026-04-02 | Merge tag 'v7.0-rc6-ksmbd-server-fix' of git://git.samba.org/ksmbd | Linus Torvalds | 3 | -32/+134
Pull smb server fix from Steve French:
- Fix out of bound write

* tag 'v7.0-rc6-ksmbd-server-fix' of git://git.samba.org/ksmbd:
  ksmbd: fix OOB write in QUERY_INFO for compound requests
2026-04-02 | ata: libata-transport: remove static variable ata_scsi_transport_template | Heiner Kallweit | 3 | -9/+4
Simplify the code by making struct ata_scsi_transportt public, instead of using separate variable ata_scsi_transport_template. Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Niklas Cassel <cassel@kernel.org>
2026-04-02 | ata: libata-transport: split struct ata_internal | Heiner Kallweit | 1 | -28/+25
There's no need for an umbrella struct, so remove it. It's also a prerequisite for making the embedded struct scsi_transport_template public. Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Niklas Cassel <cassel@kernel.org>
2026-04-02 | ata: libata-transport: use static struct ata_transport_internal to simplify match functions | Heiner Kallweit | 1 | -21/+23
Both matching functions can make use of static struct ata_transport_internal. This eliminates the dependency on static variable ata_scsi_transport_template, and it allows removing the helper to_ata_internal(). A small drawback is that a forward declaration of both functions is needed. Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Niklas Cassel <cassel@kernel.org>
2026-04-02 | fuse: support FSCONFIG_SET_FD for "fd" option | Miklos Szeredi | 1 | -7/+11
This is not only cleaner to use in userspace (no need to sprintf the fd to a string) but also allows userspace to detect that the devfd can be closed after the fsconfig call. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
2026-04-02 | fuse: clean up device cloning | Miklos Szeredi | 3 | -24/+15
- fuse_mutex is not needed for device cloning, because fuse_dev_install() uses cmpxchg() to set fud->fc, which prevents races between clone/mount or clone/clone. This makes the logic simpler.
- Drop fc->dev_count. This is only used to check in release if the device is the last clone, but checking list_empty(&fc->devices) is equivalent after removing the released device from the list. Removing the fuse_dev before calling fuse_abort_conn() is okay, since the processing and io lists are now empty for this device.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2026-04-02 | ata: libata-transport: inline ata_attach|release_transport | Heiner Kallweit | 3 | -28/+11
Both functions are helpers which are used only once. So remove them and merge their code into libata_transport_init() and libata_transport_exit() respectively. Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Niklas Cassel <cassel@kernel.org>
2026-04-02 | ata: libata-transport: instantiate struct ata_internal statically | Heiner Kallweit | 3 | -42/+28
Struct ata_internal is only instantiated once, in module init code. So we can also instantiate it statically, which allows simplifying the code. Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Niklas Cassel <cassel@kernel.org>
2026-04-02 | fuse: don't require /dev/fuse fd to be kept open during mount | Miklos Szeredi | 2 | -27/+34
With the new mount API the sequence of syscalls would be:

    fs_fd = fsopen("fuse", 0);
    snprintf(opt, sizeof(opt), "%i", devfd);
    fsconfig(fs_fd, FSCONFIG_SET_STRING, "fd", opt, 0);
    /* ... */
    fsconfig(fs_fd, FSCONFIG_CMD_CREATE, 0, 0, 0);

Current mount code just stores the value of devfd in the fs_context and uses it during FSCONFIG_CMD_CREATE, which is inelegant. Instead grab a reference to the underlying fuse_dev, and use that during the filesystem creation.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2026-04-02 | fuse: add refcount to fuse_dev | Miklos Szeredi | 6 | -18/+50
This will make it possible to grab the fuse_dev and subsequently release the file that it came from. In the above case, fud->fc will be set to FUSE_DEV_FC_DISCONNECTED to indicate that this is no longer a functional device. When trying to assign an fc to such a disconnected fuse_dev, the fc is set to the disconnected state. Use atomic operations xchg() and cmpxchg() to prevent races. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
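The install/disconnect race handling described above can be modelled with C11 atomics, where atomic_compare_exchange_strong() and atomic_exchange() play the roles of the kernel's cmpxchg() and xchg(). Everything in the sketch is a hypothetical userspace stand-in: `struct conn`, `struct dev`, `DEV_DISCONNECTED`, `dev_install()`, and `dev_release()` are not the fuse driver's definitions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct conn { bool connected; };		/* stand-in for the connection */
struct dev  { _Atomic(struct conn *) fc; };	/* stand-in for the device */

/* Sentinel meaning "the file behind this device has been released". */
static struct conn disconnected_marker;
#define DEV_DISCONNECTED (&disconnected_marker)

/*
 * Install 'fc' only if no connection is set yet. If the device was
 * already disconnected, put the connection into the disconnected state.
 */
bool dev_install(struct dev *d, struct conn *fc)
{
	struct conn *expected = NULL;

	assert(fc != NULL && fc != DEV_DISCONNECTED);
	if (atomic_compare_exchange_strong(&d->fc, &expected, fc))
		return true;
	if (expected == DEV_DISCONNECTED)
		fc->connected = false;
	return false;
}

/* Release: atomically swap in the marker and return what was there. */
struct conn *dev_release(struct dev *d)
{
	return atomic_exchange(&d->fc, DEV_DISCONNECTED);
}
```

Because both operations are single atomic read-modify-write steps, two racing installs cannot both succeed, and an install that races with a release either wins before the swap or observes the marker afterwards.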
2026-04-02 | fuse: create fuse_dev on /dev/fuse open instead of mount | Miklos Szeredi | 5 | -66/+57
Allocate struct fuse_dev when opening the device. This means that unlike before, ->private_data is always set to a valid pointer. The use of the FUSE_DEV_SYNC_INIT magic pointer for the private_data is now replaced with a simple bool sync_init member. If sync INIT is not set, I/O on the device returns an error before mount. Keep this behavior by checking for the ->fc member. If fud->fc is set, the mount has succeeded. Testing this used READ_ONCE(file->private_data) and smp_mb() to try and provide the necessary semantics. Switch this to smp_store_release() and smp_load_acquire(). Setting fud->fc is protected by fuse_mutex; this is unchanged. Will need this later so the /dev/fuse open file reference is not held during FSCONFIG_CMD_CREATE. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
2026-04-02fuse: check connection state on notificationMiklos Szeredi1-0/+7
Check if the connection is fully initialized and connected before trying to process a notification from the fuse server. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2026-04-02fuse: fuse_dev_ioctl_clone() should wait for device file to be initializedMiklos Szeredi1-11/+8
Use fuse_get_dev() not __fuse_get_dev() on the old fd, since in the case of synchronous INIT the caller will want to wait for the device file to be available for cloning, just like I/O wants to wait instead of returning an error. Fixes: dfb84c330794 ("fuse: allow synchronous FUSE_INIT") Cc: stable@vger.kernel.org # v6.18 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2026-04-02Merge branch 'net-stmmac-tso-fixes-cleanups'Jakub Kicinski2-67/+150
Russell King says: ==================== net: stmmac: TSO fixes/cleanups This is a more refined version of the previous patch series fixing and cleaning up the TSO code. I'm not sure whether "TSO" or "GSO" should be used to describe this feature - although it primarily handles TCP, dwmac4 appears to also be able to handle UDP. In essence, this series adds a .ndo_features_check() method to handle whether TSO/GSO can be used for a particular skbuff - checking which queue the skbuff is destined for and whether that has TBS available which precludes TSO being enabled on that channel. I'm also adding a check that the header is smaller than 1024 bytes, as documented in those sources which have TSO support - this is due to the hardware buffering the header in "TSO memory" which I guess is limited to 1KiB. I expect this test never to trigger, but if the headers ever exceed that size, the hardware will likely fail. While IPv4 headers are unlikely to be anywhere near this, there is nothing in the protocol which prevents IPv6 headers up to 64KiB. As we now have a .ndo_features_check() method, I'm moving the VLAN insertion for TSO packets into core code by unpublishing the VLAN insertion features when we use TSO. Another move is for checksumming, which is required for TSO, but stmmac's requirements for offloading checksums are more strict - and this seems to be a bug in the TSO path. I've changed the hardware initialisation to always enable TSO support on the channels even if the user requests TSO/GSO to be disabled - this fixes another issue as pointed out by Jakub in a previous review. I'm moving the setup of the GSO features, cleaning those up, and adding a warning if platform glue requests this to be enabled but the hardware has no support. Hopefully this will never trigger if everyone got the STMMAC_FLAG_TSO_EN flag correct. Also adding a check for TxPBL value. 
Finally, moving the "TSO supported" message to the new stmmac_set_gso_features() function to keep all this TSO stuff together. ==================== Link: https://patch.msgid.link/aczHVF04LIGq_lYO@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02net: stmmac: move "TSO supported" message to stmmac_set_gso_features()Russell King (Oracle)1-3/+3
Move the "TSO supported" message to stmmac_set_gso_features() so that we group all probe-time TSO stuff in one place. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1w7pu8-0000000Eau5-3Zne@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02net: stmmac: check txpbl for TSORussell King (Oracle)1-0/+14
Documentation states that TxPBL must be >= 4 to allow TSO support, but the driver doesn't check this. TxPBL comes from the platform glue code or DT. Add a check with a warning if platform glue code attempts to enable TSO support with TxPBL too low. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1w7pu3-0000000Eatz-39ts@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02net: stmmac: add warning when TSO is requested but unsupportedRussell King (Oracle)1-1/+3
Add a warning message if TSO is requested by the platform glue code but the core wasn't configured for TSO. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1w7pty-0000000Eatt-2TjZ@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02net: stmmac: make stmmac_set_gso_features() more readableRussell King (Oracle)1-7/+13
Make stmmac_set_gso_features() more readable by adding some whitespace and getting rid of the indentation. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1w7ptt-0000000Eatn-1ziK@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02net: stmmac: split out gso features setupRussell King (Oracle)1-7/+14
Move the GSO features setup into a separate function, co-located with other GSO/TSO support. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1w7pto-0000000Eath-1VDH@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02net: stmmac: simplify GSO/TSO test in stmmac_xmit()Russell King (Oracle)2-12/+19
The test in stmmac_xmit() to see whether we should pass the skbuff to stmmac_tso_xmit() is more complex than it needs to be. This test can be simplified by storing the mask of GSO types that we will pass, and setting it according to the enabled features. Note that "tso" is a misnomer since commit b776620651a1 ("net: stmmac: Implement UDP Segmentation Offload"). Also note that that commit controls both via the TSO feature. We preserve this behaviour in this commit. Also, that commit unconditionally accessed skb_shinfo(skb)->gso_type for all frames, even when skb_is_gso() was false. This access is eliminated. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1w7ptj-0000000Eatb-11zK@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
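A rough userspace model of the "stored mask" idea follows. The flag names, `struct drv` and `struct pkt` are hypothetical and do not mirror the driver's fields or the kernel's SKB_GSO_* constants. It shows the mask being set from a single TSO on/off switch, and the transmit test only looking at the GSO type when the packet is actually GSO.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical GSO type bits. */
#define GSO_TCPV4  (1u << 0)
#define GSO_TCPV6  (1u << 1)
#define GSO_UDP_L4 (1u << 2)

struct pkt {
	bool is_gso;
	uint32_t gso_type;
};

struct drv {
	uint32_t gso_enabled_types; /* mask of types handed to the TSO path */
};

/* One feature switch controls both TCP and UDP segmentation offload. */
static void drv_set_tso(struct drv *d, bool on)
{
	d->gso_enabled_types = on ? (GSO_TCPV4 | GSO_TCPV6 | GSO_UDP_L4) : 0;
}

/* The transmit path reduces to one mask test; gso_type is only read
 * for packets that are GSO in the first place. */
static bool use_tso_path(const struct drv *d, const struct pkt *p)
{
	return p->is_gso && (p->gso_type & d->gso_enabled_types) != 0;
}
```

Keeping the decision in a precomputed mask means the per-packet test does not need to re-derive it from the feature flags on every transmit.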
2026-04-02net: stmmac: move check for hardware checksum supportedRussell King (Oracle)1-19/+19
Add a check in .ndo_features_check() to indicate whether hardware checksum can be performed on the skbuff. Where hardware checksum is not supported - either because the channel does not support Tx COE or the skb isn't suitable (stmmac uses a tighter test than can_checksum_protocol()) - we also need to disable TSO, which will be done by harmonize_features() in net/core/dev.c. This fixes a bug where a channel which has COE disabled may still receive TSO skbuffs. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1w7pte-0000000EatU-0ILt@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02net: stmmac: move TSO VLAN tag insertion to core codeRussell King (Oracle)1-13/+10
stmmac_tso_xmit() checks whether the skbuff is trying to offload vlan tag insertion to hardware, which from the comment in the code appears to be buggy when the TSO feature is used. Rather than stmmac_tso_xmit() inserting the VLAN tag, handle this in stmmac_features_check() which will then use core net code to handle this. See net/core/dev.c::validate_xmit_skb() Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1w7ptY-0000000EatO-42Qv@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02net: stmmac: add GSO MSS checksRussell King (Oracle)1-1/+8
Add GSO MSS checks to stmmac_features_check(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1w7ptT-0000000EatI-3feh@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02net: stmmac: add TSO check for header lengthRussell King (Oracle)1-1/+16
According to the STM32MP151 documentation which covers dwmac v4.2, the hardware TSO feature can handle header lengths up to a maximum of 1023 bytes. Add a .ndo_features_check() method implementation to check the header length meets these requirements, otherwise fall back to software GSO. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1w7ptO-0000000EatC-39il@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
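As an illustration of the header length limit, the sketch below computes the protocol header size for a TCP or UDP segment and compares it with the 1023-byte maximum quoted above. `struct pkt_layout` and both function names are hypothetical; the only facts relied on are that a UDP header is 8 bytes and that a TCP header is its data-offset field times 4 bytes.

```c
#include <stdbool.h>

#define TSO_MAX_HDR_LEN 1023u /* limit quoted in the commit message */

/* Hypothetical description of where the L4 header sits in the frame. */
struct pkt_layout {
	unsigned int transport_offset; /* bytes from frame start to L4 header */
	bool is_udp;
	unsigned int tcp_data_offset;  /* TCP data offset, in 32-bit words */
};

/* Total bytes of headers up to and including the L4 header. */
static unsigned int tso_header_size(const struct pkt_layout *p)
{
	unsigned int l4_len = p->is_udp ? 8u : p->tcp_data_offset * 4u;

	return p->transport_offset + l4_len;
}

/* false means: do not offload, fall back to software segmentation. */
static bool tso_header_fits(const struct pkt_layout *p)
{
	return tso_header_size(p) <= TSO_MAX_HDR_LEN;
}
```

For a plain Ethernet + IPv4 + TCP frame without options (14 + 20 bytes before a 20-byte TCP header) the size is 54 bytes, far below the limit; only very long extension-header chains would come near it.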
2026-04-02net: stmmac: add stmmac_tso_header_size()Russell King (Oracle)1-5/+15
We will need to compute the size of the protocol headers in two places, so move this into a separate function. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1w7ptJ-0000000Eat5-2ZlA@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02net: stmmac: fix TSO support when some channels have TBS availableRussell King (Oracle)1-4/+28
According to the STM32MP25xx manual, which is dwmac v5.3, TBS (time based scheduling) is not permitted for channels which have hardware TSO enabled. Intel's commit 5e6038b88a57 ("net: stmmac: fix TSO and TBS feature enabling during driver open") concurs with this, but it is incomplete. This commit avoids enabling TSO support on the channels which have TBS available, which, as far as the hardware is concerned, means we do not set the TSE bit in the DMA channel's transmit control register. However, the net device's features apply to all queues(channels), which means these channels may still be handed TSO skbs to transmit, and the driver will pass them to stmmac_tso_xmit(). This will generate the descriptors for TSO, even though the channel has the TSE bit clear. Fix this by checking whether the queue(channel) has TBS available, and if it does, fall back to software GSO support. Fixes: 5e6038b88a57 ("net: stmmac: fix TSO and TBS feature enabling during driver open") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1w7ptE-0000000Easz-28tv@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
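The per-queue fallback can be pictured as a features filter keyed on the destination queue. This is a simplified model under assumptions: `FEAT_TSO`, `FEAT_CSUM`, `MAX_QUEUES` and `struct drv` are invented for illustration and are not the driver's actual data structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits. */
#define FEAT_TSO  (1u << 0)
#define FEAT_CSUM (1u << 1)

#define MAX_QUEUES 8u

struct drv {
	bool tbs_available[MAX_QUEUES]; /* per-queue: TBS available, so TSE is clear */
};

/*
 * Device-wide features apply to every queue, so the TSO bit is masked
 * per packet according to the queue it is destined for. With the bit
 * cleared, the caller is expected to segment in software instead.
 */
static uint32_t queue_features(const struct drv *d, unsigned int queue,
			       uint32_t features)
{
	if (queue < MAX_QUEUES && d->tbs_available[queue])
		features &= ~FEAT_TSO;
	return features;
}
```

Other feature bits pass through untouched, so only the segmentation offload is withdrawn for packets headed to a TBS-capable queue.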
2026-04-02net: stmmac: fix .ndo_fix_features()Russell King (Oracle)1-8/+2
netdev features documentation requires that .ndo_fix_features() is stateless: it shouldn't modify driver state. Yet, stmmac_fix_features() does exactly that, changing whether GSO frames are processed by the driver. Move this code to stmmac_set_features() instead, which is the correct place for it. We don't need to check whether TSO is supported; this is already handled via the setup of netdev->hw_features, and we are guaranteed that if netdev->hw_features indicates that a feature is not supported, .ndo_set_features() won't be called with it set. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1w7pt9-0000000East-1YAO@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02net: stmmac: fix channel TSO enable on resumeRussell King (Oracle)1-1/+1
Rather than configuring the channels depending on whether GSO/TSO is currently enabled by the user, always enable if the hardware has TSO support and the platform wants TSO to be enabled. This avoids the channel TSO enable bit being disabled after a resume when the user has disabled TSO features. This will cause problems when the user re-enables TSO. This bug goes back to commit f748be531d70 ("stmmac: support new GMAC4") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1w7pt4-0000000Easn-14WL@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02ntfs3: fix memory leak in indx_create_allocate()Deepanshu Kartikey1-0/+1
When indx_create_allocate() fails after attr_allocate_clusters() succeeds, run_deallocate() frees the disk clusters but never frees the memory allocated by run_add_entry() via kvmalloc() for the runs_tree structure. Fix this by adding run_close() at the out: label to free the run.runs memory on all error paths. The success path is unaffected as it returns 0 directly without going through out:, transferring ownership of the run memory to indx->alloc_run via memcpy(). Reported-by: syzbot+7adcddaeeb860e5d3f2f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7adcddaeeb860e5d3f2f Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
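The ownership pattern in this fix (free the temporary at the error label, hand it over by copy on success) can be shown with a small userspace analogue. Everything here is hypothetical: `struct run_list`, `run_list_add()`, `run_list_close()` and `create_allocate()` only imitate the shape of the ntfs3 code, with `realloc()`/`free()` in place of the kernel allocators.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list standing in for the runs tree. */
struct run_list {
	int *runs;
	size_t count;
};

/* Frees the backing memory; safe to call on an empty list. */
static void run_list_close(struct run_list *r)
{
	free(r->runs);
	r->runs = NULL;
	r->count = 0;
}

static int run_list_add(struct run_list *r, int value)
{
	int *p = realloc(r->runs, (r->count + 1) * sizeof(*p));

	if (!p)
		return -1;
	p[r->count++] = value;
	r->runs = p;
	return 0;
}

/*
 * On success the memory owned by the local list is transferred to *dst
 * and the function returns without passing through "out". On any
 * failure the local list is closed at "out", so nothing leaks.
 */
static int create_allocate(struct run_list *dst, int fail_late)
{
	struct run_list run = { 0 };
	int err;

	err = run_list_add(&run, 42);
	if (err)
		goto out;
	if (fail_late) {          /* simulated failure after allocation */
		err = -1;
		goto out;
	}
	memcpy(dst, &run, sizeof(run)); /* ownership moves to dst */
	return 0;
out:
	run_list_close(&run);
	return err;
}
```

The cleanup at the label is only correct because the success path returns before reaching it; if success also fell through to `out`, the transferred memory would be freed while still in use.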