summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2026-03-04tracing: Wake up poll waiters for hist files when removing an eventPetr Pavlu1-0/+5
[ Upstream commit 9678e53179aa7e907360f5b5b275769008a69b80 ] The event_hist_poll() function attempts to verify whether an event file is being removed, but this check may not occur or could be unnecessarily delayed. This happens because hist_poll_wakeup() is currently invoked only from event_hist_trigger() when a hist command is triggered. If the event file is being removed, no associated hist command will be triggered and a waiter will be woken up only after an unrelated hist command is triggered. Fix the issue by adding a call to hist_poll_wakeup() in remove_event_file_dir() after setting the EVENT_FILE_FL_FREED flag. This ensures that a task polling on a hist file is woken up and receives EPOLLERR. Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Tom Zanussi <zanussi@kernel.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://patch.msgid.link/20260219162737.314231-3-petr.pavlu@suse.com Fixes: 1bd13edbbed6 ("tracing/hist: Add poll(POLLIN) support on hist file") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04fgraph: Do not call handlers direct when not using ftrace_opsSteven Rostedt1-3/+10
[ Upstream commit f4ff9f646a4d373f9e895c2f0073305da288bc0a ] The function graph tracer was modified to us the ftrace_ops of the function tracer. This simplified the code as well as allowed more features of the function graph tracer. Not all architectures were converted over as it required the implementation of HAVE_DYNAMIC_FTRACE_WITH_ARGS to implement. For those architectures, it still did it the old way where the function graph tracer handle was called by the function tracer trampoline. The handler then had to check the hash to see if the registered handlers wanted to be called by that function or not. In order to speed up the function graph tracer that used ftrace_ops, if only one callback was registered with function graph, it would call its function directly via a static call. Now, if the architecture does not support the use of using ftrace_ops and still has the ftrace function trampoline calling the function graph handler, then by doing a direct call it removes the check against the handler's hash (list of functions it wants callbacks to), and it may call that handler for functions that the handler did not request calls for. On 32bit x86, which does not support the ftrace_ops use with function graph tracer, it shows the issue: ~# trace-cmd start -p function -l schedule ~# trace-cmd show # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 2) * 11898.94 us | schedule(); 3) # 1783.041 us | schedule(); 1) | schedule() { ------------------------------------------ 1) bash-8369 => kworker-7669 ------------------------------------------ 1) | schedule() { ------------------------------------------ 1) kworker-7669 => bash-8369 ------------------------------------------ 1) + 97.004 us | } 1) | schedule() { [..] Now by starting the function tracer is another instance: ~# trace-cmd start -B foo -p function This causes the function graph tracer to trace all functions (because the function trace calls the function graph tracer for each on, and the function graph trace is doing a direct call): ~# trace-cmd show # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 1) 1.669 us | } /* preempt_count_sub */ 1) + 10.443 us | } /* _raw_spin_unlock_irqrestore */ 1) | tick_program_event() { 1) | clockevents_program_event() { 1) 1.044 us | ktime_get(); 1) 6.481 us | lapic_next_event(); 1) + 10.114 us | } 1) + 11.790 us | } 1) ! 181.223 us | } /* hrtimer_interrupt */ 1) ! 184.624 us | } /* __sysvec_apic_timer_interrupt */ 1) | irq_exit_rcu() { 1) 0.678 us | preempt_count_sub(); When it should still only be tracing the schedule() function. To fix this, add a macro FGRAPH_NO_DIRECT to be set to 0 when the architecture does not support function graph use of ftrace_ops, and set to 1 otherwise. Then use this macro to know to allow function graph tracer to call the handlers directly or not. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://patch.msgid.link/20260218104244.5f14dade@gandalf.local.home Fixes: cc60ee813b503 ("function_graph: Use static_call and branch to optimize entry function") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04fbdev: Use device_create_with_groups() to fix sysfs groups registration raceHans de Goede1-1/+0
[ Upstream commit 68eeb0871e986ae5462439dae881e3a27bcef85f ] The fbdev sysfs attributes are registered after sending the uevent for the device creation, leaving a race window where e.g. udev rules may not be able to access the sysfs attributes because the registration is not done yet. Fix this by switching to device_create_with_groups(). This also results in a nice cleanup. After switching to device_create_with_groups() all that is left of fb_init_device() is setting the drvdata and that can be passed to device_create[_with_groups]() too. After which fb_init_device() can be completely removed. Dropping fb_init_device() + fb_cleanup_device() in turn allows removing fb_info.class_flag as they were the only user of this field. Fixes: 5fc830d6aca1 ("fbdev: Register sysfs groups through device_add_group") Cc: stable@vger.kernel.org Cc: Shixiong Ou <oushixiong@kylinos.cn> Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04mm/vmscan: fix demotion targets checks in reclaim/demotionBing Jiao2-6/+6
[ Upstream commit 1aceed565ff172fc0331dd1d5e7e65139b711139 ] Patch series "mm/vmscan: fix demotion targets checks in reclaim/demotion", v9. This patch series addresses two issues in demote_folio_list(), can_demote(), and next_demotion_node() in reclaim/demotion. 1. demote_folio_list() and can_demote() do not correctly check demotion target against cpuset.mems_effective, which will cause (a) pages to be demoted to not-allowed nodes and (b) pages fail demotion even if the system still has allowed demotion nodes. Patch 1 fixes this bug by updating cpuset_node_allowed() and mem_cgroup_node_allowed() to return effective_mems, allowing directly logic-and operation against demotion targets. 2. next_demotion_node() returns a preferred demotion target, but it does not check the node against allowed nodes. Patch 2 ensures that next_demotion_node() filters against the allowed node mask and selects the closest demotion target to the source node. This patch (of 2): Fix two bugs in demote_folio_list() and can_demote() due to incorrect demotion target checks against cpuset.mems_effective in reclaim/demotion. Commit 7d709f49babc ("vmscan,cgroup: apply mems_effective to reclaim") introduces the cpuset.mems_effective check and applies it to can_demote(). However: 1. It does not apply this check in demote_folio_list(), which leads to situations where pages are demoted to nodes that are explicitly excluded from the task's cpuset.mems. 2. It checks only the nodes in the immediate next demotion hierarchy and does not check all allowed demotion targets in can_demote(). This can cause pages to never be demoted if the nodes in the next demotion hierarchy are not set in mems_effective. These bugs break resource isolation provided by cpuset.mems. This is visible from userspace because pages can either fail to be demoted entirely or are demoted to nodes that are not allowed in multi-tier memory systems. To address these bugs, update cpuset_node_allowed() and mem_cgroup_node_allowed() to return effective_mems, allowing directly logic-and operation against demotion targets. Also update can_demote() and demote_folio_list() accordingly. Bug 1 reproduction: Assume a system with 4 nodes, where nodes 0-1 are top-tier and nodes 2-3 are far-tier memory. All nodes have equal capacity. Test script: echo 1 > /sys/kernel/mm/numa/demotion_enabled mkdir /sys/fs/cgroup/test echo +cpuset > /sys/fs/cgroup/cgroup.subtree_control echo "0-2" > /sys/fs/cgroup/test/cpuset.mems echo $$ > /sys/fs/cgroup/test/cgroup.procs swapoff -a # Expectation: Should respect node 0-2 limit. # Observation: Node 3 shows significant allocation (MemFree drops) stress-ng --oomable --vm 1 --vm-bytes 150% --mbind 0,1 Bug 2 reproduction: Assume a system with 6 nodes, where nodes 0-2 are top-tier, node 3 is a far-tier node, and nodes 4-5 are the farthest-tier nodes. All nodes have equal capacity. Test script: echo 1 > /sys/kernel/mm/numa/demotion_enabled mkdir /sys/fs/cgroup/test echo +cpuset > /sys/fs/cgroup/cgroup.subtree_control echo "0-2,4-5" > /sys/fs/cgroup/test/cpuset.mems echo $$ > /sys/fs/cgroup/test/cgroup.procs swapoff -a # Expectation: Pages are demoted to Nodes 4-5 # Observation: No pages are demoted before oom. stress-ng --oomable --vm 1 --vm-bytes 150% --mbind 0,1,2 Link: https://lkml.kernel.org/r/20260114205305.2869796-1-bingjiao@google.com Link: https://lkml.kernel.org/r/20260114205305.2869796-2-bingjiao@google.com Fixes: 7d709f49babc ("vmscan,cgroup: apply mems_effective to reclaim") Signed-off-by: Bing Jiao <bingjiao@google.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@kernel.org> Cc: Gregory Price <gourry@gourry.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Waiman Long <longman@redhat.com> Cc: Wei Xu <weixugc@google.com> Cc: Yuanchu Xie <yuanchu@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04kcsan, compiler_types: avoid duplicate type issues in BPF Type FormatAlan Maguire1-7/+16
[ Upstream commit 9dc052234da736f7749f19ab6936342ec7dbe3ac ] Enabling KCSAN is causing a large number of duplicate types in BTF for core kernel structs like task_struct [1]. This is due to the definition in include/linux/compiler_types.h `#ifdef __SANITIZE_THREAD__ ... `#define __data_racy volatile .. `#else ... `#define __data_racy ... `#endif Because some objects in the kernel are compiled without KCSAN flags (KCSAN_SANITIZE) we sometimes get the empty __data_racy annotation for objects; as a result we get multiple conflicting representations of the associated structs in DWARF, and these lead to multiple instances of core kernel types in BTF since they cannot be deduplicated due to the additional modifier in some instances. Moving the __data_racy definition under CONFIG_KCSAN avoids this problem, since the volatile modifier will be present for both KCSAN and KCSAN_SANITIZE objects in a CONFIG_KCSAN=y kernel. Link: https://lkml.kernel.org/r/20260116091730.324322-1-alan.maguire@oracle.com Fixes: 31f605a308e6 ("kcsan, compiler_types: Introduce __data_racy type qualifier") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Reported-by: Nilay Shroff <nilay@linux.ibm.com> Tested-by: Nilay Shroff <nilay@linux.ibm.com> Suggested-by: Marco Elver <elver@google.com> Reviewed-by: Marco Elver <elver@google.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Bart van Assche <bvanassche@acm.org> Cc: Daniel Borkman <daniel@iogearbox.net> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Hao Luo <haoluo@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Jason A. Donenfeld <jason@zx2c4.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Kees Cook <kees@kernel.org> Cc: KP Singh <kpsingh@kernel.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Naman Jain <namjain@linux.microsoft.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@fomichev.me> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PM: sleep: core: Avoid bit field races related to work_in_progressXuewen Yan1-1/+1
[ Upstream commit 0491f3f9f664e7e0131eb4d2a8b19c49562e5c64 ] In all of the system suspend transition phases, the async processing of a device may be carried out in parallel with power.work_in_progress updates for the device's parent or suppliers and if it touches bit fields from the same group (for example, power.must_resume or power.wakeup_path), bit field corruption is possible. To avoid that, turn work_in_progress in struct dev_pm_info into a proper bool field and relocate it to save space. Fixes: aa7a9275ab81 ("PM: sleep: Suspend async parents after suspending children") Fixes: 443046d1ad66 ("PM: sleep: Make suspend of devices more asynchronous") Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com> Closes: https://lore.kernel.org/linux-pm/20260203063459.12808-1-xuewen.yan@unisoc.com/ Cc: All applicable <stable@vger.kernel.org> [ rjw: Added subject and changelog ] Link: https://patch.msgid.link/CAB8ipk_VX2VPm706Jwa1=8NSA7_btWL2ieXmBgHr2JcULEP76g@mail.gmail.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04compiler-clang.h: require LLVM 19.1.0 or higher for __typeof_unqual__Nathan Chancellor1-1/+1
[ Upstream commit e8d899d301346a5591c9d1af06c3c9b3501cf84b ] When building the kernel using a version of LLVM between llvmorg-19-init (the first commit of the LLVM 19 development cycle) and the change in LLVM that actually added __typeof_unqual__ for all C modes [1], which might happen during a bisect of LLVM, there is a build failure: In file included from arch/x86/kernel/asm-offsets.c:9: In file included from include/linux/crypto.h:15: In file included from include/linux/completion.h:12: In file included from include/linux/swait.h:7: In file included from include/linux/spinlock.h:56: In file included from include/linux/preempt.h:79: arch/x86/include/asm/preempt.h:61:2: error: call to undeclared function '__typeof_unqual__'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 61 | raw_cpu_and_4(__preempt_count, ~PREEMPT_NEED_RESCHED); | ^ arch/x86/include/asm/percpu.h:478:36: note: expanded from macro 'raw_cpu_and_4' 478 | #define raw_cpu_and_4(pcp, val) percpu_binary_op(4, , "and", (pcp), val) | ^ arch/x86/include/asm/percpu.h:210:3: note: expanded from macro 'percpu_binary_op' 210 | TYPEOF_UNQUAL(_var) pto_tmp__; \ | ^ include/linux/compiler.h:248:29: note: expanded from macro 'TYPEOF_UNQUAL' 248 | # define TYPEOF_UNQUAL(exp) __typeof_unqual__(exp) | ^ The current logic of CC_HAS_TYPEOF_UNQUAL just checks for a major version of 19 but half of the 19 development cycle did not have support for __typeof_unqual__. Harden the logic of CC_HAS_TYPEOF_UNQUAL to avoid this error by only using __typeof_unqual__ with a released version of LLVM 19, which is greater than or equal to 19.1.0 with LLVM's versioning scheme that matches GCC's [2]. Link: https://github.com/llvm/llvm-project/commit/cc308f60d41744b5920ec2e2e5b25e1273c8704b [1] Link: https://github.com/llvm/llvm-project/commit/4532617ae420056bf32f6403dde07fb99d276a49 [2] Link: https://lkml.kernel.org/r/20260116-require-llvm-19-1-for-typeof_unqual-v1-1-3b9a4a4b212b@kernel.org Fixes: ac053946f5c4 ("compiler.h: introduce TYPEOF_UNQUAL() macro") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Cc: Bill Wendling <morbo@google.com> Cc: Justin Stitt <justinstitt@google.com> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04ima: verify the previous kernel's IMA buffer lies in addressable RAMHarshit Mogalapalli1-0/+1
[ Upstream commit 10d1c75ed4382a8e79874379caa2ead8952734f9 ] Patch series "Address page fault in ima_restore_measurement_list()", v3. When the second-stage kernel is booted via kexec with a limiting command line such as "mem=<size>" we observe a pafe fault that happens. BUG: unable to handle page fault for address: ffff97793ff47000 RIP: ima_restore_measurement_list+0xdc/0x45a #PF: error_code(0x0000) not-present page This happens on x86_64 only, as this is already fixed in aarch64 in commit: cbf9c4b9617b ("of: check previous kernel's ima-kexec-buffer against memory bounds") This patch (of 3): When the second-stage kernel is booted with a limiting command line (e.g. "mem=<size>"), the IMA measurement buffer handed over from the previous kernel may fall outside the addressable RAM of the new kernel. Accessing such a buffer can fault during early restore. Introduce a small generic helper, ima_validate_range(), which verifies that a physical [start, end] range for the previous-kernel IMA buffer lies within addressable memory: - On x86, use pfn_range_is_mapped(). - On OF based architectures, use page_is_ram(). Link: https://lkml.kernel.org/r/20251231061609.907170-1-harshit.m.mogalapalli@oracle.com Link: https://lkml.kernel.org/r/20251231061609.907170-2-harshit.m.mogalapalli@oracle.com Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Cc: Alexander Graf <graf@amazon.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Borislav Betkov <bp@alien8.de> Cc: guoweikang <guoweikang.kernel@gmail.com> Cc: Henry Willard <henry.willard@oracle.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Bohac <jbohac@suse.cz> Cc: Joel Granados <joel.granados@kernel.org> Cc: Jonathan McDowell <noodles@fb.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Webb <paul.x.webb@oracle.com> Cc: Sohil Mehta <sohil.mehta@intel.com> Cc: Sourabh Jain <sourabhjain@linux.ibm.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Yifei Liu <yifei.l.liu@oracle.com> Cc: Baoquan He <bhe@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04mfd: tps65219: Implement LOCK register handling for TPS65214Kory Maincent (TI.com)1-0/+2
[ Upstream commit d3fcf276b501a82d4504fd5b1ed40249546530d1 ] The TPS65214 PMIC variant has a LOCK_REG register that prevents writes to nearly all registers when locked. Unlock the registers at probe time and leave them unlocked permanently. This approach is justified because: - Register locking is very uncommon in typical system operation - No code path is expected to lock the registers during runtime - Adding a custom regmap write function would add overhead to every register write, including voltage changes triggered by CPU OPP transitions from the cpufreq governor which could happen quite frequently Cc: stable@vger.kernel.org Fixes: 7947219ab1a2d ("mfd: tps65219: Add support for TI TPS65214 PMIC") Reviewed-by: Andrew Davis <afd@ti.com> Signed-off-by: Kory Maincent (TI.com) <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20251218-fix_tps65219-v5-1-8bb511417f3a@bootlin.com Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04x86/uprobes: Fix XOL allocation failure for 32-bit tasksOleg Nesterov1-0/+1
[ Upstream commit d55c571e4333fac71826e8db3b9753fadfbead6a ] This script #!/usr/bin/bash echo 0 > /proc/sys/kernel/randomize_va_space echo 'void main(void) {}' > TEST.c # -fcf-protection to ensure that the 1st endbr32 insn can't be emulated gcc -m32 -fcf-protection=branch TEST.c -o test bpftrace -e 'uprobe:./test:main {}' -c ./test "hangs", the probed ./test task enters an endless loop. The problem is that with randomize_va_space == 0 get_unmapped_area(TASK_SIZE - PAGE_SIZE) called by xol_add_vma() can not just return the "addr == TASK_SIZE - PAGE_SIZE" hint, this addr is used by the stack vma. arch_get_unmapped_area_topdown() doesn't take TIF_ADDR32 into account and in_32bit_syscall() is false, this leads to info.high_limit > TASK_SIZE. vm_unmapped_area() happily returns the high address > TASK_SIZE and then get_unmapped_area() returns -ENOMEM after the "if (addr > TASK_SIZE - len)" check. handle_swbp() doesn't report this failure (probably it should) and silently restarts the probed insn. Endless loop. I think that the right fix should change the x86 get_unmapped_area() paths to rely on TIF_ADDR32 rather than in_32bit_syscall(). Note also that if CONFIG_X86_X32_ABI=y, in_x32_syscall() falsely returns true in this case because ->orig_ax = -1. But we need a simple fix for -stable, so this patch just sets TS_COMPAT if the probed task is 32-bit to make in_ia32_syscall() true. Fixes: 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()") Reported-by: Paulo Andrade <pandrade@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/aV5uldEvV7pb4RA8@redhat.com/ Cc: stable@vger.kernel.org Link: https://patch.msgid.link/aWO7Fdxn39piQnxu@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI/bwctrl: Disable BW controller on Intel P45 using a quirkIlpo Järvinen1-0/+1
[ Upstream commit 46a9f70e93ef73860d1dbbec75ef840031f8f30a ] The commit 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller") was found to lead to a boot hang on a Intel P45 system. Testing without setting Link Bandwidth Management Interrupt Enable (LBMIE) and Link Autonomous Bandwidth Interrupt Enable (LABIE) (PCIe r7.0, sec 7.5.3.7) in bwctrl allowed system to come up. P45 is a very old chipset and supports only up to gen2 PCIe, so not having bwctrl does not seem a huge deficiency. Add no_bw_notif in struct pci_dev and quirk Intel P45 Root Port with it. Reported-by: Adam Stylinski <kungfujesus06@gmail.com> Link: https://lore.kernel.org/linux-pci/aUCt1tHhm_-XIVvi@eggsbenedict/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Adam Stylinski <kungfujesus06@gmail.com> Link: https://patch.msgid.link/20260116131513.2359-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04ipv4: igmp: annotate data-races around idev->mr_maxdelayEric Dumazet1-1/+1
[ Upstream commit e4faaf65a75f650ac4366ddff5dabb826029ca5a ] idev->mr_maxdelay is read and written locklessly, add READ_ONCE()/WRITE_ONCE() annotations. While we are at it, make this field an u32. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20260122172247.2429403-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Add Intel Nova Lake audio Device IDPeter Ujfalusi1-0/+1
[ Upstream commit b190870e0e0cfb375c0d4da02761c32083f3644d ] Add Nova Lake (NVL) audio Device ID The ID will be used by HDA legacy, SOF audio stack and the driver to determine which audio stack should be used (intel-dsp-config). Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20260120193507.14019-2-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04arm64/ftrace,bpf: Fix partial regs after bpf_prog_runJiri Olsa1-0/+25
[ Upstream commit 276f3b6daf6024ae2742afd161e7418a5584a660 ] Mahe reported issue with bpf_override_return helper not working when executed from kprobe.multi bpf program on arm. The problem is that on arm we use alternate storage for pt_regs object that is passed to bpf_prog_run and if any register is changed (which is the case of bpf_override_return) it's not propagated back to actual pt_regs object. Fixing this by introducing and calling ftrace_partial_regs_update function to propagate the values of changed registers (ip and stack). Reported-by: Mahe Tardy <mahe.tardy@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/bpf/20260112121157.854473-1-jolsa@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04EFI/CPER: don't go past the ARM processor CPER record bufferMauro Carvalho Chehab1-1/+2
[ Upstream commit eae21beecb95a3b69ee5c38a659f774e171d730e ] There's a logic inside GHES/CPER to detect if the section_length is too small, but it doesn't detect if it is too big. Currently, if the firmware receives an ARM processor CPER record stating that a section length is big, kernel will blindly trust section_length, producing a very long dump. For instance, a 67 bytes record with ERR_INFO_NUM set 46198 and section length set to 854918320 would dump a lot of data going a way past the firmware memory-mapped area. Fix it by adding a logic to prevent it to go past the buffer if ERR_INFO_NUM is too big, making it report instead: [Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1 [Hardware Error]: event severity: recoverable [Hardware Error]: Error 0, type: recoverable [Hardware Error]: section_type: ARM processor error [Hardware Error]: MIDR: 0xff304b2f8476870a [Hardware Error]: section length: 854918320, CPER size: 67 [Hardware Error]: section length is too big [Hardware Error]: firmware-generated error record is incorrect [Hardware Error]: ERR_INFO_NUM is 46198 Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> [ rjw: Subject and changelog tweaks ] Link: https://patch.msgid.link/41cd9f6b3ace3cdff7a5e864890849e4b1c58b63.1767871950.git.mchehab+huawei@kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27ata: libata-scsi: avoid Non-NCQ command starvationDamien Le Moal1-0/+3
commit 0ea84089dbf62a92dc7889c79e6b18fc89260808 upstream. When a non-NCQ command is issued while NCQ commands are being executed, ata_scsi_qc_issue() indicates to the SCSI layer that the command issuing should be deferred by returning SCSI_MLQUEUE_XXX_BUSY. This command deferring is correct and as mandated by the ACS specifications since NCQ and non-NCQ commands cannot be mixed. However, in the case of a host adapter using multiple submission queues, when the target device is under a constant load of NCQ commands, there are no guarantees that requeueing the non-NCQ command will be executed later and it may be deferred again repeatedly as other submission queues can constantly issue NCQ commands from different CPUs ahead of the non-NCQ command. This can lead to very long delays for the execution of non-NCQ commands, and even complete starvation for these commands in the worst case scenario. Since the block layer and the SCSI layer do not distinguish between queueable (NCQ) and non queueable (non-NCQ) commands, libata-scsi SAT implementation must ensure forward progress for non-NCQ commands in the presence of NCQ command traffic. This is similar to what SAS HBAs with a hardware/firmware based SAT implementation do. Implement such forward progress guarantee by limiting requeueing of non-NCQ commands from ata_scsi_qc_issue(): when a non-NCQ command is received and NCQ commands are in-flight, do not force a requeue of the non-NCQ command by returning SCSI_MLQUEUE_XXX_BUSY and instead return 0 to indicate that the command was accepted but hold on to the qc using the new deferred_qc field of struct ata_port. This deferred qc will be issued using the work item deferred_qc_work running the function ata_scsi_deferred_qc_work() once all in-flight commands complete, which is checked with the port qc_defer() callback return value indicating that no further delay is necessary. This check is done using the helper function ata_scsi_schedule_deferred_qc() which is called from ata_scsi_qc_complete(). This thus excludes this mechanism from all internal non-NCQ commands issued by ATA EH. When a port deferred_qc is non NULL, that is, the port has a command waiting for the device queue to drain, the issuing of all incoming commands (both NCQ and non-NCQ) is deferred using the regular busy mechanism. This simplifies the code and also avoids potential denial of service problems if a user issues too many non-NCQ commands. Finally, whenever ata EH is scheduled, regardless of the reason, a deferred qc is always requeued so that it can be retried once EH completes. This is done by calling the function ata_scsi_requeue_deferred_qc() from ata_eh_set_pending(). This avoids the need for any special processing for the deferred qc in case of NCQ error, link or device reset, or device timeout. Reported-by: Xingui Yang <yangxingui@huawei.com> Reported-by: Igor Pylypiv <ipylypiv@google.com> Fixes: bdb01301f3ea ("scsi: Add host and host template flag 'host_tagset'") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Tested-by: Igor Pylypiv <ipylypiv@google.com> Tested-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-27net/mlx5: Fix multiport device check over light SFsShay Drory1-2/+2
[ Upstream commit 47bf2e813817159f4d195be83a9b5a640ee6baec ] Driver is using num_vhca_ports capability to distinguish between multiport master device and multiport slave device. num_vhca_ports is a capability the driver sets according to the MAX num_vhca_ports capability reported by FW. On the other hand, light SFs doesn't set the above capbility. This leads to wrong results whenever light SFs is checking whether he is a multiport master or slave. Therefore, use the MAX capability to distinguish between master and slave devices. Fixes: e71383fb9cd1 ("net/mlx5: Light probe local SFs") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27powercap: intel_rapl: Remove incorrect CPU check in PMU contextKuppuswamy Sathyanarayanan1-1/+1
[ Upstream commit 7537bae8b6eb635583e0e6260f61d13ddbd52087 ] The RAPL MSR read path incorrectly validates CPU context when called from the PMU subsystem: if (atomic) { if (unlikely(smp_processor_id() != cpu)) return -EIO; rdmsrq(ra->reg.msr, ra->value); } This check fails for package-scoped MSRs like RAPL energy counters, which are readable from any CPU within the package. The perf tool avoids hitting this check by validating against /sys/bus/event_source/devices/power/cpumask before opening events. However, turbostat does not perform this validation and may attempt reads from non-lead CPUs, causing the check to fail and return zero power values. Since package-scoped MSRs are architecturally accessible from any CPU in the package, remove the CPU matching check. Also rename 'atomic' to 'pmu_ctx' to clarify this indicates PMU context where rdmsrq() can be used directly instead of rdmsrl_safe_on_cpu(). Fixes: 748d6ba43afd ("powercap: intel_rapl: Enable MSR-based RAPL PMU support") Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Furquim Ulisses <ulisses.furquim@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/20260209234310.1440722-2-sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27leds: expresswire: Fix chip state breakageDuje Mihanović1-3/+0
[ Upstream commit f4b830a5371914239756b0599e5dc9d4c328e387 ] It is possible to put the KTD2801 chip in an unknown/undefined state by changing the brightness very rapidly (for example, with a brightness slider). When this happens, the brightness is stuck on max and cannot be changed until the chip is power cycled. Fix this by disabling interrupts while talking to the chip. While at it, make expresswire_power_off() use fsleep() and also unexport some functions meant to be internal. Fixes: 1368d06dd2c9 ("leds: Introduce ExpressWire library") Tested-by: Karel Balej <balejk@matfyz.cz> Signed-off-by: Duje Mihanović <duje@dujemihanovic.xyz> Link: https://patch.msgid.link/20251217-expresswire-fix-v2-1-4a02b10acd96@dujemihanovic.xyz Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27nfsd: do not allow exporting of special kernel filesystemsAmir Goldstein1-0/+9
[ Upstream commit b3c78bc53630d14a5770451ede3a30e7052f3b8b ] pidfs and nsfs recently gained support for encode/decode of file handles via name_to_handle_at(2)/open_by_handle_at(2). These special kernel filesystems have custom ->open() and ->permission() export methods, which nfsd does not respect and it was never meant to be used for exporting those filesystems by nfsd. Therefore, do not allow nfsd to export filesystems with custom ->open() or ->permission() methods. Fixes: b3caba8f7a34a ("pidfs: implement file handle support") Fixes: 5222470b2fbb3 ("nsfs: support file handles") Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://patch.msgid.link/20260129100212.49727-3-amir73il@gmail.com Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27clk: Move clk_{save,restore}_context() to COMMON_CLK sectionGeert Uytterhoeven1-24/+24
[ Upstream commit f47c1b77d0a2a9c0d49ec14302e74f933398d1a3 ] The clk_save_context() and clk_restore_context() helpers are only implemented by the Common Clock Framework. They are not available when using legacy clock frameworks. Dummy implementations are provided, but only if no clock support is available at all. Hence when CONFIG_HAVE_CLK=y, but CONFIG_COMMON_CLK is not enabled: m68k-linux-gnu-ld: drivers/net/phy/air_en8811h.o: in function `en8811h_resume': air_en8811h.c:(.text+0x83e): undefined reference to `clk_restore_context' m68k-linux-gnu-ld: drivers/net/phy/air_en8811h.o: in function `en8811h_suspend': air_en8811h.c:(.text+0x856): undefined reference to `clk_save_context' Fix this by moving forward declarations and dummy implementions from the HAVE_CLK to the COMMON_CLK section. Fixes: 8b95d1ce3300c411 ("clk: Add functions to save/restore clock context en-masse") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202511301553.eaEz1nEW-lkp@intel.com/ Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27Input: adp5589 - remove a leftover header fileVladimir Zapolskiy1-180/+0
[ Upstream commit f8a6e5eac701369afb5d69aba875dc5fec93003d ] In commit 3bdbd0858df6 ("Input: adp5589: remove the driver") the last user of include/linux/input/adp5589.h was removed along with the whole driver, thus the header file can be also removed. Signed-off-by: Vladimir Zapolskiy <vz@mleia.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Nuno Sá <nuno.sa@analog.com> Fixes: 3bdbd0858df6 ("Input: adp5589: remove the driver") Link: https://patch.msgid.link/20260113151140.3843753-1-vz@mleia.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27mtd: spinand: Fix kernel docMiquel Raynal1-1/+1
[ Upstream commit a57b1f07d2d35843a7ada30c8cf9a215c0931868 ] The @data buffer is 5 bytes, not 4, it has been extended for the need of devices with an extra ID bytes. Fixes: 34a956739d29 ("mtd: spinand: Add support for 5-byte IDs") Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27ata: libata: Add ATA_QUIRK_MAX_SEC and convert all device quirksNiklas Cassel2-3/+2
[ Upstream commit 59b7bb3d48333889adb1dd2aac3ab0cf26714390 ] Add a new quirk ATA_QUIRK_MAX_SEC, which has a separate table with device specific values. Convert all existing ATA_QUIRK_MAX_SEC_XXX device quirks in __ata_dev_quirks to the new format. Quirks ATA_QUIRK_MAX_SEC_128 and ATA_QUIRK_MAX_SEC_1024 cannot be removed yet, since they are also used by libata.force, which functionally, is a separate user of the quirks. The quirks will be removed once all users have been converted to use the new format. The quirk ATA_QUIRK_MAX_SEC_8191 can be removed since it has no equivalent libata.force parameter. Signed-off-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Stable-dep-of: 5f64ae1ef639 ("ata: libata-core: Quirk INTEL SSDSC2KG480G8 max_sectors") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27crypto: ccp - Send PSP_CMD_TEE_RING_DESTROY when PSP_CMD_TEE_RING_INIT failsMario Limonciello (AMD)1-0/+1
[ Upstream commit 7b85137caf110a09a4a18f00f730de4709f9afc8 ] The hibernate resume sequence involves loading a resume kernel that is just used for loading the hibernate image before shifting back to the existing kernel. During that hibernate resume sequence the resume kernel may have loaded the ccp driver. If this happens the resume kernel will also have called PSP_CMD_TEE_RING_INIT but it will never have called PSP_CMD_TEE_RING_DESTROY. This is problematic because the existing kernel needs to re-initialize the ring. One could argue that the existing kernel should call destroy as part of restore() but there is no guarantee that the resume kernel did or didn't load the ccp driver. There is also no callback opportunity for the resume kernel to destroy before handing back control to the existing kernel. Similar problems could potentially exist with the use of kdump and crash handling. I actually reproduced this issue like this: 1) rmmod ccp 2) hibernate the system 3) resume the system 4) modprobe ccp The resume kernel will have loaded ccp but never destroyed and then when I try to modprobe it fails. Because of these possible cases add a flow that checks the error code from the PSP_CMD_TEE_RING_INIT call and tries to call PSP_CMD_TEE_RING_DESTROY if it failed. If this succeeds then call PSP_CMD_TEE_RING_INIT again. Fixes: f892a21f51162 ("crypto: ccp - use generic power management") Reported-by: Lars Francke <lars.francke@gmail.com> Closes: https://lore.kernel.org/platform-driver-x86/CAD-Ua_gfJnQSo8ucS_7ZwzuhoBRJ14zXP7s8b-zX3ZcxcyWePw@mail.gmail.com/ Tested-by: Yijun Shen <Yijun.Shen@Dell.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://patch.msgid.link/20260116041132.153674-6-superm1@kernel.org Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27netfilter: nft_counter: fix reset of counters on 32bit archsAnders Grahn1-0/+10
[ Upstream commit 1e13f27e0675552161ab1778be9a23a636dde8a7 ] nft_counter_reset() calls u64_stats_add() with a negative value to reset the counter. This will work on 64bit archs, hence the negative value added will wrap as a 64bit value which then can wrap the stat counter as well. On 32bit archs, the added negative value will wrap as a 32bit value and _not_ wrapping the stat counter properly. In most cases, this would just lead to a very large 32bit value being added to the stat counter. Fix by introducing u64_stats_sub(). Fixes: 4a1d3acd6ea8 ("netfilter: nft_counter: Use u64_stats_t for statistic.") Signed-off-by: Anders Grahn <anders.grahn@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27PCI: endpoint: Add BAR subrange mapping supportKoichiro Den2-0/+27
[ Upstream commit 31fb95400451040050361e22ff480476964280f0 ] Some endpoint platforms have only a small number of usable BARs. At the same time, EPF drivers (e.g. vNTB) may need multiple independent inbound regions (control/scratchpad, one or more memory windows, and optionally MSI or other feature-related regions). Subrange mapping allows these to share a single BAR without consuming additional BARs that may not be available, or forcing a fragile layout by aggressively packing into a single contiguous memory range. Extend the PCI endpoint core to support mapping subranges within a BAR. Add an optional 'submap' field in struct pci_epf_bar so an endpoint function driver can request inbound mappings that fully cover the BAR. Introduce a new EPC feature bit, subrange_mapping, and reject submap requests from pci_epc_set_bar() unless the controller advertises both subrange_mapping and dynamic_inbound_mapping features. The submap array describes the complete BAR layout (no overlaps and no gaps are allowed to avoid exposing untranslated address ranges). This provides the generic infrastructure needed to map multiple logical regions into a single BAR at different offsets, without assuming a controller-specific inbound address translation mechanism. Signed-off-by: Koichiro Den <den@valinux.co.jp> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260124145012.2794108-3-den@valinux.co.jp Stable-dep-of: 72cb5ed2a5c6 ("PCI: dwc: ep: Add per-PF BAR and inbound ATU mapping support") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27PCI: endpoint: Add dynamic_inbound_mapping EPC featureKoichiro Den1-0/+5
[ Upstream commit 06a81c5940e46cc7bddee28f16bdd29a12a76344 ] Introduce a new EPC feature bit (dynamic_inbound_mapping) that indicates whether an Endpoint Controller can update the inbound address translation for a BAR without requiring the EPF driver to clear/reset the BAR first. Endpoint Function drivers (e.g. vNTB) can use this information to decide whether it really is safe to call pci_epc_set_bar() multiple times to update inbound mappings for the BAR. Suggested-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Koichiro Den <den@valinux.co.jp> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20260124145012.2794108-2-den@valinux.co.jp Stable-dep-of: 72cb5ed2a5c6 ("PCI: dwc: ep: Add per-PF BAR and inbound ATU mapping support") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27ipc: don't audit capability check in ipc_permissions()Ondrej Mosnacek1-0/+6
[ Upstream commit 8924336531e21b187d724b5fdf5277269c9ec22c ] The IPC sysctls implement the ctl_table_root::permissions hook and they override the file access mode based on the CAP_CHECKPOINT_RESTORE capability, which is being checked regardless of whether any access is actually denied or not, so if an LSM denies the capability, an audit record may be logged even when access is in fact granted. It wouldn't be viable to restructure the sysctl permission logic to only check the capability when the access would be actually denied if it's not granted. Thus, do the same as in net_ctl_permissions() (net/sysctl_net.c) - switch from ns_capable() to ns_capable_noaudit(), so that the check never emits an audit record. Link: https://lkml.kernel.org/r/20260122141303.241133-1-omosnace@redhat.com Fixes: 0889f44e2810 ("ipc: Check permissions for checkpoint_restart sysctls at open time") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Alexey Gladkov <legion@kernel.org> Acked-by: Serge Hallyn <serge@hallyn.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Paul Moore <paul@paul-moore.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27xdrgen: Initialize data pointer for zero-length itemsChuck Lever1-12/+8
[ Upstream commit 27b0fcae8f535fb882b1876227a935dcfdf576aa ] The xdrgen decoders for strings and opaque data had an optimization that skipped calling xdr_inline_decode() when the item length was zero. This left the data pointer uninitialized, which could lead to unpredictable behavior when callers access it. Remove the zero-length check and always call xdr_inline_decode(). When passed a length of zero, xdr_inline_decode() returns the current buffer position, which is valid and matches the behavior of hand-coded XDR decoders throughout the kernel. Fixes: 4b132aacb076 ("tools: Add xdrgen") Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27kallsyms/ftrace: set module buildid in ftrace_mod_address_lookup()Petr Mladek1-2/+4
[ Upstream commit e8a1e7eaa19d0b757b06a2f913e3eeb4b1c002c6 ] __sprint_symbol() might access an invalid pointer when kallsyms_lookup_buildid() returns a symbol found by ftrace_mod_address_lookup(). The ftrace lookup function must set both @modname and @modbuildid the same way as module_address_lookup(). Link: https://lkml.kernel.org/r/20251128135920.217303-7-pmladek@suse.com Fixes: 9294523e3768 ("module: add printk formats to add module build ID to stacktraces") Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Aaron Tomlin <atomlin@atomlin.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkman <daniel@iogearbox.net> Cc: Daniel Gomez <da.gomez@samsung.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Kees Cook <kees@kernel.org> Cc: Luis Chamberalin <mcgrof@kernel.org> Cc: Marc Rutland <mark.rutland@arm.com> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org> Cc: Petr Pavlu <petr.pavlu@suse.com> Cc: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27module: add helper function for reading module_buildid()Petr Mladek1-0/+9
[ Upstream commit acfdbb4ab2910ff6f03becb569c23ac7b2223913 ] Add a helper function for reading the optional "build_id" member of struct module. It is going to be used also in ftrace_mod_address_lookup(). Use "#ifdef" instead of "#if IS_ENABLED()" to match the declaration of the optional field in struct module. Link: https://lkml.kernel.org/r/20251128135920.217303-4-pmladek@suse.com Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Cc: Aaron Tomlin <atomlin@atomlin.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkman <daniel@iogearbox.net> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Kees Cook <kees@kernel.org> Cc: Luis Chamberalin <mcgrof@kernel.org> Cc: Marc Rutland <mark.rutland@arm.com> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: e8a1e7eaa19d ("kallsyms/ftrace: set module buildid in ftrace_mod_address_lookup()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27kallsyms/bpf: rename __bpf_address_lookup() to bpf_address_lookup()Petr Mladek1-22/+4
[ Upstream commit cd6735896d0343942cf3dafb48ce32eb79341990 ] bpf_address_lookup() has been used only in kallsyms_lookup_buildid(). It was supposed to set @modname and @modbuildid when the symbol was in a module. But it always just cleared @modname because BPF symbols were never in a module. And it did not clear @modbuildid because the pointer was not passed. The wrapper is no longer needed. Both @modname and @modbuildid are now always initialized to NULL in kallsyms_lookup_buildid(). Remove the wrapper and rename __bpf_address_lookup() to bpf_address_lookup() because this variant is used everywhere. [akpm@linux-foundation.org: fix loongarch] Link: https://lkml.kernel.org/r/20251128135920.217303-6-pmladek@suse.com Fixes: 9294523e3768 ("module: add printk formats to add module build ID to stacktraces") Signed-off-by: Petr Mladek <pmladek@suse.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Aaron Tomlin <atomlin@atomlin.com> Cc: Daniel Borkman <daniel@iogearbox.net> Cc: Daniel Gomez <da.gomez@samsung.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Kees Cook <kees@kernel.org> Cc: Luis Chamberalin <mcgrof@kernel.org> Cc: Marc Rutland <mark.rutland@arm.com> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org> Cc: Petr Pavlu <petr.pavlu@suse.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27vsnprintf: drop __printf() attributes on binary printing functionsArnd Bergmann2-3/+2
[ Upstream commit b07829d546c83134629591f02c5348d57cea0c1e ] The printf() format attributes are applied inconsistently for the binary printf helpers, which causes warnings for the bpf_trace code using them from functions that pass down format strings: kernel/trace/bpf_trace.c: In function '____bpf_trace_printk': kernel/trace/bpf_trace.c:377:9: error: function '____bpf_trace_printk' might be a candidate for 'gnu_printf' format attribute [-Werror=suggest-attribute=format] 377 | ret = bstr_printf(data.buf, MAX_BPRINTF_BUF, fmt, data.bin_args); | ^~~ This can be addressed either by annotating all five callers in bpf code, or by removing the annotations on the callees that were added by Andy Shevchenko last year. As Alexei Starovoitov points out, there are no callers in C code that would benefit from the __printf attributes, the only users are in BPF code or in the do_trace_printk() helper that already checks the arguments. Drop all three of these annotations, reverting the earlierl commits that added these, in order to get a clean build with -Wsuggest-attribute=format. Fixes: 6b2c1e30ad68 ("seq_file: Mark binary printing functions with __printf() attribute") Fixes: 7bf819aa992f ("vsnprintf: Mark binary printing functions with __printf() attribute") Link: https://lore.kernel.org/all/CAADnVQK3eZp3yp35OUx8j1UBsQFhgsn5-4VReqAJ=68PaaKYmg@mail.gmail.com/ Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202512061640.9hKTnB8p-lkp@intel.com/ Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Acked-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Petr Mladek <pmladek@suse.com> Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patch.msgid.link/20260204132643.1302967-1-arnd@kernel.org Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27soc: qcom: ubwc: add missing includeDmitry Baryshkov1-0/+1
[ Upstream commit ccef4b2703ff5b0de0b1bda30a0de3026d52eb19 ] The header has a function which calls pr_err(). Don't require users of the header to include <linux/printk.h> and include it here. Fixes: 87cfc79dcd60 ("drm/msm/a6xx: Resolve the meaning of UBWC_MODE") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Link: https://lore.kernel.org/r/20260110-iris-ubwc-v1-1-dd70494dcd7b@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27hwrng: core - use RCU and work_struct to fix race conditionLianjie Wang1-0/+2
[ Upstream commit cc2f39d6ac48e6e3cb2d6240bc0d6df839dd0828 ] Currently, hwrng_fill is not cleared until the hwrng_fillfn() thread exits. Since hwrng_unregister() reads hwrng_fill outside the rng_mutex lock, a concurrent hwrng_unregister() may call kthread_stop() again on the same task. Additionally, if hwrng_unregister() is called immediately after hwrng_register(), the stopped thread may have never been executed. Thus, hwrng_fill remains dirty even after hwrng_unregister() returns. In this case, subsequent calls to hwrng_register() will fail to start new threads, and hwrng_unregister() will call kthread_stop() on the same freed task. In both cases, a use-after-free occurs: refcount_t: addition on 0; use-after-free. WARNING: ... at lib/refcount.c:25 refcount_warn_saturate+0xec/0x1c0 Call Trace: kthread_stop+0x181/0x360 hwrng_unregister+0x288/0x380 virtrng_remove+0xe3/0x200 This patch fixes the race by protecting the global hwrng_fill pointer inside the rng_mutex lock, so that hwrng_fillfn() thread is stopped only once, and calls to kthread_run() and kthread_stop() are serialized with the lock held. To avoid deadlock in hwrng_fillfn() while being stopped with the lock held, we convert current_rng to RCU, so that get_current_rng() can read current_rng without holding the lock. To remove the lock from put_rng(), we also delay the actual cleanup into a work_struct. Since get_current_rng() no longer returns ERR_PTR values, the IS_ERR() checks are removed from its callers. With hwrng_fill protected by the rng_mutex lock, hwrng_fillfn() can no longer clear hwrng_fill itself. Therefore, if hwrng_fillfn() returns directly after current_rng is dropped, kthread_stop() would be called on a freed task_struct later. To fix this, hwrng_fillfn() calls schedule() now to keep the task alive until being stopped. The kthread_stop() call is also moved from hwrng_unregister() to drop_current_rng(), ensuring kthread_stop() is called on all possible paths where current_rng becomes NULL, so that the thread would not wait forever. Fixes: be4000bc4644 ("hwrng: create filler thread") Suggested-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Lianjie Wang <karin0.zst@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27mfd: wm8350-core: Use IRQF_ONESHOTSebastian Andrzej Siewior1-1/+1
[ Upstream commit 553b4999cbe231b5011cb8db05a3092dec168aca ] Using a threaded interrupt without a dedicated primary handler mandates the IRQF_ONESHOT flag to mask the interrupt source while the threaded handler is active. Otherwise the interrupt can fire again before the threaded handler had a chance to run. Mark explained that this should not happen with this hardware since it is a slow irqchip which is behind an I2C/ SPI bus but the IRQ-core will refuse to accept such a handler. Set IRQF_ONESHOT so the interrupt source is masked until the secondary handler is done. Fixes: 1c6c69525b40e ("genirq: Reject bogus threaded irq requests") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patch.msgid.link/20260128095540.863589-16-bigeasy@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27genirq: Set IRQF_COND_ONESHOT in devm_request_irq().Sebastian Andrzej Siewior1-1/+1
[ Upstream commit 943b052ded21feb84f293d40b06af3181cd0d0d7 ] The flag IRQF_COND_ONESHOT was already force-added to request_irq() because the ACPI SCI interrupt handler is using the IRQF_ONESHOT flag which breaks all shared handlers. devm_request_irq() needs the same change since some users, such as int0002_vgpio, are using this function instead. Add IRQF_COND_ONESHOT to the flags passed to devm_request_irq(). Fixes: c37927a203fa2 ("genirq: Set IRQF_COND_ONESHOT in request_irq()") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260128095540.863589-2-bigeasy@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27ftrace,bpf: Remove FTRACE_OPS_FL_JMP ftrace_ops flagJiri Olsa1-1/+0
[ Upstream commit 4be42c92220128b3128854a2d6b0f0ad0bcedbdb ] At the moment the we allow the jmp attach only for ftrace_ops that has FTRACE_OPS_FL_JMP set. This conflicts with following changes where we use single ftrace_ops object for all direct call sites, so all could be be attached via just call or jmp. We already limit the jmp attach support with config option and bit (LSB) set on the trampoline address. It turns out that's actually enough to limit the jmp attach for architecture and only for chosen addresses (with LSB bit set). Each user of register_ftrace_direct or modify_ftrace_direct can set the trampoline bit (LSB) to indicate it has to be attached by jmp. The bpf trampoline generation code uses trampoline flags to generate jmp-attach specific code and ftrace inner code uses the trampoline bit (LSB) to handle return from jmp attachment, so there's no harm to remove the FTRACE_OPS_FL_JMP bit. The fexit/fmodret performance stays the same (did not drop), current code: fentry : 77.904 ± 0.546M/s fexit : 62.430 ± 0.554M/s fmodret : 66.503 ± 0.902M/s with this change: fentry : 80.472 ± 0.061M/s fexit : 63.995 ± 0.127M/s fmodret : 67.362 ± 0.175M/s Fixes: 25e4e3565d45 ("ftrace: Introduce FTRACE_OPS_FL_JMP") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/bpf/20251230145010.103439-2-jolsa@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27seqlock: fix scoped_seqlock_read kernel-docRandy Dunlap1-9/+8
[ Upstream commit f88a31308db6a856229150039b0f56d59696ed31 ] Eliminate all kernel-doc warnings in seqlock.h: - correct the macro to have "()" immediately following the macro name - don't include the macro parameters in the short description (first line) - make the parameter names in the comments match the actual macro parameter names. - use "::" for the Example WARNING: include/linux/seqlock.h:1341 This comment starts with '/**', but isn't a kernel-doc comment. * scoped_seqlock_read (lock, ss_state) - execute the read side critical Documentation/locking/seqlock:242: include/linux/seqlock.h:1351: WARNING: Definition list ends without a blank line; unexpected unindent. [docutils] Warning: include/linux/seqlock.h:1357 function parameter '_seqlock' not described in 'scoped_seqlock_read' Warning: include/linux/seqlock.h:1357 function parameter '_target' not described in 'scoped_seqlock_read' Fixes: cc39f3872c08 ("seqlock: Introduce scoped_seqlock_read()") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260123183749.3997533-1-rdunlap@infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27bpf: Fix tcx/netkit detach permissions when prog fd isn't givenGuillaume Gonnet2-0/+15
[ Upstream commit ae23bc81ddf7c17b663c4ed1b21e35527b0a7131 ] This commit fixes a security issue where BPF_PROG_DETACH on tcx or netkit devices could be executed by any user when no program fd was provided, bypassing permission checks. The fix adds a capability check for CAP_NET_ADMIN or CAP_SYS_ADMIN in this case. Fixes: e420bed02507 ("bpf: Add fd-based tcx multi-prog infra with link support") Signed-off-by: Guillaume Gonnet <ggonnet.linux@gmail.com> Link: https://lore.kernel.org/r/20260127160200.10395-1-ggonnet.linux@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27bpf, sockmap: Fix FIONREAD for sockmapJiayuan Chen1-2/+66
[ Upstream commit 929e30f9312514902133c45e51c79088421ab084 ] A socket using sockmap has its own independent receive queue: ingress_msg. This queue may contain data from its own protocol stack or from other sockets. Therefore, for sockmap, relying solely on copied_seq and rcv_nxt to calculate FIONREAD is not enough. This patch adds a new msg_tot_len field in the psock structure to record the data length in ingress_msg. Additionally, we implement new ioctl interfaces for TCP and UDP to intercept FIONREAD operations. Note that we intentionally do not include sk_receive_queue data in the FIONREAD result. Data in sk_receive_queue has not yet been processed by the BPF verdict program, and may be redirected to other sockets or dropped. Including it would create semantic ambiguity since this data may never be readable by the user. Unix and VSOCK sockets have similar issues, but fixing them is outside the scope of this patch as it would require more intrusive changes. Previous work by John Fastabend made some efforts towards FIONREAD support: commit e5c6de5fa025 ("bpf, sockmap: Incorrectly handling copied_seq") Although the current patch is based on the previous work by John Fastabend, it is acceptable for our Fixes tag to point to the same commit. FD1:read() -- FD1->copied_seq++ | [read data] | [enqueue data] v [sockmap] -> ingress to self -> ingress_msg queue FD1 native stack ------> ^ -- FD1->rcv_nxt++ -> redirect to other | [enqueue data] | | | ingress to FD1 v ^ ... | [sockmap] FD2 native stack Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()") Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20260124113314.113584-3-jiayuan.chen@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27bpf, sockmap: Fix incorrect copied_seq calculationJiayuan Chen1-0/+2
[ Upstream commit b40cc5adaa80e1471095a62d78233b611d7a558c ] A socket using sockmap has its own independent receive queue: ingress_msg. This queue may contain data from its own protocol stack or from other sockets. The issue is that when reading from ingress_msg, we update tp->copied_seq by default. However, if the data is not from its own protocol stack, tcp->rcv_nxt is not increased. Later, if we convert this socket to a native socket, reading from this socket may fail because copied_seq might be significantly larger than rcv_nxt. This fix also addresses the syzkaller-reported bug referenced in the Closes tag. This patch marks the skmsg objects in ingress_msg. When reading, we update copied_seq only if the data is from its own protocol stack. FD1:read() -- FD1->copied_seq++ | [read data] | [enqueue data] v [sockmap] -> ingress to self -> ingress_msg queue FD1 native stack ------> ^ -- FD1->rcv_nxt++ -> redirect to other | [enqueue data] | | | ingress to FD1 v ^ ... | [sockmap] FD2 native stack Closes: https://syzkaller.appspot.com/bug?extid=06dbd397158ec0ea4983 Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()") Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Link: https://lore.kernel.org/r/20260124113314.113584-2-jiayuan.chen@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27crypto: hisilicon - consolidate qp creation and start in hisi_qm_alloc_qps_nodeChenghai Huang1-1/+0
[ Upstream commit 72f3bbebff15e87171271d643ee2672fb8e92031 ] Consolidate the creation and start of qp into the function hisi_qm_alloc_qps_node. This change eliminates the need for each module to perform these steps in two separate phases (creation and start). Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Weili Qian <qianweili@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Stable-dep-of: 6aff4d977e2d ("crypto: hisilicon/hpre - support the hpre algorithm fallback") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27crypto: hisilicon/qm - centralize the sending locks of each module into qmChenghai Huang1-0/+1
[ Upstream commit 8cd9b608ee8dea78cac3f373bd5e3b3de2755d46 ] When a single queue used by multiple tfms, the protection of shared resources by individual module driver programs is no longer sufficient. The hisi_qp_send needs to be ensured by the lock in qp. Fixes: 5fdb4b345cfb ("crypto: hisilicon - add a lock for the qp send operation") Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Weili Qian <qianweili@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27crypto: hisilicon/qm - enhance the configuration of req_type in queue attributesChenghai Huang1-2/+1
[ Upstream commit 21452eaa06edb5f6038720e643aed0bbfffad9c3 ] Originally, when a queue was requested, it could only be configured with the default algorithm type of 0. Now, when multiple tfms use the same queue, the queue must be selected based on its attributes to meet the requirements of tfm tasks. So the algorithm type attribute of queue need to be distinguished. Just like a queue used for compression in ZIP cannot be used for decompression tasks. Fixes: 3f1ec97aacf1 ("crypto: hisilicon/qm - Put device finding logic into QM") Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Weili Qian <qianweili@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27crypto: hisilicon/sec - move backlog management to qp and store sqe in qp ↵Chenghai Huang1-0/+8
for callback [ Upstream commit 08eb67d23e5172a5d1e60f1f0acccee569fe10ba ] When multiple tfm use a same qp, the backlog data should be managed centrally by the qp, rather than in the qp_ctx of each req. Additionally, since SEC_BD_TYPE1 and SEC_BD_TYPE2 cannot use the tag of the sqe to carry the virtual address of the req, the sent sqe is stored in the qp. This allows the callback function to get the req address. To handle the differences between hardware types, the callback functions are split into two separate implementations. Fixes: f0ae287c5045 ("crypto: hisilicon/sec2 - implement full backlog mode for sec") Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Weili Qian <qianweili@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27io_uring/eventfd: remove unused ctx->evfd_last_cq_tail memberJens Axboe1-4/+3
[ Upstream commit 07f3c3a1cd56c2048a92dad0c11f15e4ac3888c1 ] A previous commit got rid of any use of this member, but forgot to remove it. Kill it. Fixes: f4bb2f65bb81 ("io_uring/eventfd: move ctx->evfd_last_cq_tail into io_ev_fd") Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27device_cgroup: remove branch hint after code refactorBreno Leitao1-1/+1
[ Upstream commit 6784f274722559c0cdaaa418bc8b7b1d61c314f9 ] commit 4ef4ac360101 ("device_cgroup: avoid access to ->i_rdev in the common case in devcgroup_inode_permission()") reordered the checks in devcgroup_inode_permission() to check the inode mode before checking i_rdev, for better cache behavior. However, the likely() annotation on the i_rdev check was not updated to reflect the new code flow. Originally, when i_rdev was checked first, likely(!inode->i_rdev) made sense because most inodes were(?) regular files/directories, thus i_rdev == 0. After the reorder, by the time we reach the i_rdev check, we have already confirmed the inode IS a block or character device. Block and character special files are precisely defined by having a device number (i_rdev), so !inode->i_rdev is now the rare edge case, not the common case. Branch profiling confirmed this is 100% mispredicted: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 2631904 100 devcgroup_inode_permission device_cgroup.h 24 Remove likely() to avoid giving the wrong hint to the CPU. Fixes: 4ef4ac360101 ("device_cgroup: avoid access to ->i_rdev in the common case in devcgroup_inode_permission()") Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260107-likely_device-v1-1-0c55f83a7e47@debian.org Reviewed-by: Mateusz Guzik <mjguzik@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27audit: move the compat_xxx_class[] extern declarations to audit_arch.hBen Dooks2-6/+7
[ Upstream commit 76489955c6d4a065ca69dc88faf7a50a59b66f35 ] The comapt_xxx_class symbols aren't declared in anything that lib/comapt_audit.c is including (arm64 build) which is causing the following sparse warnings: lib/compat_audit.c:7:10: warning: symbol 'compat_dir_class' was not declared. Should it be static? lib/compat_audit.c:12:10: warning: symbol 'compat_read_class' was not declared. Should it be static? lib/compat_audit.c:17:10: warning: symbol 'compat_write_class' was not declared. Should it be static? lib/compat_audit.c:22:10: warning: symbol 'compat_chattr_class' was not declared. Should it be static? lib/compat_audit.c:27:10: warning: symbol 'compat_signal_class' was not declared. Should it be static? Trying to fix this by chaning compat_audit.c to inclde <linux/audit.h> does not work on arm64 due to compile errors with the extra includes that changing this header makes. The simpler thing would be just to move the definitons of these symbols out of <linux/audit.h> into <linux/audit_arch.h> which is included. Fixes: 4b58841149dca ("audit: Add generic compat syscall support") Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> [PM: rewrite subject line, fixed line length in description] Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>