summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2025-08-15bpf: Move bpf map owner out of common structDaniel Borkmann1-12/+24
[ Upstream commit fd1c98f0ef5cbcec842209776505d9e70d8fcd53 ] Given this is only relevant for BPF tail call maps, it is adding up space and penalizing other map types. We also need to extend this with further objects to track / compare to. Therefore, lets move this out into a separate structure and dynamically allocate it only for BPF tail call maps. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20250730234733.530041-2-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org> Stable-dep-of: abad3d0bad72 ("bpf: Fix oob access in cgroup local storage") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15bpf: Add cookie object to bpf mapsDaniel Borkmann1-0/+1
[ Upstream commit 12df58ad294253ac1d8df0c9bb9cf726397a671d ] Add a cookie to BPF maps to uniquely identify BPF maps for the timespan when the node is up. This is different to comparing a pointer or BPF map id which could get rolled over and reused. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20250730234733.530041-1-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org> Stable-dep-of: abad3d0bad72 ("bpf: Fix oob access in cgroup local storage") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15proc: use the same treatment to check proc_lseek as ones for proc_read_iter ↵wangzijie1-0/+1
et.al [ Upstream commit ff7ec8dc1b646296f8d94c39339e8d3833d16c05 ] Check pde->proc_ops->proc_lseek directly may cause UAF in rmmod scenario. It's a gap in proc_reg_open() after commit 654b33ada4ab("proc: fix UAF in proc_get_inode()"). Followed by AI Viro's suggestion, fix it in same manner. Link: https://lkml.kernel.org/r/20250607021353.1127963-1-wangzijie1@honor.com Fixes: 3f61631d47f1 ("take care to handle NULL ->proc_lseek()") Signed-off-by: wangzijie <wangzijie1@honor.com> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15crypto: ahash - Add support for drivers with no fallbackHerbert Xu1-0/+3
[ Upstream commit 4ccd065a69df163cd9fe0dd8e0f609f1eeb4723d ] Some drivers cannot have a fallback, e.g., because the key is held in hardware. Allow these to be used with ahash by adding the bit CRYPTO_ALG_NO_FALLBACK. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Harald Freudenberger <freude@linux.ibm.com> Stable-dep-of: 1e2b7fcd3f07 ("crypto: ahash - Stop legacy tfms from using the set_virt fallback path") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15padata: Fix pd UAF once and for allHerbert Xu1-3/+0
[ Upstream commit 71203f68c7749609d7fc8ae6ad054bdedeb24f91 ] There is a race condition/UAF in padata_reorder that goes back to the initial commit. A reference count is taken at the start of the process in padata_do_parallel, and released at the end in padata_serial_worker. This reference count is (and only is) required for padata_replace to function correctly. If padata_replace is never called then there is no issue. In the function padata_reorder which serves as the core of padata, as soon as padata is added to queue->serial.list, and the associated spin lock released, that padata may be processed and the reference count on pd would go away. Fix this by getting the next padata before the squeue->serial lock is released. In order to make this possible, simplify padata_reorder by only calling it once the next padata arrives. Fixes: 16295bec6398 ("padata: Generic parallelization/serialization interface") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15fortify: Fix incorrect reporting of read buffer sizeKees Cook1-1/+1
[ Upstream commit 94fd44648dae2a5b6149a41faa0b07928c3e1963 ] When FORTIFY_SOURCE reports about a run-time buffer overread, the wrong buffer size was being shown in the error message. (The bounds checking was correct.) Fixes: 3d965b33e40d ("fortify: Improve buffer overflow reporting") Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20250729231817.work.023-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15ring-buffer: Remove ring_buffer_read_prepare_sync()Steven Rostedt1-3/+1
[ Upstream commit 119a5d573622ae90ba730d18acfae9bb75d77b9a ] When the ring buffer was first introduced, reading the non-consuming "trace" file required disabling the writing of the ring buffer. To make sure the writing was fully disabled before iterating the buffer with a non-consuming read, it would set the disable flag of the buffer and then call an RCU synchronization to make sure all the buffers were synchronized. The function ring_buffer_read_start() originally would initialize the iterator and call an RCU synchronization, but this was for each individual per CPU buffer where this would get called many times on a machine with many CPUs before the trace file could be read. The commit 72c9ddfd4c5bf ("ring-buffer: Make non-consuming read less expensive with lots of cpus.") separated ring_buffer_read_start into ring_buffer_read_prepare(), ring_buffer_read_sync() and then ring_buffer_read_start() to allow each of the per CPU buffers to be prepared, call the read_buffer_read_sync() once, and then the ring_buffer_read_start() for each of the CPUs which made things much faster. The commit 1039221cc278 ("ring-buffer: Do not disable recording when there is an iterator") removed the requirement of disabling the recording of the ring buffer in order to iterate it, but it did not remove the synchronization that was happening that was required to wait for all the buffers to have no more writers. It's now OK for the buffers to have writers and no synchronization is needed. Remove the synchronization and put back the interface for the ring buffer iterator back before commit 72c9ddfd4c5bf was applied. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250630180440.3eabb514@batman.local.home Reported-by: David Howells <dhowells@redhat.com> Fixes: 1039221cc278 ("ring-buffer: Do not disable recording when there is an iterator") Tested-by: David Howells <dhowells@redhat.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15RDMA/mlx5: Fix UMR modifying of mkey page sizeEdward Srouji1-0/+1
[ Upstream commit c4f96972c3c206ac8f6770b5ecd5320b561d0058 ] When changing the page size on an mkey, the driver needs to set the appropriate bits in the mkey mask to indicate which fields are being modified. The 6th bit of a page size in mlx5 driver is considered an extension, and this bit has a dedicated capability and mask bits. Previously, the driver was not setting this mask in the mkey mask when performing page size changes, regardless of its hardware support, potentially leading to an incorrect page size updates. This fixes the issue by setting the relevant bit in the mkey mask when performing page size changes on an mkey and the 6th bit of this field is supported by the hardware. Fixes: cef7dde8836a ("net/mlx5: Expand mkey page size to support 6 bits") Signed-off-by: Edward Srouji <edwards@nvidia.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://patch.msgid.link/9f43a9c73bf2db6085a99dc836f7137e76579f09.1751979184.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15sched/psi: Optimize psi_group_change() cpu_clock() usagePeter Zijlstra1-4/+2
[ Upstream commit 570c8efd5eb79c3725ba439ce105ed1bedc5acd9 ] Dietmar reported that commit 3840cbe24cf0 ("sched: psi: fix bogus pressure spikes from aggregation race") caused a regression for him on a high context switch rate benchmark (schbench) due to the now repeating cpu_clock() calls. In particular the problem is that get_recent_times() will extrapolate the current state to 'now'. But if an update uses a timestamp from before the start of the update, it is possible to get two reads with inconsistent results. It is effectively back-dating an update. (note that this all hard-relies on the clock being synchronized across CPUs -- if this is not the case, all bets are off). Combine this problem with the fact that there are per-group-per-cpu seqcounts, the commit in question pushed the clock read into the group iteration, causing tree-depth cpu_clock() calls. On architectures where cpu_clock() has appreciable overhead, this hurts. Instead move to a per-cpu seqcount, which allows us to have a single clock read for all group updates, increasing internal consistency and lowering update overhead. This comes at the cost of a longer update side (proportional to the tree depth) which can cause the read side to retry more often. Fixes: 3840cbe24cf0 ("sched: psi: fix bogus pressure spikes from aggregation race") Reported-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>, Link: https://lkml.kernel.org/20250522084844.GC31726@noisy.programming.kicks-ass.net Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15team: replace team lock with rtnl lockStanislav Fomichev1-3/+0
[ Upstream commit bfb4fb77f9a8ce33ce357224569eae5564eec573 ] syszbot reports various ordering issues for lower instance locks and team lock. Switch to using rtnl lock for protecting team device, similar to bonding. Based on the patch by Tetsuo Handa. Cc: Jiri Pirko <jiri@resnulli.us> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04 Reported-by: syzbot+71fd22ae4b81631e22fd@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=71fd22ae4b81631e22fd Fixes: 6b1d3c5f675c ("team: grab team lock during team_change_rx_flags") Link: https://lkml.kernel.org/r/ZoZ2RH9BcahEB9Sb@nanopsycho.orion Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250623153147.3413631-1-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15pps: fix poll supportDenis OSTERLAND-HEIM1-0/+1
[ Upstream commit 12c409aa1ec2592280a2ddcc66ff8f3c7f7bb171 ] Because pps_cdev_poll() returns unconditionally EPOLLIN, a user space program that calls select/poll get always an immediate data ready-to-read response. As a result the intended use to wait until next data becomes ready does not work. User space snippet: struct pollfd pollfd = { .fd = open("/dev/pps0", O_RDONLY), .events = POLLIN|POLLERR, .revents = 0 }; while(1) { poll(&pollfd, 1, 2000/*ms*/); // returns immediate, but should wait if(revents & EPOLLIN) { // always true struct pps_fdata fdata; memset(&fdata, 0, sizeof(memdata)); ioctl(PPS_FETCH, &fdata); // currently fetches data at max speed } } Lets remember the last fetch event counter and compare this value in pps_cdev_poll() with most recent event counter and return 0 if they are equal. Signed-off-by: Denis OSTERLAND-HEIM <denis.osterland@diehl.com> Co-developed-by: Rodolfo Giometti <giometti@enneenne.com> Signed-off-by: Rodolfo Giometti <giometti@enneenne.com> Fixes: eae9d2ba0cfc ("LinuxPPS: core support") Link: https://lore.kernel.org/all/f6bed779-6d59-4f0f-8a59-b6312bd83b4e@enneenne.com/ Acked-by: Rodolfo Giometti <giometti@enneenne.com> Link: https://lore.kernel.org/r/c3c50ad1eb19ef553eca8a57c17f4c006413ab70.camel@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15soc: qcom: fix endianness for QMI headerAlexander Wilhelm1-3/+3
[ Upstream commit 07a4688833b237331e5045f90fc546c085b28c86 ] The members of QMI header have to be swapped on big endian platforms. Use __le16 types instead of u16 ones. Signed-off-by: Alexander Wilhelm <alexander.wilhelm@westermo.com> Fixes: 9b8a11e82615 ("soc: qcom: Introduce QMI encoder/decoder") Fixes: 3830d0771ef6 ("soc: qcom: Introduce QMI helpers") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250522143530.3623809-3-alexander.wilhelm@westermo.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15sched/task_stack: Add missing const qualifier to end_of_stack()Kees Cook1-1/+1
[ Upstream commit 32e42ab9fc88a884435c27527a433f61c4d2b61b ] Add missing const qualifier to the non-CONFIG_THREAD_INFO_IN_TASK version of end_of_stack() to match the CONFIG_THREAD_INFO_IN_TASK version. Fixes a warning with CONFIG_KSTACK_ERASE=y on archs that don't select THREAD_INFO_IN_TASK (such as LoongArch): error: passing 'const struct task_struct *' to parameter of type 'struct task_struct *' discards qualifiers The stackleak_task_low_bound() function correctly uses a const task parameter, but the legacy end_of_stack() prototype didn't like that. Build tested on loongarch (with CONFIG_KSTACK_ERASE=y) and m68k (with CONFIG_DEBUG_STACK_USAGE=y). Fixes: a45728fd4120 ("LoongArch: Enable HAVE_ARCH_STACKLEAK") Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://lore.kernel.org/all/20250726004313.GA3650901@ax162 Cc: Youling Tang <tangyouling@kylinos.cn> Cc: Huacai Chen <chenhuacai@loongson.cn> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15fs_context: fix parameter name in infofc() macroRubenKelevra1-1/+1
[ Upstream commit ffaf1bf3737f706e4e9be876de4bc3c8fc578091 ] The macro takes a parameter called "p" but references "fc" internally. This happens to compile as long as callers pass a variable named fc, but breaks otherwise. Rename the first parameter to “fc” to match the usage and to be consistent with warnfc() / errorfc(). Fixes: a3ff937b33d9 ("prefix-handling analogues of errorf() and friends") Signed-off-by: RubenKelevra <rubenkelevra@gmail.com> Link: https://lore.kernel.org/20250617230927.1790401-1-rubenkelevra@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-15audit,module: restore audit logging in load failure caseRichard Guy Briggs1-5/+4
[ Upstream commit ae1ae11fb277f1335d6bcd4935ba0ea985af3c32 ] The move of the module sanity check to earlier skipped the audit logging call in the case of failure and to a place where the previously used context is unavailable. Add an audit logging call for the module loading failure case and get the module name when possible. Link: https://issues.redhat.com/browse/RHEL-52839 Fixes: 02da2cbab452 ("module: move check_modinfo() early to early_mod_check()") Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-25Merge tag 'mm-hotfixes-stable-2025-07-24-18-03' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "11 hotfixes. 9 are cc:stable and the remainder address post-6.15 issues or aren't considered necessary for -stable kernels. 7 are for MM" * tag 'mm-hotfixes-stable-2025-07-24-18-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: sprintf.h requires stdarg.h resource: fix false warning in __request_region() mm/damon/core: commit damos_quota_goal->nid kasan: use vmalloc_dump_obj() for vmalloc error reports mm/ksm: fix -Wsometimes-uninitialized from clang-21 in advisor_mode_show() mm: update MAINTAINERS entry for HMM nilfs2: reject invalid file types when reading inodes selftests/mm: fix split_huge_page_test for folio_split() tests mailmap: add entry for Senozhatsky mm/zsmalloc: do not pass __GFP_MOVABLE if CONFIG_COMPACTION=n mm/vmscan: fix hwpoisoned large folio handling in shrink_folio_list
2025-07-25sprintf.h requires stdarg.hStephen Rothwell1-0/+1
In file included from drivers/crypto/intel/qat/qat_common/adf_pm_dbgfs_utils.c:4: include/linux/sprintf.h:11:54: error: unknown type name 'va_list' 11 | __printf(2, 0) int vsprintf(char *buf, const char *, va_list); | ^~~~~~~ include/linux/sprintf.h:1:1: note: 'va_list' is defined in header '<stdarg.h>'; this is probably fixable by adding '#include <stdarg.h>' Link: https://lkml.kernel.org/r/20250721173754.42865913@canb.auug.org.au Fixes: 39ced19b9e60 ("lib/vsprintf: split out sprintf() and friends") Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Andriy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-24Merge tag 'arm64-fixes' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "Two important arm64 fixes ahead of the 6.16 release. The first fixes a regression introduced during the merge window where the KVM UUID (which is used to advertise KVM-specific hypercalls for things like time synchronisation in the guest) was corrupted thanks to an endianness bug introduced when converting the code to use the UUID_INIT() helper. The second fixes a stack-pointer corruption issue during context-switch which has been observed in the wild when taking a pseudo-NMI with shadow call stack enabled. Summary: - Fix broken UUID value for the KVM/arm64 hypervisor SMCCC interface - Fix stack corruption on context-switch, primarily seen on (but not limited to) configurations with both pNMI and SCS enabled" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/entry: Mask DAIF in cpu_switch_to(), call_on_irq_stack() arm64: kvm, smccc: Fix vendor uuid
2025-07-24Merge tag 'net-6.16-rc8' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from can and xfrm. The TI regression notified last week is actually on our net-next tree, it does not affect 6.16. We are investigating a virtio regression which is quite hard to reproduce - currently only our CI sporadically hits it. Hopefully it should not be critical, and I'm not sure that an additional week would be enough to solve it. Current release - fix to a fix: - sched: sch_qfq: avoid sleeping in atomic context in qfq_delete_class Previous releases - regressions: - xfrm: - set transport header to fix UDP GRO handling - delete x->tunnel as we delete x - eth: - mlx5: fix memory leak in cmd_exec() - i40e: when removing VF MAC filters, avoid losing PF-set MAC - gve: fix stuck TX queue for DQ queue format Previous releases - always broken: - can: fix NULL pointer deref of struct can_priv::do_set_mode - eth: - ice: fix a null pointer dereference in ice_copy_and_init_pkg() - ism: fix concurrency management in ism_cmd() - dpaa2: fix device reference count leak in MAC endpoint handling - icssg-prueth: fix buffer allocation for ICSSG Misc: - selftests: mptcp: increase code coverage" * tag 'net-6.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits) net: hns3: default enable tx bounce buffer when smmu enabled net: hns3: fixed vf get max channels bug net: hns3: disable interrupt when ptp init failed net: hns3: fix concurrent setting vlan filter issue s390/ism: fix concurrency management in ism_cmd() selftests: drv-net: wait for iperf client to stop sending MAINTAINERS: Add in6.h to MAINTAINERS selftests: netfilter: tone-down conntrack clash test can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode net/sched: sch_qfq: Avoid triggering might_sleep in atomic context in qfq_delete_class gve: Fix stuck TX queue for DQ queue format net: appletalk: Fix use-after-free in AARP proxy probe net: bcmasp: Restore programming of TX map vector register selftests: mptcp: connect: also cover checksum selftests: mptcp: connect: also cover alt modes e1000e: ignore uninitialized checksum word on tgp e1000e: disregard NVM checksum on tgp when valid checksum bit is not set ice: Fix a null pointer dereference in ice_copy_and_init_pkg() i40e: When removing VF MAC filters, only check PF-set MAC i40e: report VF tx_dropped with tx_errors instead of tx_discards ...
2025-07-24s390/ism: fix concurrency management in ism_cmd()Halil Pasic1-0/+1
The s390x ISM device data sheet clearly states that only one request-response sequence is allowable per ISM function at any point in time. Unfortunately as of today the s390/ism driver in Linux does not honor that requirement. This patch aims to rectify that. This problem was discovered based on Aliaksei's bug report which states that for certain workloads the ISM functions end up entering error state (with PEC 2 as seen from the logs) after a while and as a consequence connections handled by the respective function break, and for future connection requests the ISM device is not considered -- given it is in a dysfunctional state. During further debugging PEC 3A was observed as well. A kernel message like [ 1211.244319] zpci: 061a:00:00.0: Event 0x2 reports an error for PCI function 0x61a is a reliable indicator of the stated function entering error state with PEC 2. Let me also point out that a kernel message like [ 1211.244325] zpci: 061a:00:00.0: The ism driver bound to the device does not support error recovery is a reliable indicator that the ISM function won't be auto-recovered because the ISM driver currently lacks support for it. On a technical level, without this synchronization, commands (inputs to the FW) may be partially or fully overwritten (corrupted) by another CPU trying to issue commands on the same function. There is hard evidence that this can lead to DMB token values being used as DMB IOVAs, leading to PEC 2 PCI events indicating invalid DMA. But this is only one of the failure modes imaginable. In theory even completely losing one command and executing another one twice and then trying to interpret the outputs as if the command we intended to execute was actually executed and not the other one is also possible. Frankly, I don't feel confident about providing an exhaustive list of possible consequences. Fixes: 684b89bc39ce ("s390/ism: add device driver for internal shared memory") Reported-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com> Tested-by: Mahanta Jambigi <mjambigi@linux.ibm.com> Tested-by: Aliaksei Makarau <Aliaksei.Makarau@ibm.com> Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250722161817.1298473-1-wintera@linux.ibm.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-21arm64: kvm, smccc: Fix vendor uuidJack Thomson1-1/+1
Commit 13423063c7cb ("arm64: kvm, smccc: Introduce and use API for getting hypervisor UUID") replaced the explicit register constants with the UUID_INIT macro. However, there is an endian issue, meaning the UUID generated and used in the handshake didn't match UUID prior to the commit. The change in UUID causes the SMCCC vendor handshake to fail with older guest kernels, meaning devices such as PTP were not available in the guest. This patch updates the parameters to the macro to generate a UUID which matches the previous value, and re-establish backwards compatibility with older guest kernels. Fixes: 13423063c7cb ("arm64: kvm, smccc: Introduce and use API for getting hypervisor UUID") Signed-off-by: Jack Thomson <jackabt@amazon.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20250721130558.50823-1-jackabt.amazon@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2025-07-21Merge tag 'platform-drivers-x86-v6.16-4' of ↵Linus Torvalds1-0/+8
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform drivers fixes from Ilpo Järvinen: - power supply code: - Add get/set property direct to allow avoiding taking psy->extensions_sem twice from power supply extensions - alienware-wmi-wmax: - Add AWCC support for Alienware Area-51m and m15 R5 - Fix `dmi_system_id` array termination - arm64: huawei-gaokun-ec: fix OF node leak - dell-ddv: Fix taking psy->extensions_sem twice - dell-lis3lv02d: Add Precision 3551 accelerometer support - firmware_attributes_class: Fix initialization order - ideapad-laptop: Retain FnLock and kbd backlight across boots - lenovo-wmi-hotkey: Avoid triggering error -5 due to missing mute LED - mellanox: mlxbf-pmc: Validate event names and bool input * tag 'platform-drivers-x86-v6.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: MAINTAINERS: Update entries for IFS and SBL drivers platform/x86: dell-lis3lv02d: Add Precision 3551 platform/x86: alieneware-wmi-wmax: Add AWCC support to more laptops platform/x86: Fix initialization order for firmware_attributes_class platform: arm64: huawei-gaokun-ec: fix OF node leak lenovo-wmi-hotkey: Avoid triggering error -5 due to missing mute LED platform/x86: ideapad-laptop: Fix kbd backlight not remembered among boots platform/x86: ideapad-laptop: Fix FnLock not remembered among boots platform/mellanox: mlxbf-pmc: Use kstrtobool() to check 0/1 input platform/mellanox: mlxbf-pmc: Validate event/enable input platform/mellanox: mlxbf-pmc: Remove newline char from event name input platform/x86: dell-ddv: Fix taking the psy->extensions_sem lock twice power: supply: test-power: Test access to extended power supply power: supply: core: Add power_supply_get/set_property_direct() platform/x86: alienware-wmi-wmax: Fix `dmi_system_id` array
2025-07-20Merge tag 'char-misc-6.16-rc7' of ↵Linus Torvalds1-0/+7
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / misc / IIO fixes from Greg KH: "Here are some char/misc/iio and other driver fixes for 6.16-rc7. Included in here are: - IIO driver fixes for reported problems - Interconnect driver fixes for reported problems - nvmem driver fixes - bunch of comedi driver fixes for long-term bugs - Kconfig dependancy fixes for mux drivers - other small driver fixes for reported problems. All of these have been in linux-next for a while with no reported problems" * tag 'char-misc-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (35 commits) nvmem: layouts: u-boot-env: remove crc32 endianness conversion misc: amd-sbi: Explicitly clear in/out arg "mb_in_out" misc: amd-sbi: Address copy_to/from_user() warning reported in smatch misc: amd-sbi: Address potential integer overflow issue reported in smatch comedi: comedi_test: Fix possible deletion of uninitialized timers comedi: Fix initialization of data for instructions that write to subdevice comedi: Fix use of uninitialized data in insn_rw_emulate_bits() comedi: das6402: Fix bit shift out of bounds comedi: aio_iiro_16: Fix bit shift out of bounds comedi: pcl812: Fix bit shift out of bounds comedi: das16m1: Fix bit shift out of bounds comedi: Fix some signed shift left operations comedi: Fail COMEDI_INSNLIST ioctl if n_insns is too large nvmem: imx-ocotp: fix MAC address byte length MAINTAINERS: add miscdevice Rust abstractions mux: mmio: Fix missing CONFIG_REGMAP_MMIO iio: dac: ad3530r: Fix incorrect masking for channels 4-7 in powerdown mode iio: adc: ad7380: fix adi,gain-milli property parsing iio: adc: ad7949: use spi_is_bpw_supported() iio: accel: fxls8962af: Fix use after free in fxls8962af_fifo_flush ...
2025-07-18Merge tag 'phy-fix-6.16' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy fixes from Vinod Koul: "Core: - use per-PHY lockdep keys, in order to fix a phy using internal phys Drivers: - tegra: - fixes for unbalanced regulator - decouple pad calibration fix - disable periodic updates - qualcomm: - error code fix for driver probe" * tag 'phy-fix-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: qcom: fix error code in snps_eusb2_hsphy_probe() phy: use per-PHY lockdep keys phy: tegra: xusb: Fix unbalanced regulator disable in UTMI PHY mode phy: tegra: xusb: Disable periodic tracking on Tegra234 phy: tegra: xusb: Decouple CYA_TRK_CODE_UPDATE_ON_IDLE from trk_hw_mode
2025-07-13Merge tag 'irq_urgent_for_v6.16_rc6' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Fix a case of recursive locking in the MSI code - Fix a randconfig build failure in armada-370-xp irqchip * tag 'irq_urgent_for_v6.16_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/irq-msi-lib: Fix build with PCI disabled PCI/MSI: Prevent recursive locking in pci_msix_write_tph_tag()
2025-07-12Merge tag 'mm-hotfixes-stable-2025-07-11-16-16' of ↵Linus Torvalds3-6/+5
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "19 hotfixes. A whopping 16 are cc:stable and the remainder address post-6.15 issues or aren't considered necessary for -stable kernels. 14 are for MM. Three gdb-script fixes and a kallsyms build fix" * tag 'mm-hotfixes-stable-2025-07-11-16-16' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: Revert "sched/numa: add statistics of numa balance task" mm: fix the inaccurate memory statistics issue for users mm/damon: fix divide by zero in damon_get_intervals_score() samples/damon: fix damon sample mtier for start failure samples/damon: fix damon sample wsse for start failure samples/damon: fix damon sample prcl for start failure kasan: remove kasan_find_vm_area() to prevent possible deadlock scripts: gdb: vfs: support external dentry names mm/migrate: fix do_pages_stat in compat mode mm/damon/core: handle damon_call_control as normal under kdmond deactivation mm/rmap: fix potential out-of-bounds page table access during batched unmap mm/hugetlb: don't crash when allocating a folio if there are no resv scripts/gdb: de-reference per-CPU MCE interrupts scripts/gdb: fix interrupts.py after maple tree conversion maple_tree: fix mt_destroy_walk() on root leaf node mm/vmalloc: leave lazy MMU mode on PTE mapping error scripts/gdb: fix interrupts display after MCP on x86 lib/alloc_tag: do not acquire non-existent lock in alloc_tag_top_users() kallsyms: fix build without execinfo
2025-07-11Merge tag 'block-6.16-20250710' of git://git.kernel.dk/linuxLinus Torvalds1-0/+5
Pull block fixes from Jens Axboe: - MD changes via Yu: - fix UAF due to stack memory used for bio mempool (Jinchao) - fix raid10/raid1 nowait IO error path (Nigel and Qixing) - fix kernel crash from reading bitmap sysfs entry (Håkon) - Fix for a UAF in the nbd connect error path - Fix for blocksize being bigger than pagesize, if THP isn't enabled * tag 'block-6.16-20250710' of git://git.kernel.dk/linux: block: reject bs > ps block devices when THP is disabled nbd: fix uaf in nbd_genl_connect() error path md/md-bitmap: fix GPF in bitmap_get_stats() md/raid1,raid10: strip REQ_NOWAIT from member bios raid10: cleanup memleak at raid10_make_request md/raid1: Fix stack memory use after return in raid1_reshape
2025-07-11Merge tag 'io_uring-6.16-20250710' of git://git.kernel.dk/linuxLinus Torvalds1-0/+2
Pull io_uring fixes from Jens Axboe: - Remove a pointless warning in the zcrx code - Fix for MSG_RING commands, where the allocated io_kiocb needs to be freed under RCU as well - Revert the work-around we had in place for the anon inodes pretending to be regular files. Since that got reworked upstream, the work-around is no longer needed * tag 'io_uring-6.16-20250710' of git://git.kernel.dk/linux: Revert "io_uring: gate REQ_F_ISREG on !S_ANON_INODE as well" io_uring/msg_ring: ensure io_kiocb freeing is deferred for RCU io_uring/zcrx: fix pp destruction warnings
2025-07-11Merge tag 'wireless-2025-07-10' of ↵Jakub Kicinski1-12/+33
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Quite a number of fixes still: - mt76 (hadn't sent any fixes so far) - RCU - scanning - decapsulation offload - interface combinations - rt2x00: build fix (bad function pointer prototype) - cfg80211: prevent A-MSDU flipping attacks in mesh - zd1211rw: prevent race ending with NULL ptr deref - cfg80211/mac80211: more S1G fixes - mwifiex: avoid WARN on certain RX frames - mac80211: - avoid stack data leak in WARN cases - fix non-transmitted BSSID search (on certain multi-BSSID APs) - always initialize key list so driver iteration won't crash - fix monitor interface in device restart - fix __free() annotation usage * tag 'wireless-2025-07-10' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (26 commits) wifi: mac80211: add the virtual monitor after reconfig complete wifi: mac80211: always initialize sdata::key_list wifi: mac80211: Fix uninitialized variable with __free() in ieee80211_ml_epcs() wifi: mt76: mt792x: Limit the concurrent STA and SoftAP to operate on the same channel wifi: mt76: mt7925: Fix null-ptr-deref in mt7925_thermal_init() wifi: mt76: fix queue assignment for deauth packets wifi: mt76: add a wrapper for wcid access with validation wifi: mt76: mt7921: prevent decap offload config before STA initialization wifi: mt76: mt7925: prevent NULL pointer dereference in mt7925_sta_set_decap_offload() wifi: mt76: mt7925: fix incorrect scan probe IE handling for hw_scan wifi: mt76: mt7925: fix invalid array index in ssid assignment during hw scan wifi: mt76: mt7925: fix the wrong config for tx interrupt wifi: mt76: Remove RCU section in mt7996_mac_sta_rc_work() wifi: mt76: Move RCU section in mt7996_mcu_add_rate_ctrl() wifi: mt76: Move RCU section in mt7996_mcu_add_rate_ctrl_fixed() wifi: mt76: Move RCU section in mt7996_mcu_set_fixed_field() wifi: mt76: Assume __mt76_connac_mcu_alloc_sta_req runs in atomic context wifi: prevent A-MSDU attacks in mesh networks wifi: rt2x00: fix remove callback type mismatch wifi: mac80211: reject VHT opmode for unsupported channel widths ... ==================== Link: https://patch.msgid.link/20250710122212.24272-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-11irqchip/irq-msi-lib: Fix build with PCI disabledArnd Bergmann1-0/+1
The armada-370-xp irqchip fails in some randconfig builds because of a missing declaration: In file included from drivers/irqchip/irq-armada-370-xp.c:23: include/linux/irqchip/irq-msi-lib.h:25:39: error: 'struct msi_domain_info' declared inside parameter list will not be visible outside of this definition or declaration [-Werror] Add a forward declaration for the msi_domain_info structure. [ tglx: Fixed up the subsystem prefix. Is it really that hard to get right? ] Fixes: e51b27438a10 ("irqchip: Make irq-msi-lib.h globally available") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/all/20250710080021.2303640-1-arnd@kernel.org
2025-07-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+2
Pull KVM fixes from Paolo Bonzini: "Many patches, pretty much all of them small, that accumulated while I was on vacation. ARM: - Remove the last leftovers of the ill-fated FPSIMD host state mapping at EL2 stage-1 - Fix unexpected advertisement to the guest of unimplemented S2 base granule sizes - Gracefully fail initialising pKVM if the interrupt controller isn't GICv3 - Also gracefully fail initialising pKVM if the carveout allocation fails - Fix the computing of the minimum MMIO range required for the host on stage-2 fault - Fix the generation of the GICv3 Maintenance Interrupt in nested mode x86: - Reject SEV{-ES} intra-host migration if one or more vCPUs are actively being created, so as not to create a non-SEV{-ES} vCPU in an SEV{-ES} VM - Use a pre-allocated, per-vCPU buffer for handling de-sparsification of vCPU masks in Hyper-V hypercalls; fixes a "stack frame too large" issue - Allow out-of-range/invalid Xen event channel ports when configuring IRQ routing, to avoid dictating a specific ioctl() ordering to userspace - Conditionally reschedule when setting memory attributes to avoid soft lockups when userspace converts huge swaths of memory to/from private - Add back MWAIT as a required feature for the MONITOR/MWAIT selftest - Add a missing field in struct sev_data_snp_launch_start that resulted in the guest-visible workarounds field being filled at the wrong offset - Skip non-canonical address when processing Hyper-V PV TLB flushes to avoid VM-Fail on INVVPID - Advertise supported TDX TDVMCALLs to userspace - Pass SetupEventNotifyInterrupt arguments to userspace - Fix TSC frequency underflow" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: avoid underflow when scaling TSC frequency KVM: arm64: Remove kvm_arch_vcpu_run_map_fp() KVM: arm64: Fix handling of FEAT_GTG for unimplemented granule sizes KVM: arm64: Don't free hyp pages with pKVM on GICv2 KVM: arm64: Fix error path in init_hyp_mode() KVM: arm64: Adjust range correctly during host stage-2 faults KVM: arm64: nv: Fix MI line level calculation in vgic_v3_nested_update_mi() KVM: x86/hyper-v: Skip non-canonical addresses during PV TLB flush KVM: SVM: Add missing member in SNP_LAUNCH_START command structure Documentation: KVM: Fix unexpected unindent warnings KVM: selftests: Add back the missing check of MONITOR/MWAIT availability KVM: Allow CPU to reschedule while setting per-page memory attributes KVM: x86/xen: Allow 'out of range' event channel ports in IRQ routing table. KVM: x86/hyper-v: Use preallocated per-vCPU buffer for de-sparsified vCPU masks KVM: SVM: Initialize vmsa_pa in VMCB to INVALID_PAGE if VMSA page is NULL KVM: SVM: Reject SEV{-ES} intra host migration if vCPU creation is in-flight KVM: TDX: Report supported optional TDVMCALLs in TDX capabilities KVM: TDX: Exit to userspace for SetupEventNotifyInterrupt
2025-07-10Revert "sched/numa: add statistics of numa balance task"Chen Yu2-6/+0
This reverts commit ad6b26b6a0a79166b53209df2ca1cf8636296382. This commit introduces per-memcg/task NUMA balance statistics, but unfortunately it introduced a NULL pointer exception due to the following race condition: After a swap task candidate was chosen, its mm_struct pointer was set to NULL due to task exit. Later, when performing the actual task swapping, the p->mm caused the problem. CPU0 CPU1 : ... task_numa_migrate task_numa_find_cpu task_numa_compare # a normal task p is chosen env->best_task = p # p exit: exit_signals(p); p->flags |= PF_EXITING exit_mm p->mm = NULL; migrate_swap_stop __migrate_swap_task((arg->src_task, arg->dst_cpu) count_memcg_event_mm(p->mm, NUMA_TASK_SWAP)# p->mm is NULL task_lock() should be held and the PF_EXITING flag needs to be checked to prevent this from happening. After discussion, the conclusion was that adding a lock is not worthwhile for some statistics calculations. Revert the change and rely on the tracepoint for this purpose. Link: https://lkml.kernel.org/r/20250704135620.685752-1-yu.c.chen@intel.com Link: https://lkml.kernel.org/r/20250708064917.BBD13C4CEED@smtp.kernel.org Fixes: ad6b26b6a0a7 ("sched/numa: add statistics of numa balance task") Signed-off-by: Chen Yu <yu.c.chen@intel.com> Reported-by: Jirka Hladky <jhladky@redhat.com> Closes: https://lore.kernel.org/all/CAE4VaGBLJxpd=NeRJXpSCuw=REhC5LWJpC29kDy-Zh2ZDyzQZA@mail.gmail.com/ Reported-by: Srikanth Aithal <Srikanth.Aithal@amd.com> Reported-by: Suneeth D <Suneeth.D@amd.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Hladky <jhladky@redhat.com> Cc: Libo Chen <libo.chen@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-10mm: fix the inaccurate memory statistics issue for usersBaolin Wang1-0/+5
On some large machines with a high number of CPUs running a 64K pagesize kernel, we found that the 'RES' field is always 0 displayed by the top command for some processes, which will cause a lot of confusion for users. PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 875525 root 20 0 12480 0 0 R 0.3 0.0 0:00.08 top 1 root 20 0 172800 0 0 S 0.0 0.0 0:04.52 systemd The main reason is that the batch size of the percpu counter is quite large on these machines, caching a significant percpu value, since converting mm's rss stats into percpu_counter by commit f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter"). Intuitively, the batch number should be optimized, but on some paths, performance may take precedence over statistical accuracy. Therefore, introducing a new interface to add the percpu statistical count and display it to users, which can remove the confusion. In addition, this change is not expected to be on a performance-critical path, so the modification should be acceptable. In addition, the 'mm->rss_stat' is updated by using add_mm_counter() and dec/inc_mm_counter(), which are all wrappers around percpu_counter_add_batch(). In percpu_counter_add_batch(), there is percpu batch caching to avoid 'fbc->lock' contention. This patch changes task_mem() and task_statm() to get the accurate mm counters under the 'fbc->lock', but this should not exacerbate kernel 'mm->rss_stat' lock contention due to the percpu batch caching of the mm counters. The following test also confirm the theoretical analysis. I run the stress-ng that stresses anon page faults in 32 threads on my 32 cores machine, while simultaneously running a script that starts 32 threads to busy-loop pread each stress-ng thread's /proc/pid/status interface. From the following data, I did not observe any obvious impact of this patch on the stress-ng tests. w/o patch: stress-ng: info: [6848] 4,399,219,085,152 CPU Cycles 67.327 B/sec stress-ng: info: [6848] 1,616,524,844,832 Instructions 24.740 B/sec (0.367 instr. per cycle) stress-ng: info: [6848] 39,529,792 Page Faults Total 0.605 M/sec stress-ng: info: [6848] 39,529,792 Page Faults Minor 0.605 M/sec w/patch: stress-ng: info: [2485] 4,462,440,381,856 CPU Cycles 68.382 B/sec stress-ng: info: [2485] 1,615,101,503,296 Instructions 24.750 B/sec (0.362 instr. per cycle) stress-ng: info: [2485] 39,439,232 Page Faults Total 0.604 M/sec stress-ng: info: [2485] 39,439,232 Page Faults Minor 0.604 M/sec On comparing a very simple app which just allocates & touches some memory against v6.1 (which doesn't have f1a7941243c1) and latest Linus tree (4c06e63b9203) I can see that on latest Linus tree the values for VmRSS, RssAnon and RssFile from /proc/self/status are all zeroes while they do report values on v6.1 and a Linus tree with this patch. Link: https://lkml.kernel.org/r/f4586b17f66f97c174f7fd1f8647374fdb53de1c.1749119050.git.baolin.wang@linux.alibaba.com Fixes: f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter") Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Aboorva Devarajan <aboorvad@linux.ibm.com> Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com> Tested-by Donet Tom <donettom@linux.ibm.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: SeongJae Park <sj@kernel.org> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-08io_uring/msg_ring: ensure io_kiocb freeing is deferred for RCUJens Axboe1-0/+2
syzbot reports that defer/local task_work adding via msg_ring can hit a request that has been freed: CPU: 1 UID: 0 PID: 19356 Comm: iou-wrk-19354 Not tainted 6.16.0-rc4-syzkaller-00108-g17bbde2e1716 #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025 Call Trace: <TASK> dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:408 [inline] print_report+0xd2/0x2b0 mm/kasan/report.c:521 kasan_report+0x118/0x150 mm/kasan/report.c:634 io_req_local_work_add io_uring/io_uring.c:1184 [inline] __io_req_task_work_add+0x589/0x950 io_uring/io_uring.c:1252 io_msg_remote_post io_uring/msg_ring.c:103 [inline] io_msg_data_remote io_uring/msg_ring.c:133 [inline] __io_msg_ring_data+0x820/0xaa0 io_uring/msg_ring.c:151 io_msg_ring_data io_uring/msg_ring.c:173 [inline] io_msg_ring+0x134/0xa00 io_uring/msg_ring.c:314 __io_issue_sqe+0x17e/0x4b0 io_uring/io_uring.c:1739 io_issue_sqe+0x165/0xfd0 io_uring/io_uring.c:1762 io_wq_submit_work+0x6e9/0xb90 io_uring/io_uring.c:1874 io_worker_handle_work+0x7cd/0x1180 io_uring/io-wq.c:642 io_wq_worker+0x42f/0xeb0 io_uring/io-wq.c:696 ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 </TASK> which is supposed to be safe with how requests are allocated. But msg ring requests alloc and free on their own, and hence must defer freeing to a sane time. Add an rcu_head and use kfree_rcu() in both spots where requests are freed. Only the one in io_msg_tw_complete() is strictly required as it has been visible on the other ring, but use it consistently in the other spot as well. This should not cause any other issues outside of KASAN rightfully complaining about it. Link: https://lore.kernel.org/io-uring/686cd2ea.a00a0220.338033.0007.GAE@google.com/ Reported-by: syzbot+54cbbfb4db9145d26fc2@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Fixes: 0617bb500bfa ("io_uring/msg_ring: improve handling of target CQE posting") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-07-08Merge tag 'tsa_x86_bugs_for_6.16' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull CPU speculation fixes from Borislav Petkov: "Add the mitigation logic for Transient Scheduler Attacks (TSA) TSA are new aspeculative side channel attacks related to the execution timing of instructions under specific microarchitectural conditions. In some cases, an attacker may be able to use this timing information to infer data from other contexts, resulting in information leakage. Add the usual controls of the mitigation and integrate it into the existing speculation bugs infrastructure in the kernel" * tag 'tsa_x86_bugs_for_6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/process: Move the buffer clearing before MONITOR x86/microcode/AMD: Add TSA microcode SHAs KVM: SVM: Advertise TSA CPUID bits to guests x86/bugs: Add a Transient Scheduler Attacks mitigation x86/bugs: Rename MDS machinery to something more generic
2025-07-07block: reject bs > ps block devices when THP is disabledPankaj Raghav1-0/+5
If THP is disabled and when a block device with logical block size > page size is present, the following null ptr deref panic happens during boot: [ [13.2 mK AOSAN: null-ptr-deref in range [0x0000000000000000-0x0000000000K0 0 0[07] [ 13.017749] RIP: 0010:create_empty_buffers+0x3b/0x380 <snip> [ 13.025448] Call Trace: [ 13.025692] <TASK> [ 13.025895] block_read_full_folio+0x610/0x780 [ 13.026379] ? __pfx_blkdev_get_block+0x10/0x10 [ 13.027008] ? __folio_batch_add_and_move+0x1fa/0x2b0 [ 13.027548] ? __pfx_blkdev_read_folio+0x10/0x10 [ 13.028080] filemap_read_folio+0x9b/0x200 [ 13.028526] ? __pfx_filemap_read_folio+0x10/0x10 [ 13.029030] ? __filemap_get_folio+0x43/0x620 [ 13.029497] do_read_cache_folio+0x155/0x3b0 [ 13.029962] ? __pfx_blkdev_read_folio+0x10/0x10 [ 13.030381] read_part_sector+0xb7/0x2a0 [ 13.030805] read_lba+0x174/0x2c0 <snip> [ 13.045348] nvme_scan_ns+0x684/0x850 [nvme_core] [ 13.045858] ? __pfx_nvme_scan_ns+0x10/0x10 [nvme_core] [ 13.046414] ? _raw_spin_unlock+0x15/0x40 [ 13.046843] ? __switch_to+0x523/0x10a0 [ 13.047253] ? kvm_clock_get_cycles+0x14/0x30 [ 13.047742] ? __pfx_nvme_scan_ns_async+0x10/0x10 [nvme_core] [ 13.048353] async_run_entry_fn+0x96/0x4f0 [ 13.048787] process_one_work+0x667/0x10a0 [ 13.049219] worker_thread+0x63c/0xf60 As large folio support depends on THP, only allow bs > ps block devices if THP is enabled. Fixes: 47dd67532303 ("block/bdev: lift block size restrictions to 64k") Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20250704092134.289491-1-p.raghav@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-07-07power: supply: core: Add power_supply_get/set_property_direct()Armin Wolf1-0/+8
Power supply extensions might want to interact with the underlying power supply to retrieve data like serial numbers, charging status and more. However doing so causes psy->extensions_sem to be locked twice, possibly causing a deadlock. Provide special variants of power_supply_get/set_property() that ignore any power supply extensions and thus do not touch the associated psy->extensions_sem lock. Suggested-by: Hans de Goede <hansg@kernel.org> Signed-off-by: Armin Wolf <W_Armin@gmx.de> Acked-by: Sebastian Reichel <sebastian.reichel@collabora.com> Reviewed-by: Hans de Goede <hansg@kernel.org> Link: https://lore.kernel.org/r/20250627205124.250433-1-W_Armin@gmx.de Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-07-07wifi: mac80211: correctly identify S1G short beaconLachlan Hodges1-12/+33
mac80211 identifies a short beacon by the presence of the next TBTT field, however the standard actually doesn't explicitly state that the next TBTT can't be in a long beacon or even that it is required in a short beacon - and as a result this validation does not work for all vendor implementations. The standard explicitly states that an S1G long beacon shall contain the S1G beacon compatibility element as the first element in a beacon transmitted at a TBTT that is not a TSBTT (Target Short Beacon Transmission Time) as per IEEE80211-2024 11.1.3.10.1. This is validated by 9.3.4.3 Table 9-76 which states that the S1G beacon compatibility element is only allowed in the full set and is not allowed in the minimum set of elements permitted for use within short beacons. Correctly identify short beacons by the lack of an S1G beacon compatibility element as the first element in an S1G beacon frame. Fixes: 9eaffe5078ca ("cfg80211: convert S1G beacon to scan results") Signed-off-by: Simon Wadsworth <simon@morsemicro.com> Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20250701075541.162619-1-lachlan.hodges@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-05Merge tag 'pm-6.16-rc5' of ↵Linus Torvalds1-0/+5
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These address system suspend failures under memory pressure in some configurations, fix up RAPL handling on platforms where PL1 cannot be disabled, and fix a documentation typo: - Prevent the Intel RAPL power capping driver from allowing PL1 to be exceeded by mistake on systems when PL1 cannot be disabled (Zhang Rui) - Fix a typo in the ABI documentation (Sumanth Gavini) - Allow swap to be used a bit longer during system suspend and hibernation to avoid suspend failures under memory pressure (Mario Limonciello)" * tag 'pm-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: sleep: docs: Replace "diasble" with "disable" powercap: intel_rapl: Do not change CLAMPING bit if ENABLE bit cannot be changed PM: Restrict swap use to later in the suspend sequence
2025-07-04Merge branch 'pm-sleep'Rafael J. Wysocki1-0/+5
Merge fixes related to system sleep for 6.16-rc5: - Fix typo in the ABI documentation (Sumanth Gavini). - Allow swap to be used a bit longer during system suspend and hibernation to avoid suspend failures under memory pressure (Mario Limonciello). * pm-sleep: PM: sleep: docs: Replace "diasble" with "disable" PM: Restrict swap use to later in the suspend sequence
2025-07-04Merge tag 'soc-fixes-6.16' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "A couple of fixes for firmware drivers have come up, addressing kernel side bugs in op-tee and ff-a code, as well as compatibility issues with exynos-acpm and ff-a protocols. The only devicetree fixes are for the Apple platform, addressing issues with conformance to the bindings for the wlan, spi and mipi nodes" * tag 'soc-fixes-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: arm64: dts: apple: Move touchbar mipi {address,size}-cells from dtsi to dts arm64: dts: apple: Drop {address,size}-cells from SPI NOR arm64: dts: apple: t8103: Fix PCIe BCM4377 nodename optee: ffa: fix sleep in atomic context firmware: exynos-acpm: fix timeouts on xfers handling arm64: defconfig: update renamed PHY_SNPS_EUSB2 firmware: arm_ffa: Fix the missing entry in struct ffa_indirect_msg_hdr firmware: arm_ffa: Replace mutex with rwlock to avoid sleep in atomic context firmware: arm_ffa: Move memory allocation outside the mutex locking firmware: arm_ffa: Fix memory leak by freeing notifier callback node
2025-07-04Merge tag 'spi-fix-v6.16-rc4' of ↵Linus Torvalds2-1/+9
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "As well as a few driver specific fixes we've got a core change here which raises the hard coded limit on the number of devices we can support on one SPI bus since some FPGA based systems are running into the existing limit. This is not a good solution but it's one suitable for this point in the release cycle, we should dynamically size the relevant data structures which I hope will happen in the next couple of merge windows. We also pull in a MTD fix for the Qualcomm SNAND driver, the two fixes cover the same issue and merging them together minimises bisection issues" * tag 'spi-fix-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: cadence-quadspi: fix cleanup of rx_chan on failure paths spi: spi-fsl-dspi: Clear completion counter before initiating transfer spi: Raise limit on number of chip selects to 24 mtd: nand: qpic_common: prevent out of bounds access of BAM arrays spi: spi-qpic-snand: reallocate BAM transactions
2025-07-04Merge tag 'platform-drivers-x86-v6.16-3' of ↵Linus Torvalds1-0/+13
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform drivers fixes from Ilpo Järvinen: "Mostly a few lines fixed here and there except amd/isp4 which improves swnodes relationships but that is a new driver not in any stable kernels yet. The think-lmi driver changes also look relatively large but there are just many fixes to it. The i2c/piix4 change is a effectively a revert of the commit 7e173eb82ae9 ("i2c: piix4: Make CONFIG_I2C_PIIX4 dependent on CONFIG_X86") but that required moving the header out from arch/x86 under include/linux/platform_data/ Summary: - amd/isp4: Improve swnode graph (new driver exception) - asus-nb-wmi: Use duo keyboard quirk for Zenbook Duo UX8406CA - dell-lis3lv02d: Add Latitude 5500 accelerometer address - dell-wmi-sysman: Fix WMI data block retrieval and class dev unreg - hp-bioscfg: Fix class device unregistration - i2c: piix4: Re-enable on non-x86 + move FCH header under platform_data/ - intel/hid: Wildcat Lake support - mellanox: - mlxbf-pmc: Fix duplicate event ID - mlxbf-tmfifo: Fix vring_desc.len assignment - mlxreg-lc: Fix bit-not-set logic check - nvsw-sn2201: Fix bus number in error message & spelling errors - portwell-ec: Move watchdog device under correct platform hierarchy - think-lmi: Error handling fixes (sysfs, kset, kobject, class dev unreg) - thinkpad_acpi: Handle HKEY 0x1402 event (2025 Thinkpads) - wmi: Fix WMI event enablement" * tag 'platform-drivers-x86-v6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (22 commits) platform/x86: think-lmi: Fix sysfs group cleanup platform/x86: think-lmi: Fix kobject cleanup platform/x86: think-lmi: Create ksets consecutively platform/mellanox: mlxreg-lc: Fix logic error in power state check i2c: Re-enable piix4 driver on non-x86 Move FCH header to a location accessible by all archs platform/x86/intel/hid: Add Wildcat Lake support platform/x86: dell-wmi-sysman: Fix class device unregistration platform/x86: think-lmi: Fix class device unregistration platform/x86: hp-bioscfg: Fix class device unregistration platform/x86: Update swnode graph for amd isp4 platform/x86: dell-wmi-sysman: Fix WMI data block retrieval in sysfs callbacks platform/x86: wmi: Update documentation of WCxx/WExx ACPI methods platform/x86: wmi: Fix WMI event enablement platform/mellanox: nvsw-sn2201: Fix bus number in adapter error message platform/mellanox: Fix spelling and comment clarity in Mellanox drivers platform/mellanox: mlxbf-pmc: Fix duplicate event ID for CACHE_DATA1 platform/x86: thinkpad_acpi: handle HKEY 0x1402 event platform/x86: asus-nb-wmi: add DMI quirk for ASUS Zenbook Duo UX8406CA platform/x86: dell-lis3lv02d: Add Latitude 5500 ...
2025-07-04Merge tag 'usb-6.16-rc5' of ↵Linus Torvalds2-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some USB driver fixes for 6.16-rc5. I originally wanted this to get into -rc4, but there were some regressions that had to be handled first. Now all looks good. Included in here are the following fixes: - cdns3 driver fixes - xhci driver fixes - typec driver fixes - USB hub fixes (this is what took the longest to get right) - new USB driver quirks added - chipidea driver fixes All of these have been in linux-next for a while and now we have no more reported problems with them" * tag 'usb-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (21 commits) usb: hub: Fix flushing of delayed work used for post resume purposes xhci: dbc: Flush queued requests before stopping dbc xhci: dbctty: disable ECHO flag by default xhci: Disable stream for xHC controller with XHCI_BROKEN_STREAMS usb: xhci: quirk for data loss in ISOC transfers usb: dwc3: gadget: Fix TRB reclaim logic for short transfers and ZLPs usb: hub: Fix flushing and scheduling of delayed work that tunes runtime pm usb: typec: displayport: Fix potential deadlock usb: typec: altmodes/displayport: do not index invalid pin_assignments usb: cdnsp: Fix issue with CV Bad Descriptor test usb: typec: tcpm: apply vbus before data bringup in tcpm_src_attach Revert "usb: xhci: Implement xhci_handshake_check_state() helper" usb: xhci: Skip xhci_reset in xhci_resume if xhci is being removed usb: gadget: u_serial: Fix race condition in TTY wakeup Revert "usb: gadget: u_serial: Add null pointer check in gs_start_io" usb: chipidea: udc: disconnect/reconnect from host when do suspend/resume usb: acpi: fix device link removal usb: hub: fix detection of high tier USB3 devices behind suspended hubs Logitech C-270 even more broken usb: dwc3: Abort suspend on soft disconnect failure ...
2025-07-04Merge tag 'vfs-6.16-rc5.fixes' of ↵Linus Torvalds2-11/+12
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix a regression caused by the anonymous inode rework. Making them regular files causes various places in the kernel to tip over starting with io_uring. Revert to the former status quo and port our assertion to be based on checking the inode so we don't lose the valuable VFS_*_ON_*() assertions that have already helped discover weird behavior our outright bugs. - Fix the the upper bound calculation in fuse_fill_write_pages() - Fix priority inversion issues in the eventpoll code - Make secretmen use anon_inode_make_secure_inode() to avoid bypassing the LSM layer - Fix a netfs hang due to missing case in final DIO read result collection - Fix a double put of the netfs_io_request struct - Provide some helpers to abstract out NETFS_RREQ_IN_PROGRESS flag wrangling - Fix infinite looping in netfs_wait_for_pause/request() - Fix a netfs ref leak on an extra subrequest inserted into a request's list of subreqs - Fix various cifs RPC callbacks to set NETFS_SREQ_NEED_RETRY if a subrequest fails retriably - Fix a cifs warning in the workqueue code when reconnecting a channel - Fix the updating of i_size in netfs to avoid a race between testing if we should have extended the file with a DIO write and changing i_size - Merge the places in netfs that update i_size on write - Fix coredump socket selftests * tag 'vfs-6.16-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: anon_inode: rework assertions netfs: Update tracepoints in a number of ways netfs: Renumber the NETFS_RREQ_* flags to make traces easier to read netfs: Merge i_size update functions netfs: Fix i_size updating smb: client: set missing retry flag in cifs_writev_callback() smb: client: set missing retry flag in cifs_readv_callback() smb: client: set missing retry flag in smb2_writev_callback() netfs: Fix ref leak on inserted extra subreq in write retry netfs: Fix looping in wait functions netfs: Provide helpers to perform NETFS_RREQ_IN_PROGRESS flag wangling netfs: Fix double put of request netfs: Fix hang due to missing case in final DIO read result collection eventpoll: Fix priority inversion problem fuse: fix fuse_fill_write_pages() upper bound calculation fs: export anon_inode_make_secure_inode() and fix secretmem LSM bypass selftests/coredump: Fix "socket_detect_userspace_client" test failure
2025-07-04Merge tag 'icc-6.16-rc5' of ↵Greg Kroah-Hartman1-0/+7
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-linus Georgi writes: interconnect fixes for v6.16-rc This contains a few framework core fixes (related to the new dynamic node id feature), as well as some misc Qualcomm and Samsung driver fixes. - interconnect: qcom: sc7280: Add missing num_links to xm_pcie3_1 node - interconnect: exynos: handle node name allocation failure - interconnect: increase ICC_DYN_ID_START - interconnect: icc-clk: destroy nodes in case of memory allocation failures - interconnect: avoid memory allocation when 'icc_bw_lock' is held Signed-off-by: Georgi Djakov <djakov@kernel.org> * tag 'icc-6.16-rc5' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/djakov/icc: interconnect: avoid memory allocation when 'icc_bw_lock' is held interconnect: icc-clk: destroy nodes in case of memory allocation failures interconnect: increase ICC_DYN_ID_START interconnect: exynos: handle node name allocation failure interconnect: qcom: sc7280: Add missing num_links to xm_pcie3_1 node
2025-07-03Merge tag 'ffa-fixes-6.16' of ↵Arnd Bergmann1-0/+1
https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes Arm FF-A fixes for v6.16 Couple of fixes to address: 1. The safety and memory issues in the FF-A notification callback handler: The fixes replaces a mutex with an rwlock to prevent sleeping in atomic context, resolving kernel warnings. Memory allocation is moved outside the lock to support this transition safely. Additionally, a memory leak in the notifier unregistration path is fixed by properly freeing the callback node. 2. The missing entry in struct ffa_indirect_msg_hdr: The fix adds the missing 32 bit reserved entry in the structure as required by the FF-A specification. * tag 'ffa-fixes-6.16' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_ffa: Fix the missing entry in struct ffa_indirect_msg_hdr firmware: arm_ffa: Replace mutex with rwlock to avoid sleep in atomic context firmware: arm_ffa: Move memory allocation outside the mutex locking firmware: arm_ffa: Fix memory leak by freeing notifier callback node Link: https://lore.kernel.org/r/20250609105207.1185570-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-07-01netfs: Renumber the NETFS_RREQ_* flags to make traces easier to readDavid Howells1-10/+10
Renumber the NETFS_RREQ_* flags to put the most useful status bits in the bottom nibble - and therefore the last hex digit in the trace output - making it easier to grasp the state at a glance. In particular, put the IN_PROGRESS flag in bit 0 and ALL_QUEUED at bit 1. Also make the flags field in /proc/fs/netfs/requests larger to accommodate all the flags. Also make the flags field in the netfs_sreq tracepoint larger to accommodate all the NETFS_SREQ_* flags. Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-13-dhowells@redhat.com Reviewed-by: Paulo Alcantara <pc@manguebit.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01netfs: Merge i_size update functionsDavid Howells1-1/+0
Netfslib has two functions for updating the i_size after a write: one for buffered writes into the pagecache and one for direct/unbuffered writes. However, what needs to be done is much the same in both cases, so merge them together. This does raise one question, though: should updating the i_size after a direct write do the same estimated update of i_blocks as is done for buffered writes. Also get rid of the cleanup function pointer from netfs_io_request as it's only used for direct write to update i_size; instead do the i_size setting directly from write collection. Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-12-dhowells@redhat.com cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-06-30spi: spi-qpic-snand: avoid memory corruptionMark Brown1-0/+8
Merge series from Gabor Juhos <j4g8y7@gmail.com>: The 'spi-qpic-nand' driver may cause memory corruption under some circumstances. The first patch in the series changes the driver to avoid that, whereas the second adds some sanity checks to the common QPIC code in order to make detecting such errors easier in the future.