kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2 days	net/sched: act_mirred: Fix blockcast recursion bypass leading to stack overflow	Kito Xu (veritas501)	1	-7/+11
	commit a005fa5d7502eefec7ee6e1c01adadc06de2f9ad upstream. tcf_mirred_act() checks sched_mirred_nest against MIRRED_NEST_LIMIT (4) to prevent deep recursion. However, when the action uses blockcast (tcfm_blockid != 0), the function returns at the tcf_blockcast() call BEFORE reaching the counter increment. As a result, the recursion counter never advances and the limit check is entirely bypassed. When two devices share a TC egress block with a mirred blockcast rule, a packet egressing on device A is mirrored to device B via blockcast; device B's egress TC re-enters tcf_mirred_act() via blockcast and mirrors back to A, creating an unbounded recursion loop: tcf_mirred_act -> tcf_blockcast -> tcf_mirred_to_dev -> dev_queue_xmit -> sch_handle_egress -> tcf_classify -> tcf_mirred_act -> (repeat) This recursion continues until the kernel stack overflows. The bug is reachable from an unprivileged user via unshare(CLONE_NEWUSER \| CLONE_NEWNET): user namespaces grant CAP_NET_ADMIN in the new network namespace, which is sufficient to create dummy devices, attach clsact qdiscs with shared blocks, and install mirred blockcast filters. BUG: TASK stack guard page was hit at ffffc90000b7fff8 Oops: stack guard page: 0000 [#1] SMP KASAN NOPTI CPU: 2 UID: 1000 PID: 169 Comm: poc Not tainted 7.0.0-rc7-next-20260410 RIP: 0010:xas_find+0x17/0x480 Call Trace: xa_find+0x17b/0x1d0 tcf_mirred_act+0x640/0x1060 tcf_action_exec+0x400/0x530 basic_classify+0x128/0x1d0 tcf_classify+0xd83/0x1150 tc_run+0x328/0x620 __dev_queue_xmit+0x797/0x3100 tcf_mirred_to_dev+0x7b1/0xf70 tcf_mirred_act+0x68a/0x1060 [repeating ~30+ times until stack overflow] Kernel panic - not syncing: Fatal exception in interrupt Fix this by incrementing sched_mirred_nest before calling tcf_blockcast() and decrementing it on return, mirroring the non-blockcast path. This ensures subsequent recursive entries see the updated counter and are correctly limited by MIRRED_NEST_LIMIT. Fixes: fe946a751d9b ("net/sched: act_mirred: add loop detection") Signed-off-by: Kito Xu (veritas501) <hxzene@gmail.com> Link: https://patch.msgid.link/20260525122556.973584-7-jhs@mojatatu.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	ethtool: cmis_cdb: Fix incorrect read / write length extension	Ido Schimmel	2	-16/+3
	commit eaa517b77e63442260640d875f824d1111ca6569 upstream. The 'read_write_len_ext' field in 'struct ethtool_cmis_cdb_cmd_args' stores the maximum number of bytes that can be read from or written to the Local Payload (LPL) page in a single multi-byte access. Cited commit started overwriting this field with the maximum number of bytes that can be read from or written to the Extended Payload (LPL) pages in a single multi-byte access. Transceiver modules that support auto paging can advertise a number larger than 255 which is problematic as 'read_write_len_ext' is a 'u8', resulting in the number getting truncated and firmware flashing failing [1]. Fix by ignoring the maximum EPL access size as the kernel does not currently support auto paging (even if the transceiver module does) and will not try to read / write more than 128 bytes at once. [1] Transceiver module firmware flashing started for device enp177s0np0 Transceiver module firmware flashing in progress for device enp177s0np0 Progress: 0% Transceiver module firmware flashing encountered an error for device enp177s0np0 Status message: Write FW block EPL command failed, LPL length is longer than CDB read write length extension allows. Fixes: 9a3b0d078bd8 ("net: ethtool: Add support for writing firmware blocks using EPL payload") Reported-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Closes: https://lore.kernel.org/netdev/20250402183123.321036-3-michael.chan@broadcom.com/ Tested-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20250409112440.365672-1-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	mptcp: do not drop partial packets	Shardul Bankar	1	-5/+19
	[ Upstream commit 50c2d91c5dfa0e465826ec1f8dbad9cdc254bd85 ] When a packet arrives with map_seq < ack_seq < end_seq, the beginning of the packet has already been acknowledged but the end contains new data. Currently the entire packet is dropped as "old data," forcing the sender to retransmit. Instead, skip the already-acked bytes by adjusting the skb offset and enqueue only the new portion. Update bytes_received and ack_seq to reflect the new data consumed. A previous attempt at this fix has been sent by Paolo Abeni [1], but had issues [2]: it also added a zero-window check and changed rcv_wnd_sent initialization, which caused test regressions. This version addresses only the partial packet handling without modifying receive window accounting. Fixes: ab174ad8ef76 ("mptcp: move ooo skbs into msk out of order queue.") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/c9b426a4e163aa3c4fe8b80c79f1a610f47ae7d8.1763075056.git.pabeni@redhat.com [1] Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/600 [2] Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com> [pabeni@redhat.com: update map] Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260515-net-mptcp-misc-fixes-7-1-rc4-v2-1-701e96419f2f@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	mptcp: handle first subflow closing consistently	Paolo Abeni	2	-6/+11
	[ Upstream commit 0eeb372deebce6c25b9afc09e35d6c75a744299a ] Currently, as soon as the PM closes a subflow, the msk stops accepting data from it, even if the TCP socket could be still formally open in the incoming direction, with the notable exception of the first subflow. The root cause of such behavior is that code currently piggy back two separate semantic on the subflow->disposable bit: the subflow context must be released and that the subflow must stop accepting incoming data. The first subflow is never disposed, so it also never stop accepting incoming data. Use a separate bit to mark the latter status and set such bit in __mptcp_close_ssk() for all subflows. Beyond making per subflow behaviour more consistent this will also simplify the next patch. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20251121-net-next-mptcp-memcg-backlog-imp-v1-11-1f34b6c1e0b1@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 50c2d91c5dfa ("mptcp: do not drop partial packets") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	mptcp: introduce the mptcp_init_skb helper	Paolo Abeni	1	-23/+27
	[ Upstream commit 9a0afe0db46720ce1a009c7dac168aa0584bd732 ] Factor out all the skb initialization step in a new helper and use it. Note that this change moves the MPTCP CB initialization earlier: we can do such step as soon as the skb leaves the subflow socket receive queues. Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Geliang Tang <geliang@kernel.org> Tested-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250927-net-next-mptcp-rcv-path-imp-v1-4-5da266aa9c1a@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 50c2d91c5dfa ("mptcp: do not drop partial packets") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	mptcp: reset rcv wnd on disconnect	Paolo Abeni	1	-0/+1
	[ Upstream commit 0981f90e1a05773a4c29c6e720f5ea1e3c8f1876 ] If the MPTCP socket fallback to TCP before the MP handshake completion, the IASN remain 0, and the rcv_wnd_sent field is not explicitly initialized, just incremented over time with the data transfer. At disconnect time such value is not cleared. If the next connection falls back to TCP before the MP handshake completion, the data transfer will keep incrementing the receive window end sequence starting from the last value used in the previous connection: the announced window will be unrelated from the actual receiver buffer size and likely too big. Address the issue zeroing the field at disconnect time. Fixes: b29fcfb54cd7 ("mptcp: full disconnect implementation") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260515-net-mptcp-misc-fixes-7-1-rc4-v2-4-701e96419f2f@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	mptcp: cleanup fallback dummy mapping generation	Paolo Abeni	2	-1/+10
	[ Upstream commit 2834f8edd74d5dda368087a654c0e52b141e9893 ] MPTCP currently access ack_seq outside the msk socket log scope to generate the dummy mapping for fallback socket. Soon we are going to introduce backlog usage and even for fallback socket the ack_seq value will be significantly off outside of the msk socket lock scope. Avoid relying on ack_seq for dummy mapping generation, using instead the subflow sequence number. Note that in case of disconnect() and (re)connect() we must ensure that any previous state is re-set. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Geliang Tang <geliang@kernel.org> Tested-by: Geliang Tang <geliang@kernel.org> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20251121-net-next-mptcp-memcg-backlog-imp-v1-6-1f34b6c1e0b1@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 0981f90e1a05 ("mptcp: reset rcv wnd on disconnect") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient	Li Xiasong	2	-10/+46
	[ Upstream commit 51e398a3b8961b26a8c0a4ba9a777c5339791707 ] When TCP option space is insufficient (e.g., when sending ADD_ADDR with an IPv6 address and port while tcp_timestamps is enabled), the original code jumped to out_unlock without clearing the addr_signal flag. This caused mptcp_pm_add_timer to keep rescheduling indefinitely, not sending ADD_ADDR, preventing subsequent addresses in the endpoint list from being announced. Handle this case by clearing the ADD_ADDR signal and skipping the matching ADD_ADDR retransmission entry. The skip path cancels the matching timer (with id check) and advances PM state progression, preserving forward progress to subsequent PM work. This cancellation is inherently best-effort. A concurrent add_timer callback may already be running and may acquire pm.lock before the cancel path updates entry state. In that case, one final ADD_ADDR transmit attempt can still be executed. Once the cancel path sets entry->retrans_times to ADD_ADDR_RETRANS_MAX, the callback-side retrans_times check suppresses further ADD_ADDR retransmissions. Note that when an ADD_ADDR is being prepared, a pure-ACK is queued. On the output side, it means that it is fine to skip non-pure-ACK packets, when drop_other_suboptions is set: a pure-ACK will be processed soon after. Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout") Cc: stable@vger.kernel.org Signed-off-by: Li Xiasong <lixiasong1@huawei.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260515-net-mptcp-misc-fixes-7-1-rc4-v2-2-701e96419f2f@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	net: hsr: defer node table free until after RCU readers	Michael Bommarito	1	-2/+4
	[ Upstream commit aaec7096f9961eb223b5b149abe9495525c205d9 ] HSR node-list and node-status generic-netlink operations run under rcu_read_lock(). They walk hsr->node_db through hsr_get_next_node() and hsr_get_node_data(), but RTM_DELLINK teardown removes the same node table with plain list_del() and frees each node immediately. That lets a generic-netlink reader hold a struct hsr_node pointer across hsr_dellink(). In a KASAN build, widening the reader window after hsr_get_next_node() obtains the node reproduces a slab-use-after-free when the reader copies node->macaddress_A; the freeing stack is hsr_del_nodes() from hsr_dellink(). Use list_del_rcu() and defer the free through the existing hsr_free_node_rcu() callback. This matches the lifetime rule used by the HSR prune paths, which already delete nodes with list_del_rcu() and call_rcu(). Fixes: b9a1e627405d ("hsr: implement dellink to clean up resources") Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Link: https://patch.msgid.link/20260513233838.3064715-2-michael.bommarito@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> [ replaced `list_del`+`call_rcu(hsr_free_node_rcu)` with `list_del_rcu`+`kfree_rcu(node, rcu_head)` ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	rxrpc: Fix RESPONSE packet verification to extract skb to a linear buffer	David Howells	4	-42/+29
	[ Upstream commit 8bfab4b6ffc2fe92da86300728fc8c3c7ebffb56 ] This improves the fix for CVE-2026-43500. Fix the verification of RESPONSE packets to avoid the problem of overwriting a RESPONSE packet sent via splice to a local address by extracting the contents of the UDP packet into a kmalloc'd linear buffer rather than decrypting the data in place in the sk_buff (which may corrupt the original buffer). Fixes: 24481a7f5733 ("rxrpc: Fix conn-level packet handling to unshare RESPONSE packets") Reported-by: Hyunwoo Kim <imv4bel@gmail.com> Closes: https://lore.kernel.org/r/afKV2zGR6rrelPC7@v4bel/ Signed-off-by: David Howells <dhowells@redhat.com> cc: Simon Horman <horms@kernel.org> cc: Jiayuan Chen <jiayuan.chen@linux.dev> cc: linux-afs@lists.infradead.org cc: stable@kernel.org Reviewed-by: Jeffrey Altman <jaltman@auristor.com> Tested-by: Marc Dionne <marc.dionne@auristor.com> Link: https://patch.msgid.link/20260515230516.2718212-4-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	rxrpc: Fix DATA decrypt vs splice() by copying data to buffer in recvmsg	David Howells	6	-97/+96
	[ Upstream commit d2bc90cf6c75cb96d2ce549be6c35efa3099d25b ] This improves the fix for CVE-2026-43500. Fix the pagecache corruption from in-place decryption of a DATA packet transmitted locally by splice() by getting rid of the packet sharing in the I/O thread and unconditionally extracting the packet content into a bounce buffer in which the buffer is decrypted. recvmsg() (or the kernel equivalent) then copies the data from the bounce buffer to the destination buffer. The sk_buff then remains unmodified. This has an additional advantage in that the packet is then arranged in the buffer with the correct alignment required for the crypto algorithms to process directly. The performance of the crypto does seem to be a little faster and, surprisingly, the unencrypted performance doesn't seem to change much - possibly due to removing complexity from the I/O thread. Yet another advantage is that the I/O thread doesn't have to copy packets which would slow down packet distribution, ACK generation, etc.. The buffer belongs to the call and is allocated initially at 2K, sufficiently large to hold a whole jumbo subpacket, but the buffer will be increased in size if needed. However, to take this work, MSG_PEEK may cause a later packet to be decrypted into the buffer, in which case the earlier one will need re-decrypting for a subsequent recvmsg(). Note that rx_pkt_offset may legitimately see 0 as a valid offset now, so switch to using USHRT_MAX to indicate an invalid offset. Note also that I would generally prefer to replace the buffers of the current sk_buff with a new kmalloc'd buffer of the right size, ditching the old data and frags as this makes the handling of MSG_PEEK easier and removes the re-decryption issue, but this looks like quite a complicated thing to achieve. skb_morph() looks half way to what I want, but I don't want to have to allocate a new sk_buff. Fixes: d0d5c0cd1e71 ("rxrpc: Use skb_unshare() rather than skb_cow_data()") Reported-by: Hyunwoo Kim <imv4bel@gmail.com> Closes: https://lore.kernel.org/r/afKV2zGR6rrelPC7@v4bel/ Signed-off-by: David Howells <dhowells@redhat.com> cc: Simon Horman <horms@kernel.org> cc: Jiayuan Chen <jiayuan.chen@linux.dev> cc: linux-afs@lists.infradead.org Reviewed-by: Jeffrey Altman <jaltman@auristor.com> Tested-by: Marc Dionne <marc.dionne@auristor.com> Link: https://patch.msgid.link/20260515230516.2718212-3-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 8bfab4b6ffc2 ("rxrpc: Fix RESPONSE packet verification to extract skb to a linear buffer") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	xfrm: esp: restore combined single-frag length gate	Jingguo Tan	2	-4/+4
	commit dfa0d7b0ff1eb6b2c416b8fdb9b4f2cefba57a40 upstream. The ESP out-of-place fast path appends the trailer in esp_output_head() before esp_output_tail() allocates the destination page frag. The head-side gate currently checks skb->data_len and tailen separately, but the tail code allocates a single destination frag from the combined post-trailer skb->data_len. Reject the page-frag fast path when the combined aligned length exceeds a page. Otherwise skb_page_frag_refill() may fall back to a single page while the destination sg still spans the combined skb->data_len. Restore this combined-length page gate for both IPv4 and IPv6. Fixes: 5bd8baab087d ("esp: limit skb_page_frag_refill use to a single page") Cc: stable@vger.kernel.org Signed-off-by: Lin Ma <malin89@huawei.com> Signed-off-by: Chenyuan Mi <michenyuan@huawei.com> Signed-off-by: Jingguo Tan <tanjingguo@huawei.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	netfilter: conntrack: tcp: do not force CLOSE on invalid-seq RST without ↵	Hamza Mahfooz	1	-1/+2
	direction check commit bed6e04be8e6b9133d8b16d5a42d0e0ce674fa9a upstream. An unintended behavior in the TCP conntrack state machine allows a connection to be forced into the CLOSE state using an RST packet with an invalid sequence number. Specifically, after a SYN packet is observed, an RST with an invalid SEQ can transition the conntrack entry to TCP_CONNTRACK_CLOSE, regardless of whether the RST corresponds to the expected reply direction. The relevant code path assumes the RST is a response to an outgoing SYN, but does not validate packet direction or ensure that a matching SYN was actually sent in the opposite direction. As a result, a crafted packet sequence consisting of a SYN followed by an invalid-sequence RST can prematurely terminate an active NAT entry. This makes connection teardown easier than intended. So, tighten the state transition logic to ensure that RST-triggered CLOSE transitions only occur when the RST is a valid response to a previously observed SYN in the correct direction. Cc: stable@vger.kernel.org Fixes: 9fb9cbb1082d ("[NETFILTER]: Add nf_conntrack subsystem.") Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	xfrm: ah: use skb_to_full_sk in async output callbacks	Michael Bommarito	2	-2/+2
	commit 79d8be262377f7112cfa3088dfc4142d5a2533f3 upstream. When AH output is offloaded to an asynchronous crypto provider (hardware accelerators such as AMD CCP, or a forced-async software shim used for testing), the digest completion fires ah_output_done() / ah6_output_done() on a workqueue. The egress skb at that point may have been originated by a TCP listener sending a SYN-ACK, which sets skb->sk to a request_sock via skb_set_owner_edemux(); it may also have been originated by an inet_timewait_sock retransmit. Neither is a full struct sock, and passing the raw skb->sk to xfrm_output_resume() then forwards a non-full socket through the rest of the xfrm output chain. xfrm_output_resume() and its downstream consumers expect a full sk where they dereference at all. The natural egress path through ah_output_done() does not crash today because the consumers that read past sock_common are either gated by sk_fullsock() or short-circuit on flags that are clear on a fresh request_sock; an exhaustive walk of the 50 most plausible consumers under sch_fq, dev_queue_xmit, netfilter, tc-egress and cgroup-egress BPF found no current unguarded deref. The bug is still a real type confusion that future consumer changes could turn into a memory-corruption primitive. This is the same bug class fixed for ESP in commit 1620c88887b1 ("xfrm: Fix the usage of skb->sk"). Apply the analogous fix to AH: convert skb->sk to a full socket pointer (or NULL) via skb_to_full_sk() before handing it to xfrm_output_resume(). The same async AH callbacks were touched recently for an independent ESN-related ICV layout bug in commit ec54093e6a8f ("xfrm: ah: account for ESN high bits in async callbacks"); the sk type-confusion addressed here is orthogonal. This patch is part of an ongoing audit of the AH callback paths; an ah_output ihl-validation hardening series is also currently under review on netdev. Reproduced under UML + KASAN + lockdep with a forced-async hmac(sha1) shim that registers at priority 9999 and wraps the sync in-tree hmac-sha1-lib. With the shim loaded, ah_output_done runs on every SYN-ACK egress through a transport-mode AH SA and skb->sk arrives as a request_sock (TCP_NEW_SYN_RECV); after this patch, xfrm_output_resume() receives the listener (the result of sk_to_full_sk()) and consumer derefs land on full-sock fields as intended. Fixes: 9ab1265d5231 ("xfrm: Use actual socket sk instead of skb socket for xfrm_output_resume") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	xfrm: route MIGRATE notifications to caller's netns	Maoyi Xie	4	-9/+8
	commit 7e2a4f7ca0952820731ef7bdadfc9a9e9d3571b4 upstream. xfrm_send_migrate() in net/xfrm/xfrm_user.c and pfkey_send_migrate() in net/key/af_key.c both hardcode &init_net for the multicast that announces a successful XFRM_MSG_MIGRATE / SADB_X_MIGRATE. XFRM_MSG_MIGRATE arrives on a per-netns NETLINK_XFRM socket, and the rest of the xfrm/af_key netlink path was made netns-aware in 2008. The other 14 multicast paths in xfrm_user.c route their event using xs_net(x), xp_net(xp) or sock_net(skb->sk); only the migrate path was missed. Two consequences of the init_net hardcoding: 1. The notification (selector, old/new endpoint addresses, and the km_address) is delivered to listeners on init_net's XFRMNLGRP_MIGRATE / pfkey BROADCAST_ALL groups rather than on the issuing netns. An IKE daemon running in init_net therefore receives migration notifications originating from any other netns on the host. 2. An IKE daemon running inside a non-init netns and subscribed to its own XFRMNLGRP_MIGRATE / pfkey groups never receives the notification of its own migration. IKEv2 MOBIKE / address-update handling inside a netns is silently broken. Thread struct net through km_migrate() and the xfrm_mgr.migrate function pointer, drop the &init_net override in xfrm_send_migrate() and pfkey_send_migrate(), and pass the caller's net (already in scope in xfrm_migrate() via sock_net(skb->sk)) all the way down. struct xfrm_mgr is in-tree only and not exported as a stable API, so the function-pointer signature change is internal. pfkey_broadcast() is already netns-aware via net_generic(net, pfkey_net_id) since the pernet conversion. The five other pfkey_broadcast() callers in af_key.c already pass xs_net(x), sock_net(sk) or a per-netns net, so this only removes the &init_net outlier. Fixes: 5c79de6e79cd ("[XFRM]: User interface for handling XFRM_MSG_MIGRATE") Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Maoyi Xie <maoyi.xie@ntu.edu.sg> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	nfc: hci: fix out-of-bounds read in HCP header parsing	Ashutosh Desai	2	-0/+20
	commit f040e590c035bfd9553fe79ee9585caf1b14d67b upstream. Both nfc_hci_recv_from_llc() and nci_hci_data_received_cb() read packet->header from skb->data at function entry without first checking that the buffer holds at least one byte. A malicious NFC peer can send a 0-byte HCP frame that passes through the SHDLC layer and reaches these functions, causing an out-of-bounds heap read of packet->header. The same 0-byte frame, if queued as a non-final fragment, also causes the reassembly loop to underflow msg_len to UINT_MAX, triggering skb_over_panic() when the reassembled skb is written. Fix this by adding a pskb_may_pull() check at the entry of each function before packet->header is first accessed. The existing pskb_may_pull() checks before the reassembled hcp_skb is cast to struct hcp_packet remain in place to guard the 2-byte HCP message header. Fixes: 8b8d2e08bf0d ("NFC: HCI support") Fixes: 11f54f228643 ("NFC: nci: Add HCI over NCI protocol support") Cc: stable@vger.kernel.org Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Ashutosh Desai <ashutoshdesai993@gmail.com> Link: https://patch.msgid.link/20260505170712.96560-1-ashutoshdesai993@gmail.com Signed-off-by: David Heidelberg <david@ixit.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	net: skbuff: fix missing zerocopy reference in pskb_carve helpers	Minh Nguyen	1	-0/+4
	commit 98d0912e9f841e5529a5b89a972805f34cb1c69d upstream. pskb_carve_inside_header() and pskb_carve_inside_nonlinear() both copy the old skb_shared_info header into a new buffer via memcpy(), which includes the destructor_arg pointer (uarg) for MSG_ZEROCOPY skbs. Neither function calls net_zcopy_get() for the new shinfo, creating an unaccounted holder: every skb_shared_info with destructor_arg set will call skb_zcopy_clear() once when freed, but the corresponding net_zcopy_get() was never called for the new copy. Repeated calls drive uarg->refcnt to zero prematurely, freeing ubuf_info_msgzc while TX skbs still hold live destructor_arg pointers. KASAN reports use-after-free on a freed ubuf_info_msgzc: BUG: KASAN: slab-use-after-free in skb_release_data+0x77b/0x810 Read of size 8 at addr ffff88801574d3e8 by task poc/220 Call Trace: skb_release_data+0x77b/0x810 kfree_skb_list_reason+0x13e/0x610 skb_release_data+0x4cd/0x810 sk_skb_reason_drop+0xf3/0x340 skb_queue_purge_reason+0x282/0x440 rds_tcp_inc_free+0x1e/0x30 rds_recvmsg+0x354/0x1780 __sys_recvmsg+0xdf/0x180 Allocated by task 219: msg_zerocopy_realloc+0x157/0x7b0 tcp_sendmsg_locked+0x2892/0x3ba0 Freed by task 219: ip_recv_error+0x74a/0xb10 tcp_recvmsg+0x475/0x530 The skb consuming the late access still referenced the same uarg via shinfo->destructor_arg copied by pskb_carve_inside_nonlinear() without a refcount bump. This has been verified to be reliably exploitable: a working proof-of-concept achieves full root privilege escalation from an unprivileged local user on a default kernel configuration. The fix follows the pattern of pskb_expand_head() which has the same memcpy/cloned structure. For pskb_carve_inside_header(), net_zcopy_get() is placed after skb_orphan_frags() succeeds, so the orphan error path needs no cleanup. For pskb_carve_inside_nonlinear(), net_zcopy_get() is placed after all failure points and just before skb_release_data(), so no error path needs cleanup at all -- matching pskb_expand_head() more closely and avoiding the need for a balancing net_zcopy_put(). Fixes: 6fa01ccd8830 ("skbuff: Add pskb_extract() helper function") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-sonnet-4-6 Signed-off-by: Minh Nguyen <minhnguyen.080505@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260526041240.329462-1-minhnguyen.080505@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	ip6: vti: Use ip6_tnl.net in vti6_changelink().	Kuniyuki Iwashima	1	-5/+7
	commit 11b326fb0a374f4654f9be22d0f0f7abd9f7d3fe upstream. ip netns add ns1 ip netns add ns2 ip -n ns1 link add vti6_test type vti6 remote ::1 local ::2 key 7 ip -n ns1 link set vti6_test netns ns2 ip -n ns2 link set vti6_test type vti6 remote ::3 local ::4 key 9 ip netns del ns2 ip netns del ns1 [ 132.495484] ------------[ cut here ]------------ [ 132.497609] kernel BUG at net/core/dev.c:12376! Commit 61220ab34948 ("vti6: Enable namespace changing") dropped NETIF_F_NETNS_LOCAL from vti6 devices. A vti6 tunnel can then move through IFLA_NET_NS_FD. After the move dev_net(dev) points at the new netns while t->net stays at the creation netns. vti6_changelink() and vti6_update() still use dev_net(dev) and dev_net(t->dev). They unlink from one per netns hash and relink into another. The creation netns is left with a stale entry. cleanup_net() of that netns later walks freed memory. Reachable from an unprivileged user namespace (unshare --user --map-root-user --net). Cross tenant scope on container hosts. Fixes: 61220ab34948 ("vti6: Enable namespace changing") Reported-by: Maoyi Xie <maoyi.xie@ntu.edu.sg> Reviewed-by: Eric Dumazet <edumazet@google.com> Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260521130555.3421684-2-maoyixie.tju@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	l2tp: use refcount_inc_not_zero in l2tp_session_get_by_ifname	Michael Bommarito	1	-5/+6
	commit 05f95729ca844704d15e49ce14868af4b403b32b upstream. A reader in l2tp_session_get_by_ifname() can return a pointer to a session whose refcount has reached zero. The getter takes its reference with plain refcount_inc(), but every other session getter in the same file (l2tp_v2_session_get, l2tp_v3_session_get, and the corresponding _get_next variants) uses refcount_inc_not_zero() because the IDR/RCU lookup can race with refcount_dec_and_test() -> l2tp_session_free() -> kfree_rcu(). The ifname getter is the only outlier; the inconsistency was raised on-list after 979c017803c4 ("l2tp: use list_del_rcu in l2tp_session_unhash"). A reader inside rcu_read_lock_bh() that matches session->ifname can be preempted between the strcmp() and the refcount_inc(). If the last reference drops on another CPU in that window, the reader's refcount_inc() runs on a counter that has reached zero. refcount_t catches the addition-on-zero, prints "refcount_t: addition on 0; use-after-free", saturates the counter, and returns the saturated pointer to the caller. Session memory is held live by the in-flight RCU read section, but the kfree_rcu() callback queued from l2tp_session_free() will free it once the grace period closes; a caller that dereferences the returned session past that point hits a slab-use-after-free. On PREEMPT_RT local_bh_disable() is a per-CPU sleeping lock and the preemption window is real; on stock PREEMPT kernels local_bh_disable() is a preempt_count increment that closes the cross-CPU race in practice (see below). Use refcount_inc_not_zero() and continue the list walk on failure, matching the other session getters in the file. The ifname getter is the only session getter in net/l2tp/ that still uses the bare refcount_inc() pattern; this change restores file-internal consistency. The success path is unchanged. Fixes: abe7a1a7d0b6 ("l2tp: improve tunnel/session refcount helpers") Cc: stable@vger.kernel.org Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Reviewed-by: James Chapman <jchapman@katalix.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260523023423.2568972-1-michael.bommarito@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	xfrm: input: hold netns during deferred transport reinjection	Zhengchuan Liang	1	-4/+12
	commit c16f74dc1d75d0e2e7670076d5375deda110ebeb upstream. Transport-mode reinjection stores a struct net pointer in skb->cb and uses it later from xfrm_trans_reinject(). That pointer must stay valid until the deferred callback runs. Take a netns reference when queueing deferred reinjection work and drop it after the callback completes. Use maybe_get_net() so the queueing path does not revive a namespace that is already being torn down. This keeps the existing workqueue design and fixes the netns lifetime handling in one place for all users of xfrm_trans_queue_net(). Fixes: 7b3801927e52 ("xfrm: introduce xfrm_trans_queue_net") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Co-developed-by: Luxing Yin <tr0jan@lzu.edu.cn> Signed-off-by: Luxing Yin <tr0jan@lzu.edu.cn> Signed-off-by: Zhengchuan Liang <zcliangcn@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Assisted-by: Codex:gpt-5.4 Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	ipv6: validate extension header length before copying to cmsg	Qi Tang	1	-8/+46
	commit dd433671fef381fdaf7b530c631e6b782d66e224 upstream. ip6_datagram_recv_specific_ctl() builds IPV6_{HOPOPTS,DSTOPTS,RTHDR} cmsgs (and their IPV6_2292* legacy counterparts) by trusting the on-wire hdrlen byte (ptr[1]) when computing the put_cmsg() length. The length was validated only at parse time (ipv6_parse_hopopts(), etc.). An nftables payload-write expression can rewrite hdrlen after parsing and before the skb reaches recvmsg; the write itself is in-bounds but put_cmsg() then reads up to ((hdrlen+1) << 3) = 2040 bytes from an 8-byte header. nftables is reachable from an unprivileged user namespace, so this is an unprivileged slab-out-of-bounds read: BUG: KASAN: slab-out-of-bounds in put_cmsg+0x3ac/0x540 put_cmsg+0x3ac/0x540 udpv6_recvmsg+0xca0/0x1250 sock_recvmsg+0xdf/0x190 ____sys_recvmsg+0x1b1/0x620 Add ipv6_get_exthdr_len() which validates that at least two bytes are accessible before reading the hdrlen field, then checks the computed length against skb_tail_pointer(skb), returning 0 on failure. Extension headers are kept in the linear skb area by pskb_may_pull() during input, so skb_tail_pointer() is the correct bound. Use ipv6_get_exthdr_len() at all non-AH call sites: the five standalone cmsg blocks (HbH, 2292HbH, 2292DSTOPTS x2, 2292RTHDR) and the three standard cases in the extension-header walk loop (DSTOPTS, ROUTING, default). AH retains an inline bounds check because its length formula differs ((ptr[1]+2)<<2). The walk loop also gets a pre-read bounds check at the top to validate ptr before any case accesses ptr[0] or ptr[1]. When the walk loop detects a corrupted header, return from the function instead of continuing to process later socket options. Cc: stable@vger.kernel.org Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Qi Tang <tpluszz77@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260523143245.2281415-1-tpluszz77@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	ip6: vti: Use ip6_tnl.net in vti6_siocdevprivate().	Maoyi Xie	1	-2/+9
	commit 8b484efd5cb4eeef9021a661e198edc5349dacf6 upstream. After patch 1/2 in this series, vti6_update() unlinks and relinks the tunnel through t->net. vti6_siocdevprivate() still uses dev_net(dev) for the collision lookup. For a tunnel moved through IFLA_NET_NS_FD, dev_net(dev) is the new netns, not t->net. SIOCCHGTUNNEL on a migrated tunnel then runs: net = dev_net(dev) /* migrated netns / t = vti6_locate(net, &p1, false) / misses target in t->net / ... t = netdev_priv(dev) vti6_update(t, &p1, false) / mutates t->net's hash */ A caller in the migrated netns picks params that match a tunnel in the creation netns. The lookup in dev_net(dev) finds nothing. vti6_update() prepends the migrated tunnel at the head of the creation netns hash bucket for those params. Later lookups in the creation netns resolve to the migrated device. xfrm receive delivers the matched packets through a device the caller controls. Reachable from an unprivileged user namespace (unshare --user --map-root-user --net). Cross tenant scope on container hosts. Switch the SIOCCHGTUNNEL path on a non fallback device to use t->net for the lookup. The lookup now matches the netns vti6_update() operates on. Also add ns_capable(self->net->user_ns, CAP_NET_ADMIN) before the lookup. The check at the top of the case is against dev_net(dev)->user_ns, which after migration is the attacker's netns. A caller there can pick params absent from self->net, the lookup returns NULL, t becomes self, and vti6_update() inserts the device into the creation netns hash. The new check requires CAP_NET_ADMIN in the creation netns user_ns too. SIOCADDTUNNEL and SIOCCHGTUNNEL on the fallback device keep dev_net(dev), which equals init_net there. Fixes: 61220ab34948 ("vti6: Enable namespace changing") Suggested-by: Jakub Kicinski <kuba@kernel.org> Suggested-by: Xiao Liang <shaw.leon@gmail.com> Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com> Link: https://patch.msgid.link/20260521130555.3421684-3-maoyixie.tju@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	ipv6: exthdrs: refresh nh after handling HAO option	Zhengchuan Liang	1	-0/+2
	commit f7b52afe3592eae66e160586b45a3f2242972c63 upstream. ip6_parse_tlv() caches skb_network_header(skb) in nh while walking IPv6 TLVs. ipv6_dest_hao() may call pskb_expand_head() for a cloned skb, which can move the skb head and invalidate the cached network header pointer. Refresh nh after ipv6_dest_hao() returns so any trailing padding or TLVs are parsed from the current skb head. This matches the existing pattern used in ip6_parse_tlv() after helpers that can modify skb header storage. Fixes: a831f5bbc89a ("[IPV6] MIP6: Add inbound interface of home address option.") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Co-developed-by: Luxing Yin <tr0jan@lzu.edu.cn> Signed-off-by: Luxing Yin <tr0jan@lzu.edu.cn> Signed-off-by: Zhengchuan Liang <zcliangcn@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Reviewed-by: Justin Iurman <justin.iurman@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/7aba1debc2196189172499e5769802b026f8caf8.1779247873.git.zcliangcn@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	ipv6: exthdrs: refresh nh pointer after ipv6_hop_jumbo()	Justin Iurman	1	-0/+2
	commit d47548a36639095939f4747d4c43f2271366f565 upstream. ipv6_hop_jumbo() calls pskb_trim_rcsum(), which can change skb pointers. Let's recompute nh pointer to make sure any change won't mess things up. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Justin Iurman <justin.iurman@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20260522112013.12342-1-justin.iurman@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	bpf: sockmap: fix tail fragment offset in bpf_msg_push_data	Yuqi Xu	1	-1/+1
	commit f72eed9b84fb771019a955908132410a9ba9ea3f upstream. When bpf_msg_push_data() inserts data in the middle of a scatterlist entry, it splits the original entry into a left fragment and a right fragment. The right fragment offset is page-local, but the code advances it with `start`, which is the message-global insertion point. For inserts into a non-first SG entry, this over-advances the offset and leaves the split layout inconsistent. Advance the right fragment offset by the fragment-local delta, `start - offset`, which matches the length removed from the front of the original entry. Fixes: 6fff607e2f14 ("bpf: sk_msg program helper bpf_msg_push_data") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Zhengchuan Liang <zcliangcn@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Yuqi Xu <xuyq21@lenovo.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Link: https://patch.msgid.link/8b129d10566aa3eb43f61a8f9757bcf51707d324.1779636774.git.xuyq21@lenovo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	Bluetooth: hci_sync: fix UAF in hci_le_create_cis_sync	Doruk Tan Ozturk	1	-1/+3
	commit bfea6091e0fffb270c20e74384b660910277eb6c upstream. hci_le_create_cis_sync() dereferences conn->conn_timeout after releasing both rcu_read_lock() and hci_dev_lock(hdev). The conn pointer was obtained from an RCU-protected iteration over hdev->conn_hash.list and is not valid once these locks are dropped. A concurrent disconnect can free the hci_conn between the unlock and the dereference, causing a use-after-free read. The cancellation mechanism in hci_conn_del() cannot prevent this because hci_le_create_cis_pending() queues hci_create_cis_sync with data=NULL: hci_cmd_sync_queue(hdev, hci_create_cis_sync, NULL, NULL); While hci_conn_del() dequeues with data=conn: hci_cmd_sync_dequeue(hdev, NULL, conn, NULL); Since NULL != conn, the lookup in _hci_cmd_sync_lookup_entry() never matches, and the pending work item is not cancelled. Fix this by saving conn->conn_timeout into a local variable while the locks are still held, so the stale conn pointer is never dereferenced after unlock. This is the same class of bug as the one fixed by commit 035c25007c9e ("Bluetooth: hci_sync: Fix UAF on le_read_features_complete") which addressed the identical pattern in a different function. This vulnerability was identified using 0sec.ai, an open-source automated security auditing platform (https://github.com/0sec-labs). Fixes: c09b80be6ffc ("Bluetooth: hci_conn: Fix not waiting for HCI_EVT_LE_CIS_ESTABLISHED") Cc: stable@vger.kernel.org Reported-by: Doruk Tan Ozturk <doruk@0sec.ai> Signed-off-by: Doruk Tan Ozturk <doruk@0sec.ai> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	Bluetooth: ISO: serialize iso_sock_clear_timer with socket lock	Muhammad Bilal	1	-1/+1
	commit 4b5f8e608749b7e8fa386c6e4301cf9272595859 upstream. iso_sock_close() calls iso_sock_clear_timer() before acquiring lock_sock(sk). iso_sock_clear_timer() reads iso_pi(sk)->conn twice without the socket lock held: if (!iso_pi(sk)->conn) return; cancel_delayed_work(&iso_pi(sk)->conn->timeout_work); Concurrently, iso_conn_del() executes under lock_sock(sk) and calls iso_chan_del(), which sets iso_pi(sk)->conn to NULL and may result in the final reference to the connection being dropped: CPU0 CPU1 ---- ---- iso_sock_clear_timer() if (conn != NULL) ... lock_sock(sk) iso_chan_del() iso_pi(sk)->conn = NULL cancel_delayed_work(conn) /* NULL deref or UAF */ iso_pi(sk)->conn is not stable across the unlock window, causing a NULL pointer dereference or use-after-free. Serialize iso_sock_clear_timer() with the socket lock by moving it inside lock_sock()/release_sock(), matching the pattern used in iso_conn_del() and all other call sites. Fixes: ccf74f2390d60a2f9a75ef496d2564abb478f46a ("Bluetooth: Add BTPROTO_ISO socket type") Cc: stable@vger.kernel.org Signed-off-by: Muhammad Bilal <meatuni001@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	Bluetooth: ISO: fix UAF in iso_recv_frame	Muhammad Bilal	1	-3/+7
	commit 47f23a259517abbdb8032c057a1e8a6bf3734878 upstream. iso_recv_frame reads conn->sk under iso_conn_lock but releases the lock before using sk, with no reference held. A concurrent iso_sock_kill() can free sk in that window, causing use-after-free on sk->sk_state and sock_queue_rcv_skb(). Fix by replacing the bare pointer read with iso_sock_hold(conn), which calls sock_hold() while the spinlock is held, atomically elevating the refcount before the lock drops. Add a drop_put label so sock_put() is called on all exit paths where the hold succeeded. Fixes: ccf74f2390d60a2f9a75ef496d2564abb478f46a ("Bluetooth: Add BTPROTO_ISO socket type") Cc: stable@vger.kernel.org Signed-off-by: Muhammad Bilal <meatuni001@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	Bluetooth: HIDP: fix missing length checks in hidp_input_report()	Muhammad Bilal	1	-5/+18
	commit 2a3ac9ee11dbb9845f3947cef4a79dba658cf6f6 upstream. hidp_input_report() reads keyboard and mouse payload data from an skb without first verifying that skb->len contains enough data. hidp_recv_intr_frame() pulls the 1-byte HIDP header before dispatching to hidp_input_report(). If a paired device sends a truncated packet, the handler reads beyond the valid skb data, resulting in an out-of-bounds read of skb data. The OOB bytes may be interpreted as phantom key presses or spurious mouse movement. Replace the open-coded length tracking and pointer arithmetic with skb_pull_data() calls. skb_pull_data() returns NULL if the requested bytes are not present, eliminating the need for a manual size variable and the separate skb->len guard. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Muhammad Bilal <meatuni001@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	Bluetooth: L2CAP: fix chan ref leak in l2cap_chan_timeout() on !conn	Siwei Zhang	1	-1/+3
	commit 9dbd84990394c51f5cee1e8871bb5ff8af5ed939 upstream. __set_chan_timer() takes a l2cap_chan reference via l2cap_chan_hold() before scheduling the delayed work. The normal path in l2cap_chan_timeout() drops this reference with l2cap_chan_put() at the end, but the early return when chan->conn is NULL skips the put, leaking the reference. Add the missing l2cap_chan_put() before the early return. Fixes: adf0398cee86 ("Bluetooth: l2cap: fix null-ptr-deref in l2cap_chan_timeout") Cc: stable@vger.kernel.org Signed-off-by: Siwei Zhang <oss@fourdim.xyz> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	Bluetooth: L2CAP: use chan timer to close channels in cleanup_listen()	Siwei Zhang	1	-7/+9
	commit 8c8e620467a7b51562dbcefbd1f09f288d7d710d upstream. l2cap_chan_close() removes the channel from conn->chan_l, which must be done under conn->lock. cleanup_listen() runs under the parent sk_lock, so acquiring conn->lock would invert the established conn->lock -> chan->lock -> sk_lock order. Instead of calling l2cap_chan_close() directly, schedule l2cap_chan_timeout with delay 0 to close the channel asynchronously. The timeout handler already acquires conn->lock and chan->lock in the correct order. The timer is only armed when chan->conn is still set: if it is already NULL, l2cap_conn_del() has already processed this channel (l2cap_chan_del + l2cap_sock_teardown_cb + l2cap_sock_close_cb), so there is nothing left to do. If l2cap_conn_del() races in after the timer is armed, __clear_chan_timer() inside l2cap_chan_del() cancels it; if the timer has already fired, the handler returns harmlessly because chan->conn was cleared. Fixes: 3df91ea20e74 ("Bluetooth: Revert to mutexes from RCU list") Cc: <stable@vger.kernel.org> # 0b58004: Bluetooth: fix UAF in l2cap_sock_cleanup_listen() vs l2cap_conn_del() Signed-off-by: Siwei Zhang <oss@fourdim.xyz> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 days	batman-adv: tt: prevent TVLV entry number overflow	Sven Eckelmann	1	-3/+17
	commit 99d9958fa10fb684b2a8e2c48a8d704122721420 upstream. The helpers to prepare the buffers for the local and global TT based replies are trying to sum up all TT entries which can be found for each VLAN. In theory, this sum can be too big for an u16 and therefore overflow. A too small buffer would then be allocated for the TVLV. The too small buffer will be handled gracefully by batadv_tt_tvlv_generate() and is not causing a buffer overflow - just a truncated reply. But this overflow shouldn't have happened in the first and the too small buffer should never have been allocated when an overflow was detected. Cc: stable@kernel.org Fixes: 7ea7b4a14275 ("batman-adv: make the TT CRC logic VLAN specific") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	inet: frags: flush pending skbs in fqdir_pre_exit()	Jakub Kicinski	2	-5/+43
	[ Upstream commit 006a5035b495dec008805df249f92c22c89c3d2e ] We have been seeing occasional deadlocks on pernet_ops_rwsem since September in NIPA. The stuck task was usually modprobe (often loading a driver like ipvlan), trying to take the lock as a Writer. lockdep does not track readers for rwsems so the read wasn't obvious from the reports. On closer inspection the Reader holding the lock was conntrack looping forever in nf_conntrack_cleanup_net_list(). Based on past experience with occasional NIPA crashes I looked thru the tests which run before the crash and noticed that the crash follows ip_defrag.sh. An immediate red flag. Scouring thru (de)fragmentation queues reveals skbs sitting around, holding conntrack references. The problem is that since conntrack depends on nf_defrag_ipv6, nf_defrag_ipv6 will load first. Since nf_defrag_ipv6 loads first its netns exit hooks run _after_ conntrack's netns exit hook. Flush all fragment queue SKBs during fqdir_pre_exit() to release conntrack references before conntrack cleanup runs. Also flush the queues in timer expiry handlers when they discover fqdir->dead is set, in case packet sneaks in while we're running the pre_exit flush. The commit under Fixes is not exactly the culprit, but I think previously the timer firing would eventually unblock the spinning conntrack. Fixes: d5dd88794a13 ("inet: fix various use-after-free in defrags units") Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20251207010942.1672972-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Rajani Kantha <681739313@139.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	inet: frags: add inet_frag_queue_flush()	Jakub Kicinski	2	-8/+13
	[ Upstream commit 1231eec6994be29d6bb5c303dfa54731ed9fc0e6 ] Instead of exporting inet_frag_rbtree_purge() which requires that caller takes care of memory accounting, add a new helper. We will need to call it from a few places in the next patch. Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20251207010942.1672972-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Rajani Kantha <681739313@139.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	batman-adv: bla: avoid double decrement of bla.num_requests	Sven Eckelmann	3	-24/+67
	commit 83ab69bd12b80f6ea169c8bea6977701b53a043d upstream. The bla.num_requests is increased when no request_sent was in progress. And it is decremented in various places (announcement was received, backbone is purged, periodic work). But the check if the request_sent is actually set to a specific state and the atomic_dec/_inc are not safe because they are not atomic (TOCTOU) and multiple such code portions can run concurrently. At the same time, it is necessary to modify request_sent (state) and bla.num_requests atomically. Otherwise batadv_bla_send_request() might set request_sent to 1 and is interrupted. batadv_handle_announce() can then set request_sent back to 0 and decrement num_requests before batadv_bla_send_request() incremented it. The two operations must therefore be locked. And since state (request_sent) and wait_periods are only accessed inside this lock, they can be converted to simpler datatypes. And to avoid that the bla.num_requests is touched by a parallel running context with a valid backbone_gw reference after batadv_bla_purge_backbone_gw() ran, a third state "stopped" is required to correctly signal that a backbone_gw is in the state of being cleaned up. Cc: stable@kernel.org Fixes: 23721387c409 ("batman-adv: add basic bridge loop avoidance code") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	batman-adv: tt: avoid empty VLAN responses	Sven Eckelmann	1	-3/+18
	commit fa1bd704940b5bcbc32c0b28db9167405c8ee5e0 upstream. The commit 16116dac2339 ("batman-adv: prevent TT request storms by not sending inconsistent TT TLVLs") added checks to the local (direct) TT response code. But the response can also be done indirectly by another node using the global TT state. To avoid such inconsistency states reported in the original fix, also avoid sending empty VLANs for replies from the global TT state. Cc: stable@kernel.org Fixes: 7ea7b4a14275 ("batman-adv: make the TT CRC logic VLAN specific") [ Context, drop flex array dependency ] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	batman-adv: tt: fix TOCTOU race for reported vlans	Sven Eckelmann	1	-4/+10
	commit 94d27005016be15ffc638b2ecbc4d58805ad7b48 upstream. The local TT based TVLV is generated by first checking the number of VLANs which have at least one TT entry. A new buffer with the correct size for the VLANs is then allocated. Only then, the list of VLANs s used to fill the VLAN entries in the buffer. During this time, the meshif_vlan_list_lock is held. But the actual number of TT entries of each VLAN can still increase during this time - just not the number of VLANs in the list. But the prefilter used in the buffer size calculation might still cause an increase of the number of VLANs which need to be stored. Simply because a VLAN might now suddenly have at least one entry when it had none in the pre-alloc check - and then needs to occupy space which was not allocated. It is better to overestimate the buffer size at the beginning and then fill the buffer only with the VLANs which are not empty. Cc: stable@kernel.org Fixes: 16116dac2339 ("batman-adv: prevent TT request storms by not sending inconsistent TT TLVLs") [ Context, drop flex array dependency ] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	batman-adv: tp_meter: directly shut down timer on cleanup	Sven Eckelmann	1	-7/+1
	commit d5487249a81ea658717614009c8f46acc5b7101a upstream. batadv_tp_sender_cleanup() was calling timer_delete_sync() followed by timer_delete() to guard against the timer handler re-arming itself between the two calls. This double-deletion hack relied on the sending status being set to 0 to suppress re-arming. Replace both calls with a single timer_shutdown_sync(). This function both waits for any running timer callback to complete (like timer_delete_sync()) and permanently disarms the timer so it cannot be re-armed afterwards, making re-arming prevention unconditional and self-documenting. The re-arming property is also required because otherwise: 1. context 0 (batadv_tp_recv_ack()) checks in batadv_tp_reset_sender_timer() if sending is still 1 -> it is 2. context 1 changes in batadv_tp_sender_shutdown() sending to 0 and in this process forces the kthread to stop timer in batadv_tp_sender_cleanup() 3. context 0 continues in batadv_tp_reset_sender_timer() and rearms the timer -> but the reference for it is already gone Cc: stable@kernel.org Fixes: 33a3bb4a3345 ("batman-adv: throughput meter implementation") [ adapt pre-hunk to old del_timer* names ] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	batman-adv: tp_meter: avoid role confusion in tp_list	Sven Eckelmann	1	-23/+36
	commit ff24f2ecfd94c07a2b89bac497433e3b23271cac upstream. Session lookups in tp_list matched only on destination address (and optionally session ID), leaving role validation to the caller. If two sessions with the same other_end coexisted (one as sender, one as receiver) a lookup could silently return the wrong one, causing the caller's role to bail out early, potentially skipping necessary cleanup. Move the role check into the lookup functions themselves so the correct entry is always returned, or none at all. Since batadv_tp_start() legitimately needs to detect any active session to a destination regardless of role, introduce a dedicated helper for that case rather than bending the existing lookup semantics. Cc: stable@kernel.org Fixes: 33a3bb4a3345 ("batman-adv: throughput meter implementation") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	batman-adv: iv: recover OGM scheduling after forward packet error	Sven Eckelmann	2	-19/+60
	commit aa3153bd139a6c48667dcd02608d3b2c80bff02c upstream. When batadv_iv_ogm_schedule_buff() fails to allocate and queue a forward packet for OGM transmission, the work item that drives periodic OGM scheduling is never re-armed. This silently halts transmission of the node's own OGMs on the affected interface — only OGMs from other peers continue to be aggregated and forwarded. Fix this by tracking whether batadv_iv_ogm_queue_add() (and transitively batadv_iv_ogm_aggregate_new()) successfully scheduled a forward packet. When scheduling fails, batadv_iv_ogm_schedule_buff() falls back to queuing a dedicated recovery work item (reschedule_work) that fires after one originator interval and calls batadv_iv_ogm_schedule() again. Cc: stable@kernel.org Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	batman-adv: tvlv: reject oversized TVLV packets	Sven Eckelmann	1	-3/+8
	commit f50487e3566358b2b982b7801945e858c78ad9ab upstream. batadv_tvlv_container_ogm_append() builds a TVLV packet section from the tvlv.container_list. The total size of this section is computed by batadv_tvlv_container_list_size(), which sums the sizes of all registered containers. The return type and accumulator in batadv_tvlv_container_list_size() were u16. If the accumulated size exceeds U16_MAX, the value wraps around, causing the subsequent allocation in batadv_tvlv_container_ogm_append() to be undersized. The memcpy-style copy that follows would then write beyond the end of the allocated buffer, corrupting kernel memory. Fix this by widening the return type of batadv_tvlv_container_list_size() to size_t. In batadv_tvlv_container_ogm_append(), check the computed length against U16_MAX before proceeding, and bail out as if the allocation had failed when the limit is exceeded. Cc: stable@kernel.org Fixes: ef26157747d4 ("batman-adv: tvlv - basic infrastructure") Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Reviewed-by: Yuan Tan <yuantan098@gmail.com> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	batman-adv: bla: avoid NULL-ptr deref for claim via dropped interface	Sven Eckelmann	1	-2/+4
	commit f80d3d98d2ff78d9e2fe5d68b1f45948c4f7bd24 upstream. Without rtnl_lock held, a hardif might be retrieved as primary interface of a meshif, but then (while operating on this interface) getting decoupled from the mesh interface. In this case, the meshif still exists but the pointer from the primary hardif to the meshif is set to NULL. The mesh_iface must be checked first to be non-NULL before continuing to send an ARP request using meshif. Cc: stable@kernel.org Fixes: 23721387c409 ("batman-adv: add basic bridge loop avoidance code") Reported-by: Ido Schimmel <idosch@nvidia.com> Reported-by: syzbot+9fdcc9f05a98a540b816@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9fdcc9f05a98a540b816 [ switch to old "mesh_iface" name "soft_iface" ] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	batman-adv: tt: reject oversized local TVLV buffers	Sven Eckelmann	1	-3/+5
	commit 1e9fab756f8395096d5bba7be0c373c4c8f5d165 upstream. The commit 3a359bf5c61d ("batman-adv: reject oversized global TT response buffers") added a check to ensure that a global return buffer size can be stored in an u16. The same buffer handling also exists for the local data buffer but was not touched. A similar check should be also be in place for the local TVLV buffer. It doesn't have the similar attack surface because it is only generated from locally discovered MAC addresses but the dynamic nature could still cause temporarily to large buffers. Cc: stable@kernel.org Fixes: 7ea7b4a14275 ("batman-adv: make the TT CRC logic VLAN specific") [ Context ] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	batman-adv: tvlv: abort OGM send on tvlv append failure	Sven Eckelmann	4	-21/+40
	commit 501368506563e151b322c8c3f228b796e615b90d upstream. batadv_tvlv_container_ogm_append() could fail in two ways: a memory allocation failure when resizing the packet buffer, or the tvlv data exceeding U16_MAX bytes. In both cases the function previously returned the old (now stale) tvlv_value_len rather than signalling an error, causing the OGM/OGM2 send path to transmit a packet whose TVLV length field no longer matched the actual buffer contents. And because it also didn't fill in the new TVLV data, sending either uninitialized or corrupted data on the wire. All errors in batadv_tvlv_container_ogm_append() must be forwarded to the caller. And the caller must abort the send of the OGM2. For B.A.T.M.A.N. IV, it is currently not allowed to abort the send. The non-TVLV part of the OGM must be queued up instead. Cc: stable@kernel.org Fixes: ef26157747d4 ("batman-adv: tvlv - basic infrastructure") [ Context ] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	batman-adv: v: stop OGMv2 on disabled interface	Sven Eckelmann	1	-12/+21
	commit f8ce8b8331a1bc44ad4905886a482214d428b253 upstream. When a batadv_hard_iface is disabled, its mesh_iface pointer is set to NULL. However, batadv_v_ogm_send_meshif() may still dispatch OGMs via batadv_v_ogm_queue_on_if() for interfaces that have since lost their mesh_iface association. This results in a NULL pointer dereference when batadv_v_ogm_queue_on_if() unconditionally calls netdev_priv() on the now NULL hard_iface->mesh_iface to retrieve the batadv_priv. It is necessary to ensure that the batadv_v_ogm_queue_on_if() checks that it is using the same mesh_iface for which batadv_v_ogm_send_meshif() was called. Cc: stable@kernel.org Fixes: 0da0035942d4 ("batman-adv: OGMv2 - add basic infrastructure") Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Reviewed-by: Yuan Tan <yuantan098@gmail.com> [ switch to old "mesh_iface" name "soft_iface" ] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	net: skbuff: fix pskb_carve leaking zcopy pages	Pavel Begunkov	1	-0/+10
	[ Upstream commit ff6e798c2eac3ebd0501ad7e796f583fab928de8 ] When SKBFL_MANAGED_FRAG_REFS is set, frag pages are not refcounted but their lifetime is controlled by the attached ubuf_info. To make a copy of the skb_shared_info, we either should clear the flag and reference the frags, or keep the flag and have frags unreferenced. pskb_carve_inside_header() and pskb_carve_inside_nonlinear() don't follow the rule and thus can leak page references. Let's clear SKBFL_MANAGED_FRAG_REFS from the original skb to fix it. It's the simplest way to address it, but there are more performant ways to do that if it ever becomes a problem. Link: https://lore.kernel.org/all/20260523085809.26331-1-nvminh232@clc.fitus.edu.vn/ Fixes: 753f1ca4e1e50 ("net: introduce managed frags infrastructure") Reported-by: Minh Nguyen <minhnguyen.080505@gmail.com> Reported-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/1e2086aa69217d7f9c8da3d38f5be7160f1b4cd1.1779993185.git.asml.silence@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	ipv6: fix possible infinite loop in fib6_select_path()	Jiayuan Chen	1	-0/+3
	[ Upstream commit 9c7da87c2dc860bb17ca1ece942495d28b1ce3b9 ] Found while auditing the same pattern Sashiko reported in rt6_fill_node() [1]. Apply the same fix as commit f8d8ce1b515a ("ipv6: fix possible infinite loop in fib6_info_uses_dev()"). Writers holding tb6_lock can list_del_rcu(&first->fib6_siblings) without waiting for RCU readers; first->fib6_siblings.next then still points into the old ring and this softirq-side walker never reaches &first->fib6_siblings as its terminator. fib6_purge_rt() always WRITE_ONCE()s first->fib6_nsiblings to 0 before list_del_rcu(), so an inside-loop check is a reliable detach signal. [1] https://sashiko.dev/#/patchset/20260526020227.4857-1-jiayuan.chen%40linux.dev Fixes: d9ccb18f83ea ("ipv6: Fix soft lockups in fib6_select_path under high next hop churn") Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20260527053133.180695-2-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	ipv6: fix possible infinite loop in rt6_fill_node()	Jiayuan Chen	1	-0/+2
	[ Upstream commit 9f72412bcf60144f252b0d6205106abf14344abc ] Sashiko reported this issue [1]. Apply the same fix as commit f8d8ce1b515a ("ipv6: fix possible infinite loop in fib6_info_uses_dev()"). Writers holding tb6_lock can list_del_rcu(&rt->fib6_siblings) without waiting for RCU readers; rt->fib6_siblings.next then still points into the old ring and this softirq-side walker never reaches &rt->fib6_siblings, causing a CPU stall. fib6_del_route() always WRITE_ONCE()s rt->fib6_nsiblings to 0 before list_del_rcu(), so an inside-loop check is a reliable detach signal. [1] https://sashiko.dev/#/patchset/20260526020227.4857-1-jiayuan.chen%40linux.dev Fixes: d9ccb18f83ea ("ipv6: Fix soft lockups in fib6_select_path under high next hop churn") Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20260527053133.180695-1-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	sctp: fix race between sctp_wait_for_connect and peeloff	Zhenghang Xiao	1	-0/+2
	[ Upstream commit f14fe6395a8b3d961a61e138ad7b36ba3626dd4e ] sctp_wait_for_connect() drops and re-acquires the socket lock while waiting for the association to reach ESTABLISHED state. During this window, another thread can peeloff the association to a new socket via getsockopt(SCTP_SOCKOPT_PEELOFF), changing asoc->base.sk. After re-acquiring the old socket lock, sctp_wait_for_connect() returns success without noticing the migration — the caller then accesses the association under the wrong lock in sctp_datamsg_from_user(). Add the same sk != asoc->base.sk check that sctp_wait_for_sndbuf() already has, returning an error if the association was migrated while we slept. Fixes: 668c9beb9020 ("sctp: implement assign_number for sctp_stream_interleave") Signed-off-by: Zhenghang Xiao <kipreyyy@gmail.com> Acked-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/20260527032411.60959-1-kipreyyy@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2 days	Bluetooth: hci_sync: Set HCI_CMD_DRAIN_WORKQUEUE during device close	Heitor Alves de Siqueira	1	-0/+8
	[ Upstream commit 525daaea459fc215f432de1b8debbd9144bf97b0 ] Since hci_dev_close_sync() can now be called during the reset path, we should also set HCI_CMD_DRAIN_WORKQUEUE. This avoids queuing timeouts while the hdev workqueue is being drained. Fixes: 877afadad2dc ("Bluetooth: When HCI work queue is drained, only queue chained work") Signed-off-by: Heitor Alves de Siqueira <halves@igalia.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>