<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/net/unix, branch v6.18.33</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.33</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.33'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-05-23T11:06:52+00:00</updated>
<entry>
<title>bpf, sockmap: Take state lock for af_unix iter</title>
<updated>2026-05-23T11:06:52+00:00</updated>
<author>
<name>Michal Luczaj</name>
<email>mhal@rbox.co</email>
</author>
<published>2026-04-14T14:13:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=921920c34cb591947dd30c692500795a69f1e3fa'/>
<id>urn:sha1:921920c34cb591947dd30c692500795a69f1e3fa</id>
<content type='text'>
[ Upstream commit 64c2f93fc3254d3bf5de4445fb732ee5c451edb6 ]

When a BPF iterator program updates a sockmap, there is a race condition in
unix_stream_bpf_update_proto() where the `peer` pointer can become stale[1]
during a state transition TCP_ESTABLISHED -&gt; TCP_CLOSE.

        CPU0 bpf                          CPU1 close
        --------                          ----------
// unix_stream_bpf_update_proto()
sk_pair = unix_peer(sk)
if (unlikely(!sk_pair))
   return -EINVAL;
                                     // unix_release_sock()
                                     skpair = unix_peer(sk);
                                     unix_peer(sk) = NULL;
                                     sock_put(skpair)
sock_hold(sk_pair) // UaF

More practically, this fix guarantees that the iterator program is
consistently provided with a unix socket that remains stable during
iterator execution.

[1]:
BUG: KASAN: slab-use-after-free in unix_stream_bpf_update_proto+0x155/0x490
Write of size 4 at addr ffff8881178c9a00 by task test_progs/2231
Call Trace:
 dump_stack_lvl+0x5d/0x80
 print_report+0x170/0x4f3
 kasan_report+0xe4/0x1c0
 kasan_check_range+0x125/0x200
 unix_stream_bpf_update_proto+0x155/0x490
 sock_map_link+0x71c/0xec0
 sock_map_update_common+0xbc/0x600
 sock_map_update_elem+0x19a/0x1f0
 bpf_prog_bbbf56096cdd4f01_selective_dump_unix+0x20c/0x217
 bpf_iter_run_prog+0x21e/0xae0
 bpf_iter_unix_seq_show+0x1e0/0x2a0
 bpf_seq_read+0x42c/0x10d0
 vfs_read+0x171/0xb20
 ksys_read+0xff/0x200
 do_syscall_64+0xf7/0x5e0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Allocated by task 2236:
 kasan_save_stack+0x30/0x50
 kasan_save_track+0x14/0x30
 __kasan_slab_alloc+0x63/0x80
 kmem_cache_alloc_noprof+0x1d5/0x680
 sk_prot_alloc+0x59/0x210
 sk_alloc+0x34/0x470
 unix_create1+0x86/0x8a0
 unix_stream_connect+0x318/0x15b0
 __sys_connect+0xfd/0x130
 __x64_sys_connect+0x72/0xd0
 do_syscall_64+0xf7/0x5e0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2236:
 kasan_save_stack+0x30/0x50
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x70
 __kasan_slab_free+0x47/0x70
 kmem_cache_free+0x11c/0x590
 __sk_destruct+0x432/0x6e0
 unix_release_sock+0x9b3/0xf60
 unix_release+0x8a/0xf0
 __sock_release+0xb0/0x270
 sock_close+0x18/0x20
 __fput+0x36e/0xac0
 fput_close_sync+0xe5/0x1a0
 __x64_sys_close+0x7d/0xd0
 do_syscall_64+0xf7/0x5e0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fixes: 2c860a43dd77 ("bpf: af_unix: Implement BPF iterator for UNIX domain socket.")
Suggested-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Signed-off-by: Michal Luczaj &lt;mhal@rbox.co&gt;
Signed-off-by: Martin KaFai Lau &lt;martin.lau@kernel.org&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20260414-unix-proto-update-null-ptr-deref-v4-5-2af6fe97918e@rbox.co
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf, sockmap: Fix af_unix null-ptr-deref in proto update</title>
<updated>2026-05-23T11:06:51+00:00</updated>
<author>
<name>Michal Luczaj</name>
<email>mhal@rbox.co</email>
</author>
<published>2026-04-14T14:13:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=041eb6348d73ee5e15fc8161f1eac5a6e8289ca0'/>
<id>urn:sha1:041eb6348d73ee5e15fc8161f1eac5a6e8289ca0</id>
<content type='text'>
[ Upstream commit dca38b7734d2ea00af4818ff3ae836fab33d5d5a ]

unix_stream_connect() sets sk_state (`WRITE_ONCE(sk-&gt;sk_state,
TCP_ESTABLISHED)`) _before_ it assigns a peer (`unix_peer(sk) = newsk`).
sk_state == TCP_ESTABLISHED makes sock_map_sk_state_allowed() believe that
socket is properly set up, which would include having a defined peer. IOW,
there's a window when unix_stream_bpf_update_proto() can be called on
socket which still has unix_peer(sk) == NULL.

         CPU0 bpf                            CPU1 connect
         --------                            ------------

                                WRITE_ONCE(sk-&gt;sk_state, TCP_ESTABLISHED)
sock_map_sk_state_allowed(sk)
...
sk_pair = unix_peer(sk)
sock_hold(sk_pair)
                                sock_hold(newsk)
                                smp_mb__after_atomic()
                                unix_peer(sk) = newsk

BUG: kernel NULL pointer dereference, address: 0000000000000080
RIP: 0010:unix_stream_bpf_update_proto+0xa0/0x1b0
Call Trace:
  sock_map_link+0x564/0x8b0
  sock_map_update_common+0x6e/0x340
  sock_map_update_elem_sys+0x17d/0x240
  __sys_bpf+0x26db/0x3250
  __x64_sys_bpf+0x21/0x30
  do_syscall_64+0x6b/0x3a0
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

Initial idea was to move peer assignment _before_ the sk_state update[1],
but that involved an additional memory barrier, and changing the hot path
was rejected.
Then a NULL check during proto update in unix_stream_bpf_update_proto() was
considered[2], but the follow-up discussion[3] focused on the root cause,
i.e. sockmap update taking a wrong lock. Or, more specifically, missing
unix_state_lock()[4].
In the end it was concluded that teaching sockmap about the af_unix locking
would be unnecessarily complex[5].
Complexity aside, since BPF_PROG_TYPE_SCHED_CLS and BPF_PROG_TYPE_SCHED_ACT
are allowed to update sockmaps, sock_map_update_elem() taking the unix
lock, as it is currently implemented in unix_state_lock():
spin_lock(&amp;unix_sk(s)-&gt;lock), would be problematic. unix_state_lock() taken
in a process context, followed by a softirq-context TC BPF program
attempting to take the same spinlock -- deadlock[6].
This way we circled back to the peer check idea[2].

[1]: https://lore.kernel.org/netdev/ba5c50aa-1df4-40c2-ab33-a72022c5a32e@rbox.co/
[2]: https://lore.kernel.org/netdev/20240610174906.32921-1-kuniyu@amazon.com/
[3]: https://lore.kernel.org/netdev/7603c0e6-cd5b-452b-b710-73b64bd9de26@linux.dev/
[4]: https://lore.kernel.org/netdev/CAAVpQUA+8GL_j63CaKb8hbxoL21izD58yr1NvhOhU=j+35+3og@mail.gmail.com/
[5]: https://lore.kernel.org/bpf/CAAVpQUAHijOMext28Gi10dSLuMzGYh+jK61Ujn+fZ-wvcODR2A@mail.gmail.com/
[6]: https://lore.kernel.org/bpf/dd043c69-4d03-46fe-8325-8f97101435cf@linux.dev/

Summary of scenarios where af_unix/stream connect() may race a sockmap
update:

1. connect() vs. bpf(BPF_MAP_UPDATE_ELEM), i.e. sock_map_update_elem_sys()

   Implemented NULL check is sufficient. Once assigned, socket peer won't
   be released until socket fd is released. And that's not an issue because
   sock_map_update_elem_sys() bumps fd refcnf.

2. connect() vs BPF program doing update

   Update restricted per verifier.c:may_update_sockmap() to

      BPF_PROG_TYPE_TRACING/BPF_TRACE_ITER
      BPF_PROG_TYPE_SOCK_OPS (bpf_sock_map_update() only)
      BPF_PROG_TYPE_SOCKET_FILTER
      BPF_PROG_TYPE_SCHED_CLS
      BPF_PROG_TYPE_SCHED_ACT
      BPF_PROG_TYPE_XDP
      BPF_PROG_TYPE_SK_REUSEPORT
      BPF_PROG_TYPE_FLOW_DISSECTOR
      BPF_PROG_TYPE_SK_LOOKUP

   Plus one more race to consider:

            CPU0 bpf                            CPU1 connect
            --------                            ------------

                                   WRITE_ONCE(sk-&gt;sk_state, TCP_ESTABLISHED)
   sock_map_sk_state_allowed(sk)
                                   sock_hold(newsk)
                                   smp_mb__after_atomic()
                                   unix_peer(sk) = newsk
   sk_pair = unix_peer(sk)
   if (unlikely(!sk_pair))
      return -EINVAL;

                                                 CPU1 close
                                                 ----------

                                   skpair = unix_peer(sk);
                                   unix_peer(sk) = NULL;
                                   sock_put(skpair)
   // use after free?
   sock_hold(sk_pair)

   2.1 BPF program invoking helper function bpf_sock_map_update() -&gt;
       BPF_CALL_4(bpf_sock_map_update(), ...)

       Helper limited to BPF_PROG_TYPE_SOCK_OPS. Nevertheless, a unix sock
       might be accessible via bpf_map_lookup_elem(). Which implies sk
       already having psock, which in turn implies sk already having
       sk_pair. Since sk_psock_destroy() is queued as RCU work, sk_pair
       won't go away while BPF executes the update.

   2.2 BPF program invoking helper function bpf_map_update_elem() -&gt;
       sock_map_update_elem()

       2.2.1 Unix sock accessible to BPF prog only via sockmap lookup in
             BPF_PROG_TYPE_SOCKET_FILTER, BPF_PROG_TYPE_SCHED_CLS,
             BPF_PROG_TYPE_SCHED_ACT, BPF_PROG_TYPE_XDP,
             BPF_PROG_TYPE_SK_REUSEPORT, BPF_PROG_TYPE_FLOW_DISSECTOR,
             BPF_PROG_TYPE_SK_LOOKUP.

             Pretty much the same as case 2.1.

       2.2.2 Unix sock accessible to BPF program directly:
             BPF_PROG_TYPE_TRACING, narrowed down to BPF_TRACE_ITER.

             Sockmap iterator (sock_map_seq_ops) is safe: unix sock
             residing in a sockmap means that the sock already went through
             the proto update step.

             Unix sock iterator (bpf_iter_unix_seq_ops), on the other hand,
             gives access to socks that may still be unconnected. Which
             means iterator prog can race sockmap/proto update against
             connect().

             BUG: KASAN: null-ptr-deref in unix_stream_bpf_update_proto+0x253/0x4d0
             Write of size 4 at addr 0000000000000080 by task test_progs/3140
             Call Trace:
              dump_stack_lvl+0x5d/0x80
              kasan_report+0xe4/0x1c0
              kasan_check_range+0x125/0x200
              unix_stream_bpf_update_proto+0x253/0x4d0
              sock_map_link+0x71c/0xec0
              sock_map_update_common+0xbc/0x600
              sock_map_update_elem+0x19a/0x1f0
              bpf_prog_bbbf56096cdd4f01_selective_dump_unix+0x20c/0x217
              bpf_iter_run_prog+0x21e/0xae0
              bpf_iter_unix_seq_show+0x1e0/0x2a0
              bpf_seq_read+0x42c/0x10d0
              vfs_read+0x171/0xb20
              ksys_read+0xff/0x200
              do_syscall_64+0xf7/0x5e0
              entry_SYSCALL_64_after_hwframe+0x76/0x7e

             While the introduced NULL check prevents null-ptr-deref in the
             BPF program path as well, it is insufficient to guard against
             a poorly timed close() leading to a use-after-free. This will
             be addressed in a subsequent patch.

Fixes: c63829182c37 ("af_unix: Implement -&gt;psock_update_sk_prot()")
Closes: https://lore.kernel.org/netdev/ba5c50aa-1df4-40c2-ab33-a72022c5a32e@rbox.co/
Reported-by: Michal Luczaj &lt;mhal@rbox.co&gt;
Reported-by: 钱一铭 &lt;yimingqian591@gmail.com&gt;
Suggested-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Suggested-by: Martin KaFai Lau &lt;martin.lau@linux.dev&gt;
Signed-off-by: Michal Luczaj &lt;mhal@rbox.co&gt;
Signed-off-by: Martin KaFai Lau &lt;martin.lau@kernel.org&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20260414-unix-proto-update-null-ptr-deref-v4-4-2af6fe97918e@rbox.co
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf, sockmap: Fix af_unix iter deadlock</title>
<updated>2026-05-23T11:06:51+00:00</updated>
<author>
<name>Michal Luczaj</name>
<email>mhal@rbox.co</email>
</author>
<published>2026-04-14T14:13:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=527057ebe8076dfbcaef51195ff1b7508646be2c'/>
<id>urn:sha1:527057ebe8076dfbcaef51195ff1b7508646be2c</id>
<content type='text'>
[ Upstream commit 4d328dd695383224aa750ddee6b4ad40c0f8d205 ]

bpf_iter_unix_seq_show() may deadlock when lock_sock_fast() takes the fast
path and the iter prog attempts to update a sockmap. Which ends up spinning
at sock_map_update_elem()'s bh_lock_sock():

WARNING: possible recursive locking detected
test_progs/1393 is trying to acquire lock:
ffff88811ec25f58 (slock-AF_UNIX){+...}-{3:3}, at: sock_map_update_elem+0xdb/0x1f0

but task is already holding lock:
ffff88811ec25f58 (slock-AF_UNIX){+...}-{3:3}, at: __lock_sock_fast+0x37/0xe0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(slock-AF_UNIX);
  lock(slock-AF_UNIX);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

4 locks held by test_progs/1393:
 #0: ffff88814b59c790 (&amp;p-&gt;lock){+.+.}-{4:4}, at: bpf_seq_read+0x59/0x10d0
 #1: ffff88811ec25fd8 (sk_lock-AF_UNIX){+.+.}-{0:0}, at: bpf_seq_read+0x42c/0x10d0
 #2: ffff88811ec25f58 (slock-AF_UNIX){+...}-{3:3}, at: __lock_sock_fast+0x37/0xe0
 #3: ffffffff85a6a7c0 (rcu_read_lock){....}-{1:3}, at: bpf_iter_run_prog+0x51d/0xb00

Call Trace:
 dump_stack_lvl+0x5d/0x80
 print_deadlock_bug.cold+0xc0/0xce
 __lock_acquire+0x130f/0x2590
 lock_acquire+0x14e/0x2b0
 _raw_spin_lock+0x30/0x40
 sock_map_update_elem+0xdb/0x1f0
 bpf_prog_2d0075e5d9b721cd_dump_unix+0x55/0x4f4
 bpf_iter_run_prog+0x5b9/0xb00
 bpf_iter_unix_seq_show+0x1f7/0x2e0
 bpf_seq_read+0x42c/0x10d0
 vfs_read+0x171/0xb20
 ksys_read+0xff/0x200
 do_syscall_64+0x6b/0x3a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fixes: 2c860a43dd77 ("bpf: af_unix: Implement BPF iterator for UNIX domain socket.")
Suggested-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Suggested-by: Martin KaFai Lau &lt;martin.lau@linux.dev&gt;
Signed-off-by: Michal Luczaj &lt;mhal@rbox.co&gt;
Signed-off-by: Martin KaFai Lau &lt;martin.lau@kernel.org&gt;
Reviewed-by: Jiayuan Chen &lt;jiayuan.chen@linux.dev&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20260414-unix-proto-update-null-ptr-deref-v4-2-2af6fe97918e@rbox.co
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>af_unix: Reject SIOCATMARK on non-stream sockets</title>
<updated>2026-05-14T13:30:17+00:00</updated>
<author>
<name>Jiexun Wang</name>
<email>wangjiexun2025@gmail.com</email>
</author>
<published>2026-05-06T14:08:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c34c41446acf6c0d13b5b06c809be11e0f7f2729'/>
<id>urn:sha1:c34c41446acf6c0d13b5b06c809be11e0f7f2729</id>
<content type='text'>
commit d119775f2bad827edc28071c061fdd4a91f889a5 upstream.

SIOCATMARK reports whether the receive queue is at the urgent mark for
MSG_OOB.

In AF_UNIX, MSG_OOB is supported only for SOCK_STREAM sockets.
SOCK_DGRAM and SOCK_SEQPACKET reject MSG_OOB in sendmsg() and recvmsg(),
so they should not support SIOCATMARK either.

Return -EOPNOTSUPP for non-stream sockets before checking the receive
queue.

Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Cc: stable@kernel.org
Reported-by: Yuan Tan &lt;yuantan098@gmail.com&gt;
Reported-by: Yifan Wu &lt;yifanwucs@gmail.com&gt;
Reported-by: Juefei Pu &lt;tomapufckgml@gmail.com&gt;
Reported-by: Xin Liu &lt;bird@lzu.edu.cn&gt;
Suggested-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Signed-off-by: Jiexun Wang &lt;wangjiexun2025@gmail.com&gt;
Signed-off-by: Ren Wei &lt;n05ec@lzu.edu.cn&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20260506140825.2987635-1-n05ec@lzu.edu.cn
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>af_unix: read UNIX_DIAG_VFS data under unix_state_lock</title>
<updated>2026-04-22T11:22:23+00:00</updated>
<author>
<name>Jiexun Wang</name>
<email>wangjiexun2025@gmail.com</email>
</author>
<published>2026-04-07T08:00:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=900a4e0910e98b8caef117d5df00471fa438dcf9'/>
<id>urn:sha1:900a4e0910e98b8caef117d5df00471fa438dcf9</id>
<content type='text'>
[ Upstream commit 39897df386376912d561d4946499379effa1e7ef ]

Exact UNIX diag lookups hold a reference to the socket, but not to
u-&gt;path. Meanwhile, unix_release_sock() clears u-&gt;path under
unix_state_lock() and drops the path reference after unlocking.

Read the inode and device numbers for UNIX_DIAG_VFS while holding
unix_state_lock(), then emit the netlink attribute after dropping the
lock.

This keeps the VFS data stable while the reply is being built.

Fixes: 5f7b0569460b ("unix_diag: Unix inode info NLA")
Reported-by: Yifan Wu &lt;yifanwucs@gmail.com&gt;
Reported-by: Juefei Pu &lt;tomapufckgml@gmail.com&gt;
Co-developed-by: Yuan Tan &lt;yuantan098@gmail.com&gt;
Signed-off-by: Yuan Tan &lt;yuantan098@gmail.com&gt;
Suggested-by: Xin Liu &lt;bird@lzu.edu.cn&gt;
Tested-by: Ren Wei &lt;enjou1224z@gmail.com&gt;
Signed-off-by: Jiexun Wang &lt;wangjiexun2025@gmail.com&gt;
Signed-off-by: Ren Wei &lt;n05ec@lzu.edu.cn&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20260407080015.1744197-1-n05ec@lzu.edu.cn
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>af_unix: Give up GC if MSG_PEEK intervened.</title>
<updated>2026-04-18T08:44:57+00:00</updated>
<author>
<name>Kuniyuki Iwashima</name>
<email>kuniyu@google.com</email>
</author>
<published>2026-03-11T05:40:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=72cf49ad50c16270b52bc512d9c2df5743922968'/>
<id>urn:sha1:72cf49ad50c16270b52bc512d9c2df5743922968</id>
<content type='text'>
[ Upstream commit e5b31d988a41549037b8d8721a3c3cae893d8670 ]

Igor Ushakov reported that GC purged the receive queue of
an alive socket due to a race with MSG_PEEK with a nice repro.

This is the exact same issue previously fixed by commit
cbcf01128d0a ("af_unix: fix garbage collect vs MSG_PEEK").

After GC was replaced with the current algorithm, the cited
commit removed the locking dance in unix_peek_fds() and
reintroduced the same issue.

The problem is that MSG_PEEK bumps a file refcount without
interacting with GC.

Consider an SCC containing sk-A and sk-B, where sk-A is
close()d but can be recv()ed via sk-B.

The bad thing happens if sk-A is recv()ed with MSG_PEEK from
sk-B and sk-B is close()d while GC is checking unix_vertex_dead()
for sk-A and sk-B.

  GC thread                    User thread
  ---------                    -----------
  unix_vertex_dead(sk-A)
  -&gt; true   &lt;------.
                    \
                     `------   recv(sk-B, MSG_PEEK)
              invalidate !!    -&gt; sk-A's file refcount : 1 -&gt; 2

                               close(sk-B)
                               -&gt; sk-B's file refcount : 2 -&gt; 1
  unix_vertex_dead(sk-B)
  -&gt; true

Initially, sk-A's file refcount is 1 by the inflight fd in sk-B
recvq.  GC thinks sk-A is dead because the file refcount is the
same as the number of its inflight fds.

However, sk-A's file refcount is bumped silently by MSG_PEEK,
which invalidates the previous evaluation.

At this moment, sk-B's file refcount is 2; one by the open fd,
and one by the inflight fd in sk-A.  The subsequent close()
releases one refcount by the former.

Finally, GC incorrectly concludes that both sk-A and sk-B are dead.

One option is to restore the locking dance in unix_peek_fds(),
but we can resolve this more elegantly thanks to the new algorithm.

The point is that the issue does not occur without the subsequent
close() and we actually do not need to synchronise MSG_PEEK with
the dead SCC detection.

When the issue occurs, close() and GC touch the same file refcount.
If GC sees the refcount being decremented by close(), it can just
give up garbage-collecting the SCC.

Therefore, we only need to signal the race during MSG_PEEK with
a proper memory barrier to make it visible to the GC.

Let's use seqcount_t to notify GC when MSG_PEEK occurs and let
it defer the SCC to the next run.

This way no locking is needed on the MSG_PEEK side, and we can
avoid imposing a penalty on every MSG_PEEK unnecessarily.

Note that we can retry within unix_scc_dead() if MSG_PEEK is
detected, but we do not do so to avoid hung task splat from
abusive MSG_PEEK calls.

Fixes: 118f457da9ed ("af_unix: Remove lock dance in unix_peek_fds().")
Reported-by: Igor Ushakov &lt;sysroot314@gmail.com&gt;
Signed-off-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20260311054043.1231316-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>af_unix: Simplify GC state.</title>
<updated>2026-04-18T08:44:57+00:00</updated>
<author>
<name>Kuniyuki Iwashima</name>
<email>kuniyu@google.com</email>
</author>
<published>2025-11-15T02:08:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=33120558237c7e13db3c39f09fd712431e455005'/>
<id>urn:sha1:33120558237c7e13db3c39f09fd712431e455005</id>
<content type='text'>
[ Upstream commit 6b6f3c71fe568aa8ed3e16e9135d88a5f4fd3e84 ]

GC manages its state by two variables, unix_graph_maybe_cyclic
and unix_graph_grouped, both of which are set to false in the
initial state.

When an AF_UNIX socket is passed to an in-flight AF_UNIX socket,
unix_update_graph() sets unix_graph_maybe_cyclic to true and
unix_graph_grouped to false, making the next GC invocation call
unix_walk_scc() to group SCCs.

Once unix_walk_scc() finishes, sockets in the same SCC are linked
via vertex-&gt;scc_entry.  Then, unix_graph_grouped is set to true
so that the following GC invocations can skip Tarjan's algorithm
and simply iterate through the list in unix_walk_scc_fast().

In addition, if we know there is at least one cyclic reference,
we set unix_graph_maybe_cyclic to true so that we do not skip GC.

So the state transitions as follows:

  (unix_graph_maybe_cyclic, unix_graph_grouped)
  =
  (false, false) -&gt; (true, false) -&gt; (true, true) or (false, true)
                         ^.______________/________________/

There is no transition to the initial state where both variables
are false.

If we consider the initial state as grouped, we can see that the
GC actually has a tristate.

Let's consolidate two variables into one enum.

Signed-off-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20251115020935.2643121-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Stable-dep-of: e5b31d988a41 ("af_unix: Give up GC if MSG_PEEK intervened.")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>af_unix: Count cyclic SCC.</title>
<updated>2026-04-18T08:44:57+00:00</updated>
<author>
<name>Kuniyuki Iwashima</name>
<email>kuniyu@google.com</email>
</author>
<published>2025-11-15T02:08:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1e211179f1d9273b6cfd0b30d6983dc1d626736b'/>
<id>urn:sha1:1e211179f1d9273b6cfd0b30d6983dc1d626736b</id>
<content type='text'>
[ Upstream commit 58b47c713711b8afbf68e3158d4d5acdead00e9b ]

__unix_walk_scc() and unix_walk_scc_fast() call unix_scc_cyclic()
for each SCC to check if it forms a cyclic reference, so that we
can skip GC at the following invocations in case all SCCs do not
have any cycles.

If we count the number of cyclic SCCs in __unix_walk_scc(), we can
simplify unix_walk_scc_fast() because the number of cyclic SCCs
only changes when it garbage-collects a SCC.

So, let's count cyclic SCC in __unix_walk_scc() and decrement it
in unix_walk_scc_fast() when performing garbage collection.

Note that we will use this counter in a later patch to check if a
cycle existed in the previous GC run.

Signed-off-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20251115020935.2643121-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Stable-dep-of: e5b31d988a41 ("af_unix: Give up GC if MSG_PEEK intervened.")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: annotate data-races around sk-&gt;sk_{data_ready,write_space}</title>
<updated>2026-03-12T11:09:49+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2026-02-25T13:15:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f17c1c4acbe2bd702abce73a847a04a196fab2c5'/>
<id>urn:sha1:f17c1c4acbe2bd702abce73a847a04a196fab2c5</id>
<content type='text'>
[ Upstream commit 2ef2b20cf4e04ac8a6ba68493f8780776ff84300 ]

skmsg (and probably other layers) are changing these pointers
while other cpus might read them concurrently.

Add corresponding READ_ONCE()/WRITE_ONCE() annotations
for UDP, TCP and AF_UNIX.

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Reported-by: syzbot+87f770387a9e5dc6b79b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/699ee9fc.050a0220.1cd54b.0009.GAE@google.com/
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: John Fastabend &lt;john.fastabend@gmail.com&gt;
Cc: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Cc: Willem de Bruijn &lt;willemdebruijn.kernel@gmail.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20260225131547.1085509-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>af_unix: Fix memleak of newsk in unix_stream_connect().</title>
<updated>2026-02-26T22:59:23+00:00</updated>
<author>
<name>Kuniyuki Iwashima</name>
<email>kuniyu@google.com</email>
</author>
<published>2026-02-07T23:22:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=365996a2b14d07caa9e33d367b67ea26c09d89b4'/>
<id>urn:sha1:365996a2b14d07caa9e33d367b67ea26c09d89b4</id>
<content type='text'>
[ Upstream commit 6884028cd7f275f8bcb854a347265cb1fb0e4bea ]

When prepare_peercred() fails in unix_stream_connect(),
unix_release_sock() is not called for newsk, and the memory
is leaked.

Let's move prepare_peercred() before unix_create1().

Fixes: fd0a109a0f6b ("net, pidfs: prepare for handing out pidfds for reaped sk-&gt;sk_peer_pid")
Signed-off-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20260207232236.2557549-1-kuniyu@google.com
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
