<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/net/ipv6/udp.c, branch v7.0.10</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0.10</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0.10'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-05-23T11:08:41+00:00</updated>
<entry>
<title>udp: Force compute_score to always inline</title>
<updated>2026-05-23T11:08:41+00:00</updated>
<author>
<name>Gabriel Krisman Bertazi</name>
<email>krisman@suse.de</email>
</author>
<published>2026-04-10T15:59:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dd03fcffd3f1fbe7def71747935a5669f3e850ab'/>
<id>urn:sha1:dd03fcffd3f1fbe7def71747935a5669f3e850ab</id>
<content type='text'>
[ Upstream commit b80a95ccf1604a882bb153c45ccb4056e44c8edb ]

Back in 2024 I reported a 7-12% regression on an iperf3 UDP loopback
thoughput test that we traced to the extra overhead of calling
compute_score on two places, introduced by commit f0ea27e7bfe1 ("udp:
re-score reuseport groups when connected sockets are present").  At the
time, I pointed out the overhead was caused by the multiple calls,
associated with cpu-specific mitigations, and merged commit
50aee97d1511 ("udp: Avoid call to compute_score on multiple sites") to
jump back explicitly, to force the rescore call in a single place.

Recently though, we got another regression report against a newer distro
version, which a team colleague traced back to the same root-cause.
Turns out that once we updated to gcc-13, the compiler got smart enough
to unroll the loop, undoing my previous mitigation.  Let's bite the
bullet and __always_inline compute_score on both ipv4 and ipv6 to
prevent gcc from de-optimizing it again in the future.  These functions
are only called in two places each, udpX_lib_lookup1 and
udpX_lib_lookup2, so the extra size shouldn't be a problem and it is hot
enough to be very visible in profilings.  In fact, with gcc13, forcing
the inline will prevent gcc from unrolling the fix from commit
50aee97d1511, so we don't end up increasing udpX_lib_lookup2 at all.

I haven't recollected the results myself, as I don't have access to the
machine at the moment.  But the same colleague reported 4.67%
inprovement with this patch in the loopback benchmark, solving the
regression report within noise margins.

Eric Dumazet reported no size change to vmlinux when built with clang.
I report the same also with gcc-13:

scripts/bloat-o-meter vmlinux vmlinux-inline
add/remove: 0/2 grow/shrink: 4/0 up/down: 616/-416 (200)
Function                                     old     new   delta
udp6_lib_lookup2                             762     949    +187
__udp6_lib_lookup                            810     975    +165
udp4_lib_lookup2                             757     906    +149
__udp4_lib_lookup                            871     986    +115
__pfx_compute_score                           32       -     -32
compute_score                                384       -    -384
Total: Before=35011784, After=35011984, chg +0.00%

Fixes: 50aee97d1511 ("udp: Avoid call to compute_score on multiple sites")
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: Gabriel Krisman Bertazi &lt;krisman@suse.de&gt;
Link: https://patch.msgid.link/20260410155936.654915-1-krisman@suse.de
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>udp: udplite is unlikely</title>
<updated>2026-01-07T01:06:03+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2026-01-05T10:17:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e9cd04b2816f8ebae9b274f5c52145c8f40d4a6f'/>
<id>urn:sha1:e9cd04b2816f8ebae9b274f5c52145c8f40d4a6f</id>
<content type='text'>
Add some unlikely() annotations to speed up the fast path,
at least with clang compiler.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://patch.msgid.link/20260105101719.2378881-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: Convert proto callbacks from sockaddr to sockaddr_unsized</title>
<updated>2025-11-05T03:10:33+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2025-11-04T00:26:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=449f68f8fffa2c41fc265730bd05a3c4947916c1'/>
<id>urn:sha1:449f68f8fffa2c41fc265730bd05a3c4947916c1</id>
<content type='text'>
Convert struct proto pre_connect(), connect(), bind(), and bind_add()
callback function prototypes from struct sockaddr to struct sockaddr_unsized.
This does not change per-implementation use of sockaddr for passing around
an arbitrarily sized sockaddr struct. Those will be addressed in future
patches.

Additionally removes the no longer referenced struct sockaddr from
include/net/inet_common.h.

No binary changes expected.

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
Link: https://patch.msgid.link/20251104002617.2752303-5-kees@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>udp: remove busylock and add per NUMA queues</title>
<updated>2025-09-23T23:38:39+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2025-09-22T10:42:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b650bf0977d34c52befb31a9fa711534e11b220f'/>
<id>urn:sha1:b650bf0977d34c52befb31a9fa711534e11b220f</id>
<content type='text'>
busylock was protecting UDP sockets against packet floods,
but unfortunately was not protecting the host itself.

Under stress, many cpus could spin while acquiring the busylock,
and NIC had to drop packets. Or packets would be dropped
in cpu backlog if RPS/RFS were in place.

This patch replaces the busylock by intermediate
lockless queues. (One queue per NUMA node).

This means that fewer number of cpus have to acquire
the UDP receive queue lock.

Most of the cpus can either:
- immediately drop the packet.
- or queue it in their NUMA aware lockless queue.

Then one of the cpu is chosen to process this lockless queue
in a batch.

The batch only contains packets that were cooked on the same
NUMA node, thus with very limited latency impact.

Tested:

DDOS targeting a victim UDP socket, on a platform with 6 NUMA nodes
(Intel(R) Xeon(R) 6985P-C)

Before:

nstat -n ; sleep 1 ; nstat | grep Udp
Udp6InDatagrams                 1004179            0.0
Udp6InErrors                    3117               0.0
Udp6RcvbufErrors                3117               0.0

After:
nstat -n ; sleep 1 ; nstat | grep Udp
Udp6InDatagrams                 1116633            0.0
Udp6InErrors                    14197275           0.0
Udp6RcvbufErrors                14197275           0.0

We can see this host can now proces 14.2 M more packets per second
while under attack, and the victim socket can receive 11 % more
packets.

I used a small bpftrace program measuring time (in us) spent in
__udp_enqueue_schedule_skb().

Before:

@udp_enqueue_us[398]:
[0]                24901 |@@@                                                 |
[1]                63512 |@@@@@@@@@                                           |
[2, 4)            344827 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[4, 8)            244673 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                |
[8, 16)            54022 |@@@@@@@@                                            |
[16, 32)          222134 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                   |
[32, 64)          232042 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                  |
[64, 128)           4219 |                                                    |
[128, 256)           188 |                                                    |

After:

@udp_enqueue_us[398]:
[0]              5608855 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1]              1111277 |@@@@@@@@@@                                          |
[2, 4)            501439 |@@@@                                                |
[4, 8)            102921 |                                                    |
[8, 16)            29895 |                                                    |
[16, 32)           43500 |                                                    |
[32, 64)           31552 |                                                    |
[64, 128)            979 |                                                    |
[128, 256)            13 |                                                    |

Note that the remaining bottleneck for this platform is in
udp_drops_inc() because we limited struct numa_drop_counters
to only two nodes so far.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20250922104240.2182559-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>udp: add udp_drops_inc() helper</title>
<updated>2025-09-18T08:17:10+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2025-09-16T16:09:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9db27c80622bd612549ea213390500f7377ee3e1'/>
<id>urn:sha1:9db27c80622bd612549ea213390500f7377ee3e1</id>
<content type='text'>
Generic sk_drops_inc() reads sk-&gt;sk_drop_counters.
We know the precise location for UDP sockets.

Move sk_drop_counters out of sock_read_rxtx
so that sock_write_rxtx starts at a cache line boundary.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20250916160951.541279-9-edumazet@google.com
Reviewed-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
</entry>
<entry>
<title>ipv6: np-&gt;rxpmtu race annotation</title>
<updated>2025-09-18T08:17:09+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2025-09-16T16:09:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9fba1eb39e2f74d2002c5cbcf1d4435d37a4f752'/>
<id>urn:sha1:9fba1eb39e2f74d2002c5cbcf1d4435d37a4f752</id>
<content type='text'>
Add READ_ONCE() annotations because np-&gt;rxpmtu can be changed
while udpv6_recvmsg() and rawv6_recvmsg() read it.

Since this is a very rarely used feature, and that udpv6_recvmsg()
and rawv6_recvmsg() read np-&gt;rxopt anyway, change the test order
so that np-&gt;rxpmtu does not need to be in a hot cache line.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20250916160951.541279-4-edumazet@google.com
Reviewed-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
</entry>
<entry>
<title>ipv6: udp: fix typos in comments</title>
<updated>2025-09-12T01:41:58+00:00</updated>
<author>
<name>Alok Tiwari</name>
<email>alok.a.tiwari@oracle.com</email>
</author>
<published>2025-09-09T12:26:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ac36dea3bc85c2cde87e490736708032328dfbdc'/>
<id>urn:sha1:ac36dea3bc85c2cde87e490736708032328dfbdc</id>
<content type='text'>
Correct typos in ipv6/udp.c comments:
"execeeds" -&gt; "exceeds"
"tacking care" -&gt; "taking care"
"measureable" -&gt; "measurable"

No functional changes.

Signed-off-by: Alok Tiwari &lt;alok.a.tiwari@oracle.com&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Link: https://patch.msgid.link/20250909122611.3711859-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: add sk_drops_read(), sk_drops_inc() and sk_drops_reset() helpers</title>
<updated>2025-08-28T11:14:50+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2025-08-26T12:50:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f86f42ed2c471da5b061492bb8ab1d3d73c19c58'/>
<id>urn:sha1:f86f42ed2c471da5b061492bb8ab1d3d73c19c58</id>
<content type='text'>
We want to split sk-&gt;sk_drops in the future to reduce
potential contention on this field.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20250826125031.1578842-2-edumazet@google.com
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;

</content>
</entry>
<entry>
<title>net: track pfmemalloc drops via SKB_DROP_REASON_PFMEMALLOC</title>
<updated>2025-07-18T23:59:05+00:00</updated>
<author>
<name>Jesper Dangaard Brouer</name>
<email>hawk@kernel.org</email>
</author>
<published>2025-07-16T16:26:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a6f190630d070173897a7e98a30188b7638ba0a1'/>
<id>urn:sha1:a6f190630d070173897a7e98a30188b7638ba0a1</id>
<content type='text'>
Add a new SKB drop reason (SKB_DROP_REASON_PFMEMALLOC) to track packets
dropped due to memory pressure. In production environments, we've observed
memory exhaustion reported by memory layer stack traces, but these drops
were not properly tracked in the SKB drop reason infrastructure.

While most network code paths now properly report pfmemalloc drops, some
protocol-specific socket implementations still use sk_filter() without
drop reason tracking:
- Bluetooth L2CAP sockets
- CAIF sockets
- IUCV sockets
- Netlink sockets
- SCTP sockets
- Unix domain sockets

These remaining cases represent less common paths and could be converted
in a follow-up patch if needed. The current implementation provides
significantly improved observability into memory pressure events in the
network stack, especially for key protocols like TCP and UDP, helping to
diagnose problems in production environments.

Reported-by: Matt Fleming &lt;mfleming@cloudflare.com&gt;
Signed-off-by: Jesper Dangaard Brouer &lt;hawk@kernel.org&gt;
Link: https://patch.msgid.link/175268316579.2407873.11634752355644843509.stgit@firesoul
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>udp: move udp_memory_allocated into net_aligned_data</title>
<updated>2025-07-02T21:22:02+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2025-06-30T09:35:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e3d4825124bce0d1f72187fabcf972b7c0b6cb9b'/>
<id>urn:sha1:e3d4825124bce0d1f72187fabcf972b7c0b6cb9b</id>
<content type='text'>
____cacheline_aligned_in_smp attribute only makes sure to align
a field to a cache line. It does not prevent the linker to use
the remaining of the cache line for other variables, causing
potential false sharing.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Link: https://patch.msgid.link/20250630093540.3052835-5-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
</feed>
