<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/net/packet/internal.h, branch v6.19.12</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.12</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.12'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-09-12T01:40:06+00:00</updated>
<entry>
<title>net: af_packet: Use hrtimer to do the retire operation</title>
<updated>2025-09-12T01:40:06+00:00</updated>
<author>
<name>Xin Zhao</name>
<email>jackzxcui1989@163.com</email>
</author>
<published>2025-09-08T10:45:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f7460d2989fa7fb29a0c6d8b929076521480a124'/>
<id>urn:sha1:f7460d2989fa7fb29a0c6d8b929076521480a124</id>
<content type='text'>
In a system with high real-time requirements, the timeout mechanism of
ordinary timers with jiffies granularity is insufficient to meet the
demands for real-time performance. Meanwhile, the optimization of CPU
usage with af_packet is quite significant. Use hrtimer instead of timer
to help compensate for the shortcomings in real-time performance.
In HZ=100 or HZ=250 system, the update of TP_STATUS_USER is not real-time
enough, with fluctuations reaching over 8ms (on a system with HZ=250).
This is unacceptable in some high real-time systems that require timely
processing of network packets. By replacing it with hrtimer, if a timeout
of 2ms is set, the update of TP_STATUS_USER can be stabilized to within
3 ms.

Delete delete_blk_timer field, because hrtimer_cancel will check and wait
until the timer callback return and ensure never enter callback again.

Simplify the logic related to setting timeout, only update the hrtimer
expire time within the hrtimer callback, no longer update the expire time
in prb_open_block which is called by tpacket_rcv or timer callback.
Reasons why NOT update hrtimer in prb_open_block:
1) It will increase complexity to distinguish the two caller scenario.
2) hrtimer_cancel and hrtimer_start need to be called if you want to update
TMO of an already enqueued hrtimer, leading to complex shutdown logic.

One side effect of NOT update hrtimer when called by tpacket_rcv is that
a newly opened block triggered by tpacket_rcv may be retired earlier than
expected. On the other hand, if timeout is updated in prb_open_block, the
frequent reception of network packets that leads to prb_open_block being
called may cause hrtimer to be removed and enqueued repeatedly.

The retire hrtimer expiration is unconditional and periodic. If there are
numerous packet sockets on the system, please set an appropriate timeout
to avoid frequent enqueueing of hrtimers.

Reviewed-by: Willem de Bruijn &lt;willemdebruijn.kernel@gmail.com&gt;
Reviewed-by: Jason Xing &lt;kerneljasonxing@gmail.com&gt;
Link: https://lore.kernel.org/all/20250831100822.1238795-1-jackzxcui1989@163.com/
Signed-off-by: Xin Zhao &lt;jackzxcui1989@163.com&gt;
Link: https://patch.msgid.link/20250908104549.204412-3-jackzxcui1989@163.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: af_packet: remove last_kactive_blk_num field</title>
<updated>2025-09-12T01:40:06+00:00</updated>
<author>
<name>Xin Zhao</name>
<email>jackzxcui1989@163.com</email>
</author>
<published>2025-09-08T10:45:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=28d2420d403ada8a5ff1bf2077ef66051b2aa4d7'/>
<id>urn:sha1:28d2420d403ada8a5ff1bf2077ef66051b2aa4d7</id>
<content type='text'>
kactive_blk_num (K) is only incremented on block close.
In timer callback prb_retire_rx_blk_timer_expired, except delete_blk_timer
is true, last_kactive_blk_num (L) is set to match kactive_blk_num (K) in
all cases. L is also set to match K in prb_open_block.
The only case K not equal to L is when scheduled by tpacket_rcv
and K is just incremented on block close but no new block could be opened,
so that it does not call prb_open_block in prb_dispatch_next_block.
This patch modifies the prb_retire_rx_blk_timer_expired function by simply
removing the check for L == K. This patch just provides another checkpoint
to thaw the might-be-frozen block in any case. It doesn't have any effect
because __packet_lookup_frame_in_block() has the same logic and does it
again without this patch when detecting the ring is frozen. The patch only
advances checking the status of the ring.

Suggested-by: Willem de Bruijn &lt;willemdebruijn.kernel@gmail.com&gt;
Reviewed-by: Willem de Bruijn &lt;willemdebruijn.kernel@gmail.com&gt;
Reviewed-by: Jason Xing &lt;kerneljasonxing@gmail.com&gt;
Link: https://lore.kernel.org/all/20250831100822.1238795-1-jackzxcui1989@163.com/
Signed-off-by: Xin Zhao &lt;jackzxcui1989@163.com&gt;
Link: https://patch.msgid.link/20250908104549.204412-2-jackzxcui1989@163.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>af_packet: move notifier's packet_dev_mc out of rcu critical section</title>
<updated>2025-05-27T09:36:26+00:00</updated>
<author>
<name>Stanislav Fomichev</name>
<email>stfomichev@gmail.com</email>
</author>
<published>2025-05-22T03:11:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d8d85ef0a631df9127f202e6371bb33a0b589952'/>
<id>urn:sha1:d8d85ef0a631df9127f202e6371bb33a0b589952</id>
<content type='text'>
Syzkaller reports the following issue:

 BUG: sleeping function called from invalid context at kernel/locking/mutex.c:578
 __mutex_lock+0x106/0xe80 kernel/locking/mutex.c:746
 team_change_rx_flags+0x38/0x220 drivers/net/team/team_core.c:1781
 dev_change_rx_flags net/core/dev.c:9145 [inline]
 __dev_set_promiscuity+0x3f8/0x590 net/core/dev.c:9189
 netif_set_promiscuity+0x50/0xe0 net/core/dev.c:9201
 dev_set_promiscuity+0x126/0x260 net/core/dev_api.c:286 packet_dev_mc net/packet/af_packet.c:3698 [inline]
 packet_dev_mclist_delete net/packet/af_packet.c:3722 [inline]
 packet_notifier+0x292/0xa60 net/packet/af_packet.c:4247
 notifier_call_chain+0x1b3/0x3e0 kernel/notifier.c:85
 call_netdevice_notifiers_extack net/core/dev.c:2214 [inline]
 call_netdevice_notifiers net/core/dev.c:2228 [inline]
 unregister_netdevice_many_notify+0x15d8/0x2330 net/core/dev.c:11972
 rtnl_delete_link net/core/rtnetlink.c:3522 [inline]
 rtnl_dellink+0x488/0x710 net/core/rtnetlink.c:3564
 rtnetlink_rcv_msg+0x7cf/0xb70 net/core/rtnetlink.c:6955
 netlink_rcv_skb+0x219/0x490 net/netlink/af_netlink.c:2534

Calling `PACKET_ADD_MEMBERSHIP` on an ops-locked device can trigger
the `NETDEV_UNREGISTER` notifier, which may require disabling promiscuous
and/or allmulti mode. Both of these operations require acquiring
the netdev instance lock.

Move the call to `packet_dev_mc` outside of the RCU critical section.
The `mclist` modifications (add, del, flush, unregister) are protected by
the RTNL, not the RCU. The RCU only protects the `sklist` and its
associated `sks`. The delayed operation on the `mclist` entry remains
within the RTNL.

Reported-by: syzbot+b191b5ccad8d7a986286@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b191b5ccad8d7a986286
Fixes: ad7c7b2172c3 ("net: hold netdev instance lock during sysfs operations")
Signed-off-by: Stanislav Fomichev &lt;stfomichev@gmail.com&gt;
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Link: https://patch.msgid.link/20250522031129.3247266-1-stfomichev@gmail.com
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;

</content>
</entry>
<entry>
<title>packet: Move reference count in packet_sock to atomic_long_t</title>
<updated>2023-12-04T22:45:04+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2023-12-01T13:10:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=db3fadacaf0c817b222090290d06ca2a338422d0'/>
<id>urn:sha1:db3fadacaf0c817b222090290d06ca2a338422d0</id>
<content type='text'>
In some potential instances the reference count on struct packet_sock
could be saturated and cause overflows which gets the kernel a bit
confused. To prevent this, move to a 64-bit atomic reference count on
64-bit architectures to prevent the possibility of this type to overflow.

Because we can not handle saturation, using refcount_t is not possible
in this place. Maybe someday in the future if it changes it could be
used. Also, instead of using plain atomic64_t, use atomic_long_t instead.
32-bit machines tend to be memory-limited (i.e. anything that increases
a reference uses so much memory that you can't actually get to 2**32
references). 32-bit architectures also tend to have serious problems
with 64-bit atomics. Hence, atomic_long_t is the more natural solution.

Reported-by: "The UK's National Cyber Security Centre (NCSC)" &lt;security@ncsc.gov.uk&gt;
Co-developed-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: stable@kernel.org
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://lore.kernel.org/r/20231201131021.19999-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net/packet: Annotate struct packet_fanout with __counted_by</title>
<updated>2023-10-06T10:36:25+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2023-10-03T23:17:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b3783e5efde4201b2cc7a2fee41791b413137f4c'/>
<id>urn:sha1:b3783e5efde4201b2cc7a2fee41791b413137f4c</id>
<content type='text'>
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct packet_fanout.

Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Jakub Kicinski &lt;kuba@kernel.org&gt;
Cc: Paolo Abeni &lt;pabeni@redhat.com&gt;
Cc: Willem de Bruijn &lt;willemb@google.com&gt;
Cc: Anqi Shen &lt;amy.saq@antgroup.com&gt;
Cc: netdev@vger.kernel.org
Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1]
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net/packet: support mergeable feature of virtio</title>
<updated>2023-04-21T11:01:58+00:00</updated>
<author>
<name>Jianfeng Tan</name>
<email>henry.tjf@antgroup.com</email>
</author>
<published>2023-04-19T07:24:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dfc39d4026fb2432363c0f77543c4cf3adca4c7b'/>
<id>urn:sha1:dfc39d4026fb2432363c0f77543c4cf3adca4c7b</id>
<content type='text'>
Packet sockets, like tap, can be used as the backend for kernel vhost.
In packet sockets, virtio net header size is currently hardcoded to be
the size of struct virtio_net_hdr, which is 10 bytes; however, it is not
always the case: some virtio features, such as mrg_rxbuf, need virtio
net header to be 12-byte long.

Mergeable buffers, as a virtio feature, is worthy of supporting: packets
that are larger than one-mbuf size will be dropped in vhost worker's
handle_rx if mrg_rxbuf feature is not used, but large packets
cannot be avoided and increasing mbuf's size is not economical.

With this virtio feature enabled by virtio-user, packet sockets with
hardcoded 10-byte virtio net header will parse mac head incorrectly in
packet_snd by taking the last two bytes of virtio net header as part of
mac header.
This incorrect mac header parsing will cause packet to be dropped due to
invalid ether head checking in later under-layer device packet receiving.

By adding extra field vnet_hdr_sz with utilizing holes in struct
packet_sock to record currently used virtio net header size and supporting
extra sockopt PACKET_VNET_HDR_SZ to set specified vnet_hdr_sz, packet
sockets can know the exact length of virtio net header that virtio user
gives.
In packet_snd, tpacket_snd and packet_recvmsg, instead of using
hardcoded virtio net header size, it can get the exact vnet_hdr_sz from
corresponding packet_sock, and parse mac header correctly based on this
information to avoid the packets being mistakenly dropped.

Signed-off-by: Jianfeng Tan &lt;henry.tjf@antgroup.com&gt;
Co-developed-by: Anqi Shen &lt;amy.saq@antgroup.com&gt;
Signed-off-by: Anqi Shen &lt;amy.saq@antgroup.com&gt;
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net/packet: remove po-&gt;xmit</title>
<updated>2023-03-19T10:57:54+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2023-03-17T16:20:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=105a201ebf3312990b96c4fbaade22e31402f8cc'/>
<id>urn:sha1:105a201ebf3312990b96c4fbaade22e31402f8cc</id>
<content type='text'>
Use PACKET_SOCK_QDISC_BYPASS atomic bit instead of a pointer.

This removes one indirect call in fast path,
and READ_ONCE()/WRITE_ONCE() annotations as well.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Suggested-by: Willem de Bruijn &lt;willemb@google.com&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>af_packet: preserve const qualifier in pkt_sk()</title>
<updated>2023-03-18T12:23:33+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2023-03-17T15:55:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=68ac9a8b6e65c7cbbe96541353dab1b3f8de2043'/>
<id>urn:sha1:68ac9a8b6e65c7cbbe96541353dab1b3f8de2043</id>
<content type='text'>
We can change pkt_sk() to propagate const qualifier of its argument,
thanks to container_of_const()

This should avoid some potential errors caused by accidental
(const -&gt; not_const) promotion.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Willem de Bruijn &lt;willemb@google.com&gt;
Reviewed-by: Simon Horman &lt;simon.horman@corigine.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net/packet: convert po-&gt;pressure to an atomic flag</title>
<updated>2023-03-17T08:52:06+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2023-03-16T01:10:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=791a3e9f1a86fe8eb09173c9788493b8b5c957f4'/>
<id>urn:sha1:791a3e9f1a86fe8eb09173c9788493b8b5c957f4</id>
<content type='text'>
Not only this removes some READ_ONCE()/WRITE_ONCE(),
this also removes one integer.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net/packet: convert po-&gt;running to an atomic flag</title>
<updated>2023-03-17T08:52:06+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2023-03-16T01:10:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=61edf479818e63978cabd243b82ca80f8948a313'/>
<id>urn:sha1:61edf479818e63978cabd243b82ca80f8948a313</id>
<content type='text'>
Instead of consuming 32 bits for po-&gt;running, use
one available bit in po-&gt;flags.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
