<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/uapi/linux/if_xdp.h, branch v6.12.80</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-07-25T09:57:27+00:00</updated>
<entry>
<title>xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len</title>
<updated>2024-07-25T09:57:27+00:00</updated>
<author>
<name>Stanislav Fomichev</name>
<email>sdf@fomichev.me</email>
</author>
<published>2024-07-13T01:52:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d5e726d9143c5624135f5dc9e4069799adeef734'/>
<id>urn:sha1:d5e726d9143c5624135f5dc9e4069799adeef734</id>
<content type='text'>
Julian reports that commit 341ac980eab9 ("xsk: Support tx_metadata_len")
can break existing use cases which don't zero-initialize xdp_umem_reg
padding. Introduce new XDP_UMEM_TX_METADATA_LEN to make sure we
interpret the padding as tx_metadata_len only when being explicitly
asked.

Fixes: 341ac980eab9 ("xsk: Support tx_metadata_len")
Reported-by: Julian Schindel &lt;mail@arctic-alpaca.de&gt;
Signed-off-by: Stanislav Fomichev &lt;sdf@fomichev.me&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Maciej Fijalkowski &lt;maciej.fijalkowski@intel.com&gt;
Link: https://lore.kernel.org/bpf/20240713015253.121248-2-sdf@fomichev.me
</content>
</entry>
<entry>
<title>xsk: Add option to calculate TX checksum in SW</title>
<updated>2023-11-29T22:59:40+00:00</updated>
<author>
<name>Stanislav Fomichev</name>
<email>sdf@google.com</email>
</author>
<published>2023-11-27T19:03:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=11614723af26e7c32fcb704d8f30fdf60c1122dc'/>
<id>urn:sha1:11614723af26e7c32fcb704d8f30fdf60c1122dc</id>
<content type='text'>
For XDP_COPY mode, add a UMEM option XDP_UMEM_TX_SW_CSUM
to call skb_checksum_help in transmit path. Might be useful
to debugging issues with real hardware. I also use this mode
in the selftests.

Signed-off-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Link: https://lore.kernel.org/r/20231127190319.1190813-9-sdf@google.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>xsk: Add TX timestamp and TX checksum offload support</title>
<updated>2023-11-29T22:59:40+00:00</updated>
<author>
<name>Stanislav Fomichev</name>
<email>sdf@google.com</email>
</author>
<published>2023-11-27T19:03:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=48eb03dd26304c24f03bdbb9382e89c8564e71df'/>
<id>urn:sha1:48eb03dd26304c24f03bdbb9382e89c8564e71df</id>
<content type='text'>
This change actually defines the (initial) metadata layout
that should be used by AF_XDP userspace (xsk_tx_metadata).
The first field is flags which requests appropriate offloads,
followed by the offload-specific fields. The supported per-device
offloads are exported via netlink (new xsk-flags).

The offloads themselves are still implemented in a bit of a
framework-y fashion that's left from my initial kfunc attempt.
I'm introducing new xsk_tx_metadata_ops which drivers are
supposed to implement. The drivers are also supposed
to call xsk_tx_metadata_request/xsk_tx_metadata_complete in
the right places. Since xsk_tx_metadata_{request,_complete}
are static inline, we don't incur any extra overhead doing
indirect calls.

The benefit of this scheme is as follows:
- keeps all metadata layout parsing away from driver code
- makes it easy to grep and see which drivers implement what
- don't need any extra flags to maintain to keep track of what
  offloads are implemented; if the callback is implemented - the offload
  is supported (used by netlink reporting code)

Two offloads are defined right now:
1. XDP_TXMD_FLAGS_CHECKSUM: skb-style csum_start+csum_offset
2. XDP_TXMD_FLAGS_TIMESTAMP: writes TX timestamp back into metadata
   area upon completion (tx_timestamp field)

XDP_TXMD_FLAGS_TIMESTAMP is also implemented for XDP_COPY mode: it writes
SW timestamp from the skb destructor (note I'm reusing hwtstamps to pass
metadata pointer).

The struct is forward-compatible and can be extended in the future
by appending more fields.

Reviewed-by: Song Yoong Siang &lt;yoong.siang.song@intel.com&gt;
Signed-off-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Acked-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Link: https://lore.kernel.org/r/20231127190319.1190813-3-sdf@google.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>xsk: Support tx_metadata_len</title>
<updated>2023-11-29T22:59:40+00:00</updated>
<author>
<name>Stanislav Fomichev</name>
<email>sdf@google.com</email>
</author>
<published>2023-11-27T19:03:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=341ac980eab90ac1f6c22ee9f9da83ed9604d899'/>
<id>urn:sha1:341ac980eab90ac1f6c22ee9f9da83ed9604d899</id>
<content type='text'>
For zerocopy mode, tx_desc-&gt;addr can point to an arbitrary offset
and carry some TX metadata in the headroom. For copy mode, there
is no way currently to populate skb metadata.

Introduce new tx_metadata_len umem config option that indicates how many
bytes to treat as metadata. Metadata bytes come prior to tx_desc address
(same as in RX case).

The size of the metadata has mostly the same constraints as XDP:
- less than 256 bytes
- 8-byte aligned (compared to 4-byte alignment on xdp, due to 8-byte
  timestamp in the completion)
- non-zero

This data is not interpreted in any way right now.

Reviewed-by: Song Yoong Siang &lt;yoong.siang.song@intel.com&gt;
Signed-off-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Reviewed-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Link: https://lore.kernel.org/r/20231127190319.1190813-2-sdf@google.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>xsk: introduce XSK_USE_SG bind flag for xsk socket</title>
<updated>2023-07-19T16:56:48+00:00</updated>
<author>
<name>Tirthendu Sarkar</name>
<email>tirthendu.sarkar@intel.com</email>
</author>
<published>2023-07-19T13:23:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=81470b5c3c6649eef8e5f282cd06793f788ae165'/>
<id>urn:sha1:81470b5c3c6649eef8e5f282cd06793f788ae165</id>
<content type='text'>
As of now xsk core drops any xdp_buff with data size greater than the
xsk frame_size as set by the af_xdp application. With multi-buffer
support introduced in the next patch xsk core can now split those
buffers into multiple descriptors provided the af_xdp application can
handle them. Such capability of the application needs to be independent
of the xdp_prog's frag support capability since there are cases where
even a single xdp_buffer may need to be split into multiple descriptors
owing to a smaller xsk frame size.

For e.g., with NIC rx_buffer size set to 4kB, a 3kB packet will
constitute of a single buffer and so will be sent as such to AF_XDP layer
irrespective of 'xdp.frags' capability of the XDP program. Now if the xsk
frame size is set to 2kB by the AF_XDP application, then the packet will
need to be split into 2 descriptors if AF_XDP application can handle
multi-buffer, else it needs to be dropped.

Applications can now advertise their frag handling capability to xsk core
so that xsk core can decide if it should drop or split xdp_buffs that
exceed xsk frame size. This is done using a new 'XSK_USE_SG' bind flag
for the xdp socket.

Signed-off-by: Tirthendu Sarkar &lt;tirthendu.sarkar@intel.com&gt;
Link: https://lore.kernel.org/r/20230719132421.584801-3-maciej.fijalkowski@intel.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>xsk: prepare 'options' in xdp_desc for multi-buffer use</title>
<updated>2023-07-19T16:56:48+00:00</updated>
<author>
<name>Tirthendu Sarkar</name>
<email>tirthendu.sarkar@intel.com</email>
</author>
<published>2023-07-19T13:23:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=63a64a56bc3f77c74085047ee45356ac850da3e8'/>
<id>urn:sha1:63a64a56bc3f77c74085047ee45356ac850da3e8</id>
<content type='text'>
Use the 'options' field in xdp_desc as a packet continuity marker. Since
'options' field was unused till now and was expected to be set to 0, the
'eop' descriptor will have it set to 0, while the non-eop descriptors
will have to set it to 1. This ensures legacy applications continue to
work without needing any change for single-buffer packets.

Add helper functions and extend xskq_prod_reserve_desc() to use the
'options' field.

Signed-off-by: Tirthendu Sarkar &lt;tirthendu.sarkar@intel.com&gt;
Link: https://lore.kernel.org/r/20230719132421.584801-2-maciej.fijalkowski@intel.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>xsk: Add new statistics</title>
<updated>2020-07-13T22:32:56+00:00</updated>
<author>
<name>Ciara Loftus</name>
<email>ciara.loftus@intel.com</email>
</author>
<published>2020-07-08T07:28:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8aa5a33578e9685d06020bd10d1637557423e945'/>
<id>urn:sha1:8aa5a33578e9685d06020bd10d1637557423e945</id>
<content type='text'>
It can be useful for the user to know the reason behind a dropped packet.
Introduce new counters which track drops on the receive path caused by:
1. rx ring being full
2. fill ring being empty

Also, on the tx path introduce a counter which tracks the number of times
we attempt pull from the tx ring when it is empty.

Signed-off-by: Ciara Loftus &lt;ciara.loftus@intel.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20200708072835.4427-2-ciara.loftus@intel.com
</content>
</entry>
<entry>
<title>xsk: add support to allow unaligned chunk placement</title>
<updated>2019-08-30T23:08:26+00:00</updated>
<author>
<name>Kevin Laatz</name>
<email>kevin.laatz@intel.com</email>
</author>
<published>2019-08-27T02:25:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c05cd3645814724bdeb32a2b4d953b12bdea5f8c'/>
<id>urn:sha1:c05cd3645814724bdeb32a2b4d953b12bdea5f8c</id>
<content type='text'>
Currently, addresses are chunk size aligned. This means, we are very
restricted in terms of where we can place chunk within the umem. For
example, if we have a chunk size of 2k, then our chunks can only be placed
at 0,2k,4k,6k,8k... and so on (ie. every 2k starting from 0).

This patch introduces the ability to use unaligned chunks. With these
changes, we are no longer bound to having to place chunks at a 2k (or
whatever your chunk size is) interval. Since we are no longer dealing with
aligned chunks, they can now cross page boundaries. Checks for page
contiguity have been added in order to keep track of which pages are
followed by a physically contiguous page.

Signed-off-by: Kevin Laatz &lt;kevin.laatz@intel.com&gt;
Signed-off-by: Ciara Loftus &lt;ciara.loftus@intel.com&gt;
Signed-off-by: Bruce Richardson &lt;bruce.richardson@intel.com&gt;
Acked-by: Jonathan Lemon &lt;jonathan.lemon@gmail.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</content>
</entry>
<entry>
<title>xsk: add support for need_wakeup flag in AF_XDP rings</title>
<updated>2019-08-17T21:07:32+00:00</updated>
<author>
<name>Magnus Karlsson</name>
<email>magnus.karlsson@intel.com</email>
</author>
<published>2019-08-14T07:27:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=77cd0d7b3f257fd0e3096b4fdcff1a7d38e99e10'/>
<id>urn:sha1:77cd0d7b3f257fd0e3096b4fdcff1a7d38e99e10</id>
<content type='text'>
This commit adds support for a new flag called need_wakeup in the
AF_XDP Tx and fill rings. When this flag is set, it means that the
application has to explicitly wake up the kernel Rx (for the bit in
the fill ring) or kernel Tx (for bit in the Tx ring) processing by
issuing a syscall. Poll() can wake up both depending on the flags
submitted and sendto() will wake up tx processing only.

The main reason for introducing this new flag is to be able to
efficiently support the case when application and driver is executing
on the same core. Previously, the driver was just busy-spinning on the
fill ring if it ran out of buffers in the HW and there were none on
the fill ring. This approach works when the application is running on
another core as it can replenish the fill ring while the driver is
busy-spinning. Though, this is a lousy approach if both of them are
running on the same core as the probability of the fill ring getting
more entries when the driver is busy-spinning is zero. With this new
feature the driver now sets the need_wakeup flag and returns to the
application. The application can then replenish the fill queue and
then explicitly wake up the Rx processing in the kernel using the
syscall poll(). For Tx, the flag is only set to one if the driver has
no outstanding Tx completion interrupts. If it has some, the flag is
zero as it will be woken up by a completion interrupt anyway.

As a nice side effect, this new flag also improves the performance of
the case where application and driver are running on two different
cores as it reduces the number of syscalls to the kernel. The kernel
tells user space if it needs to be woken up by a syscall, and this
eliminates many of the syscalls.

This flag needs some simple driver support. If the driver does not
support this, the Rx flag is always zero and the Tx flag is always
one. This makes any application relying on this feature default to the
old behaviour of not requiring any syscalls in the Rx path and always
having to call sendto() in the Tx path.

For backwards compatibility reasons, this feature has to be explicitly
turned on using a new bind flag (XDP_USE_NEED_WAKEUP). I recommend
that you always turn it on as it so far always have had a positive
performance impact.

The name and inspiration of the flag has been taken from io_uring by
Jens Axboe. Details about this feature in io_uring can be found in
http://kernel.dk/io_uring.pdf, section 8.3.

Signed-off-by: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Acked-by: Jonathan Lemon &lt;jonathan.lemon@gmail.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</content>
</entry>
<entry>
<title>xsk: Add getsockopt XDP_OPTIONS</title>
<updated>2019-06-27T20:53:26+00:00</updated>
<author>
<name>Maxim Mikityanskiy</name>
<email>maximmi@mellanox.com</email>
</author>
<published>2019-06-26T14:35:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2640d3c8123223e0a205b2a25a446df6f072b3ea'/>
<id>urn:sha1:2640d3c8123223e0a205b2a25a446df6f072b3ea</id>
<content type='text'>
Make it possible for the application to determine whether the AF_XDP
socket is running in zero-copy mode. To achieve this, add a new
getsockopt option XDP_OPTIONS that returns flags. The only flag
supported for now is the zero-copy mode indicator.

Signed-off-by: Maxim Mikityanskiy &lt;maximmi@mellanox.com&gt;
Signed-off-by: Tariq Toukan &lt;tariqt@mellanox.com&gt;
Acked-by: Saeed Mahameed &lt;saeedm@mellanox.com&gt;
Acked-by: Björn Töpel &lt;bjorn.topel@intel.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</content>
</entry>
</feed>
