diff options
| author | Jason Xing <kernelxing@tencent.com> | 2026-05-02 23:07:15 +0300 |
|---|---|---|
| committer | Jakub Kicinski <kuba@kernel.org> | 2026-05-06 05:27:49 +0300 |
| commit | d73a9a63f9f7f7c17637731fd28daf3665992d1e (patch) | |
| tree | 662e600178e6ff9cf74406663371b2b182f5563a | |
| parent | 22675f07260ca26423851a42b553b0ea669228d1 (diff) | |
| download | linux-d73a9a63f9f7f7c17637731fd28daf3665992d1e.tar.xz | |
xsk: reject sw-csum UMEM binding to IFF_TX_SKB_NO_LINEAR devices
skb_checksum_help() is a common helper that writes the folded
16-bit checksum back via skb->data + csum_start + csum_offset,
i.e. it relies on the skb's linear head and fails (with WARN_ONCE
and -EINVAL) when skb_headlen() is 0.
AF_XDP generic xmit takes two very different paths depending on the
netdev. Drivers that advertise IFF_TX_SKB_NO_LINEAR (e.g. virtio_net)
skip the "copy payload into a linear head" step on purpose as a
performance optimisation: xsk_build_skb_zerocopy() only attaches UMEM
pages as frags and never calls skb_put(), so skb_headlen() stays 0
for the whole skb. For these skbs there is simply no linear area for
skb_checksum_help() to write the csum into - the sw-csum fallback is
structurally inapplicable.
The patch tries to catch this and reject the combination with error at
setup time. Rejecting at bind() converts this silent per-packet failure
into a synchronous, actionable -EOPNOTSUPP at setup time. HW csum and
launch_time metadata on IFF_TX_SKB_NO_LINEAR drivers are unaffected
because they do not call skb_checksum_help().
Without the patch, every descriptor carrying 'XDP_TX_METADATA |
XDP_TXMD_FLAGS_CHECKSUM' produces:
1) a WARN_ONCE "offset (N) >= skb_headlen() (0)" from skb_checksum_help(),
2) sendmsg() returning -EINVAL without consuming the descriptor
(invalid_descs is not incremented),
3) a wedged TX ring: __xsk_generic_xmit() does not advance the
consumer on non-EOVERFLOW errors, so the next sendmsg() re-reads
the same descriptor and re-hits the same WARN until the socket
is closed.
Closes: https://lore.kernel.org/all/20260419045822.843BFC2BCAF@smtp.kernel.org/#t
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Fixes: 30c3055f9c0d ("xsk: wrap generic metadata handling onto separate function")
Link: https://patch.msgid.link/20260502200722.53960-2-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| -rw-r--r-- | net/xdp/xsk_buff_pool.c | 3 |
1 files changed, 3 insertions, 0 deletions
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c index cd7bc50872f6..d981cfdd8535 100644 --- a/net/xdp/xsk_buff_pool.c +++ b/net/xdp/xsk_buff_pool.c @@ -175,6 +175,9 @@ int xp_assign_dev(struct xsk_buff_pool *pool, if (force_zc && force_copy) return -EINVAL; + if (pool->tx_sw_csum && (netdev->priv_flags & IFF_TX_SKB_NO_LINEAR)) + return -EOPNOTSUPP; + if (xsk_get_pool_from_qid(netdev, queue_id)) return -EBUSY; |
