author | Martin KaFai Lau <martin.lau@kernel.org> | 2025-08-18 20:29:43 +0300 |
---|---|---|
committer | Martin KaFai Lau <martin.lau@kernel.org> | 2025-08-18 20:29:44 +0300 |
commit | 7f7a958a6a2c9f0e2e82eaffdb5965238c735591 (patch) | |
tree | ecf7140610fa5b609b05f14fc710de3f1ad3f57b /rust/helpers/pid_namespace.c | |
parent | 8f5ae30d69d7543eee0d70083daf4de8fe15d585 (diff) | |
parent | 403fae59781fddc699af761f38ed024d3245096b (diff) | |
download | linux-7f7a958a6a2c9f0e2e82eaffdb5965238c735591.tar.xz |
Merge branch 'add-a-dynptr-type-for-skb-metadata-for-tc-bpf'
Jakub Sitnicki says:
====================
Add a dynptr type for skb metadata for TC BPF
TL;DR
-----
This is the first step in an effort which aims to enable skb metadata
access for all BPF programs which operate on an skb context.
By skb metadata we mean the custom metadata area which can be allocated
from an XDP program with the bpf_xdp_adjust_meta helper [1]. Network stack
code accesses it using the skb_metadata_* helpers.
Changelog
---------
Changes in v7:
- Make dynptr read-only for cloned skbs for now. (Martin)
- Extend tests for skb clones to cover writes to metadata.
- Drop Jesse's review stamp for patch 2 due to an update.
- Link to v6: https://lore.kernel.org/r/20250804-skb-metadata-thru-dynptr-v6-0-05da400bfa4b@cloudflare.com
Changes in v6:
- Enable CONFIG_NET_ACT_MIRRED for bpf selftests to fix CI failure
- Switch from u32 to matchall classifier, which bpf selftests already use
- Link to v5: https://lore.kernel.org/r/20250731-skb-metadata-thru-dynptr-v5-0-f02f6b5688dc@cloudflare.com
Changes in v5:
- Invalidate skb payload and metadata slices on write to metadata. (Martin)
- Drop redundant bounds check in bpf_skb_meta_*(). (Martin)
- Check for unexpected flags in __bpf_dynptr_write(). (Martin)
- Fold bpf_skb_meta_{load,store}_bytes() into callers.
- Add a test for metadata access when an skb clone has been modified.
- Drop Eduard's Ack for patch 3. Patch updated.
- Keep Eduard's Ack for patches 4-8.
- Add Jesse's stamp from an internal review.
- Link to v4: https://lore.kernel.org/r/20250723-skb-metadata-thru-dynptr-v4-0-a0fed48bcd37@cloudflare.com
Changes in v4:
- Kill bpf_dynptr_from_skb_meta_rdonly. Not needed for now. (Martin)
- Add a test to cover passing OOB offsets to dynptr ops. (Eduard)
- Factor out bounds checks from bpf_dynptr_{read,write,slice}. (Eduard)
- Squash patches:
bpf: Enable read access to skb metadata with bpf_dynptr_read
bpf: Enable write access to skb metadata with bpf_dynptr_write
bpf: Enable read-write access to skb metadata with dynptr slice
- Kept Eduard's Acks for v3 on unchanged patches.
- Link to v3: https://lore.kernel.org/r/20250721-skb-metadata-thru-dynptr-v3-0-e92be5534174@cloudflare.com
Changes in v3:
- Add a kfunc set for skb metadata access. Limited to TC BPF. (Martin)
- Drop patches related to skb metadata access outside of TC BPF:
net: Clear skb metadata on handover from device to protocol
selftests/bpf: Cover lack of access to skb metadata at ip layer
selftests/bpf: Count successful bpf program runs
- Link to v2: https://lore.kernel.org/r/20250716-skb-metadata-thru-dynptr-v2-0-5f580447e1df@cloudflare.com
Changes in v2:
- Switch to a dedicated dynptr type for skb metadata (Andrii)
- Add verifier test coverage since we now touch its code
- Add missing test coverage for bpf_dynptr_adjust and access at an offset
- Link to v1: https://lore.kernel.org/r/20250630-skb-metadata-thru-dynptr-v1-0-f17da13625d8@cloudflare.com
Overview
--------
Today, the skb metadata is accessible only to BPF TC ingress programs
through the __sk_buff->data_meta pointer. We propose a three-step plan to
make skb metadata available to all other BPF programs which operate on skb
objects:
1) Add a dynptr type for skb metadata (this patch set)
This is a preparatory step, but it also stands on its own. Here we
enable access to the skb metadata through a bpf_dynptr, the same way we
can already access the skb payload today.
As the next step (2), we want to relocate the metadata as skb travels
through the network stack in order to persist it. That will require a
safe way to access the metadata area irrespective of its location.
This is where the dynptr [2] comes into play. It solves exactly that
problem. A dynptr to skb metadata can be backed by a memory area that
resides in a different location depending on the code path.
2) Persist skb metadata past the TC hook (future)
Having the metadata in front of the packet headers as the skb travels
through the network stack is problematic - see the discussion of
alternative approaches below. Hence, we plan to relocate it as
necessary past the TC hook.
Where to relocate it? We don't know yet. There are a couple of
options: (i) move it to the top of skb headroom, or (ii) allocate
dedicated memory for it. They are not mutually exclusive. The right
solution might be a mix.
When to relocate it? That is also an open question. It could be done
during device to protocol handover or lazily when headers get pushed or
headroom gets resized.
3) skb dynptr for sockops, sk_lookup, etc. (future)
There are BPF program types which don't operate on __sk_buff context, but
either have, or could have, access to the skb itself. As a final touch,
we want to provide a way to create an skb metadata dynptr for these
program types.
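To illustrate why a dynptr suits step (2), here is a minimal userspace sketch of the idea, not kernel code. All names (struct meta_area, struct meta_dynptr, meta_dynptr_read, meta_dynptr_write, meta_area_relocate) are hypothetical and do not correspond to the kernel's struct bpf_dynptr or its helpers. The sketch assumes only what the text above states: access goes through a bounds-checked handle, so the backing memory can move without the accessor noticing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the metadata storage; NOT a kernel structure. */
struct meta_area {
	unsigned char *base;	/* current location of the metadata bytes */
	size_t len;		/* metadata length in bytes */
};

/* Hypothetical handle; the location is resolved on every access. */
struct meta_dynptr {
	struct meta_area *area;
};

/* Copy len bytes at offset off into dst. Returns -1 if out of bounds. */
static inline int meta_dynptr_read(void *dst, size_t len,
				   const struct meta_dynptr *ptr, size_t off)
{
	const struct meta_area *a = ptr->area;

	assert(dst && a);
	if (off > a->len || len > a->len - off)
		return -1;
	memcpy(dst, a->base + off, len);
	return 0;
}

/* Copy len bytes from src to offset off. Returns -1 if out of bounds. */
static inline int meta_dynptr_write(const struct meta_dynptr *ptr, size_t off,
				    const void *src, size_t len)
{
	const struct meta_area *a = ptr->area;

	assert(src && a);
	if (off > a->len || len > a->len - off)
		return -1;
	memcpy(a->base + off, src, len);
	return 0;
}

/*
 * Move the metadata to new storage, standing in for a relocation that
 * might happen past the TC hook. Existing handles stay valid because
 * they never cache the old address.
 */
static inline void meta_area_relocate(struct meta_area *a,
				      unsigned char *new_base)
{
	memmove(new_base, a->base, a->len);
	a->base = new_base;
}
```

A caller holding a struct meta_dynptr keeps using the same read and write calls before and after meta_area_relocate(). A raw pointer such as __sk_buff->data_meta cannot offer that, since it encodes the location directly.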
TIMTOWTDI
---------
Alternative approaches which we considered:
* Keep the metadata always in front of skb->data
We think it is a bad idea for two reasons, outlined below. Nevertheless we
are open to it, if necessary.
1) Performance concerns
It would require the network stack to move the metadata on each header
pull/push - see skb_reorder_vlan_header() [3] for an example. While
doable, there is an expected performance overhead.
2) Potential for bugs
In addition to updating skb_push/pull and pskb_expand_head, we would
need to audit any code paths which operate on skb->data pointer
directly without going through the helpers. This creates a "known
unknown" risk.
* Design a new custom metadata area from scratch
We have tried that in Arthur's patch set [4]. One of the outcomes of the
discussion there was that we don't want to have two places to store custom
metadata. Hence the change of approach to make the existing custom metadata
area work.
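The cost behind the first alternative can be sketched with a toy userspace buffer. This is an illustration only: struct toy_skb and both functions are hypothetical, not struct sk_buff or any kernel helper, and the model assumes the metadata sits immediately before the data pointer. It also overwrites the pulled header bytes, which a real network stack could not always afford.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy buffer; NOT struct sk_buff. */
struct toy_skb {
	unsigned char *data;	/* start of the current header */
	size_t meta_len;	/* metadata occupies data - meta_len .. data */
};

/* Pull n header bytes and keep the metadata adjacent to the new data. */
static inline void toy_pull_keep_meta(struct toy_skb *skb, size_t n)
{
	unsigned char *meta = skb->data - skb->meta_len;

	assert(skb->data);
	/* Source and destination may overlap, hence memmove. */
	memmove(meta + n, meta, skb->meta_len);
	skb->data += n;
}

/* Push n header bytes; the metadata has to move out of the way first. */
static inline void toy_push_keep_meta(struct toy_skb *skb, size_t n)
{
	unsigned char *meta = skb->data - skb->meta_len;

	assert(skb->data);
	memmove(meta - n, meta, skb->meta_len);
	skb->data -= n;
}
```

Every pull or push pays for a copy of meta_len bytes in this model. Any code that adjusts the data pointer directly, without going through such wrappers, would leave the metadata stranded.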
-jkbs
[1] https://docs.ebpf.io/linux/helper-function/bpf_xdp_adjust_meta/
[2] https://docs.ebpf.io/linux/concepts/dynptrs/
[3] https://elixir.bootlin.com/linux/v6.16-rc6/source/net/core/skbuff.c#L6211
[4] https://lore.kernel.org/all/20250422-afabre-traits-010-rfc2-v2-0-92bcc6b146c9@arthurfabre.com/
====================
Link: https://patch.msgid.link/20250814-skb-metadata-thru-dynptr-v7-0-8a39e636e0fb@cloudflare.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Diffstat (limited to 'rust/helpers/pid_namespace.c')
0 files changed, 0 insertions, 0 deletions