summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--Documentation/bpf/bpf_prog_run.rst117
-rw-r--r--Documentation/bpf/index.rst1
-rw-r--r--include/uapi/linux/bpf.h3
-rw-r--r--kernel/bpf/Kconfig1
-rw-r--r--kernel/bpf/syscall.c2
-rw-r--r--net/bpf/test_run.c334
-rw-r--r--tools/include/uapi/linux/bpf.h3
-rw-r--r--tools/lib/bpf/bpf.c1
-rw-r--r--tools/lib/bpf/bpf.h3
-rw-r--r--tools/testing/selftests/bpf/network_helpers.c86
-rw-r--r--tools/testing/selftests/bpf/network_helpers.h9
-rw-r--r--tools/testing/selftests/bpf/prog_tests/tc_redirect.c89
-rw-r--r--tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c177
-rw-r--r--tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c100
14 files changed, 821 insertions, 105 deletions
diff --git a/Documentation/bpf/bpf_prog_run.rst b/Documentation/bpf/bpf_prog_run.rst
new file mode 100644
index 000000000000..4868c909df5c
--- /dev/null
+++ b/Documentation/bpf/bpf_prog_run.rst
@@ -0,0 +1,117 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================
+Running BPF programs from userspace
+===================================
+
+This document describes the ``BPF_PROG_RUN`` facility for running BPF programs
+from userspace.
+
+.. contents::
+ :local:
+ :depth: 2
+
+
+Overview
+--------
+
+The ``BPF_PROG_RUN`` command can be used through the ``bpf()`` syscall to
+execute a BPF program in the kernel and return the results to userspace. This
+can be used to unit test BPF programs against user-supplied context objects, and
+as way to explicitly execute programs in the kernel for their side effects. The
+command was previously named ``BPF_PROG_TEST_RUN``, and both constants continue
+to be defined in the UAPI header, aliased to the same value.
+
+The ``BPF_PROG_RUN`` command can be used to execute BPF programs of the
+following types:
+
+- ``BPF_PROG_TYPE_SOCKET_FILTER``
+- ``BPF_PROG_TYPE_SCHED_CLS``
+- ``BPF_PROG_TYPE_SCHED_ACT``
+- ``BPF_PROG_TYPE_XDP``
+- ``BPF_PROG_TYPE_SK_LOOKUP``
+- ``BPF_PROG_TYPE_CGROUP_SKB``
+- ``BPF_PROG_TYPE_LWT_IN``
+- ``BPF_PROG_TYPE_LWT_OUT``
+- ``BPF_PROG_TYPE_LWT_XMIT``
+- ``BPF_PROG_TYPE_LWT_SEG6LOCAL``
+- ``BPF_PROG_TYPE_FLOW_DISSECTOR``
+- ``BPF_PROG_TYPE_STRUCT_OPS``
+- ``BPF_PROG_TYPE_RAW_TRACEPOINT``
+- ``BPF_PROG_TYPE_SYSCALL``
+
+When using the ``BPF_PROG_RUN`` command, userspace supplies an input context
+object and (for program types operating on network packets) a buffer containing
+the packet data that the BPF program will operate on. The kernel will then
+execute the program and return the results to userspace. Note that programs will
+not have any side effects while being run in this mode; in particular, packets
+will not actually be redirected or dropped, the program return code will just be
+returned to userspace. A separate mode for live execution of XDP programs is
+provided, documented separately below.
+
+Running XDP programs in "live frame mode"
+-----------------------------------------
+
+The ``BPF_PROG_RUN`` command has a separate mode for running live XDP programs,
+which can be used to execute XDP programs in a way where packets will actually
+be processed by the kernel after the execution of the XDP program as if they
+arrived on a physical interface. This mode is activated by setting the
+``BPF_F_TEST_XDP_LIVE_FRAMES`` flag when supplying an XDP program to
+``BPF_PROG_RUN``.
+
+The live packet mode is optimised for high performance execution of the supplied
+XDP program many times (suitable for, e.g., running as a traffic generator),
+which means the semantics are not quite as straight-forward as the regular test
+run mode. Specifically:
+
+- When executing an XDP program in live frame mode, the result of the execution
+ will not be returned to userspace; instead, the kernel will perform the
+ operation indicated by the program's return code (drop the packet, redirect
+ it, etc). For this reason, setting the ``data_out`` or ``ctx_out`` attributes
+ in the syscall parameters when running in this mode will be rejected. In
+ addition, not all failures will be reported back to userspace directly;
+ specifically, only fatal errors in setup or during execution (like memory
+ allocation errors) will halt execution and return an error. If an error occurs
+ in packet processing, like a failure to redirect to a given interface,
+ execution will continue with the next repetition; these errors can be detected
+ via the same trace points as for regular XDP programs.
+
+- Userspace can supply an ifindex as part of the context object, just like in
+ the regular (non-live) mode. The XDP program will be executed as though the
+ packet arrived on this interface; i.e., the ``ingress_ifindex`` of the context
+ object will point to that interface. Furthermore, if the XDP program returns
+ ``XDP_PASS``, the packet will be injected into the kernel networking stack as
+ though it arrived on that ifindex, and if it returns ``XDP_TX``, the packet
+ will be transmitted *out* of that same interface. Do note, though, that
+ because the program execution is not happening in driver context, an
+ ``XDP_TX`` is actually turned into the same action as an ``XDP_REDIRECT`` to
+ that same interface (i.e., it will only work if the driver has support for the
+ ``ndo_xdp_xmit`` driver op).
+
+- When running the program with multiple repetitions, the execution will happen
+ in batches. The batch size defaults to 64 packets (which is same as the
+ maximum NAPI receive batch size), but can be specified by userspace through
+ the ``batch_size`` parameter, up to a maximum of 256 packets. For each batch,
+ the kernel executes the XDP program repeatedly, each invocation getting a
+ separate copy of the packet data. For each repetition, if the program drops
+ the packet, the data page is immediately recycled (see below). Otherwise, the
+ packet is buffered until the end of the batch, at which point all packets
+ buffered this way during the batch are transmitted at once.
+
+- When setting up the test run, the kernel will initialise a pool of memory
+ pages of the same size as the batch size. Each memory page will be initialised
+ with the initial packet data supplied by userspace at ``BPF_PROG_RUN``
+ invocation. When possible, the pages will be recycled on future program
+ invocations, to improve performance. Pages will generally be recycled a full
+ batch at a time, except when a packet is dropped (by return code or because
+ of, say, a redirection error), in which case that page will be recycled
+ immediately. If a packet ends up being passed to the regular networking stack
+ (because the XDP program returns ``XDP_PASS``, or because it ends up being
+ redirected to an interface that injects it into the stack), the page will be
+ released and a new one will be allocated when the pool is empty.
+
+ When recycling, the page content is not rewritten; only the packet boundary
+ pointers (``data``, ``data_end`` and ``data_meta``) in the context object will
+ be reset to the original values. This means that if a program rewrites the
+ packet contents, it has to be prepared to see either the original content or
+ the modified version on subsequent invocations.
diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst
index ef5c996547ec..96056a7447c7 100644
--- a/Documentation/bpf/index.rst
+++ b/Documentation/bpf/index.rst
@@ -21,6 +21,7 @@ that goes into great technical depth about the BPF Architecture.
helpers
programs
maps
+ bpf_prog_run
classic_vs_extended.rst
bpf_licensing
test_debug
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4eebea830613..bc23020b638d 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1232,6 +1232,8 @@ enum {
/* If set, run the test on the cpu specified by bpf_attr.test.cpu */
#define BPF_F_TEST_RUN_ON_CPU (1U << 0)
+/* If set, XDP frames will be transmitted after processing */
+#define BPF_F_TEST_XDP_LIVE_FRAMES (1U << 1)
/* type for BPF_ENABLE_STATS */
enum bpf_stats_type {
@@ -1393,6 +1395,7 @@ union bpf_attr {
__aligned_u64 ctx_out;
__u32 flags;
__u32 cpu;
+ __u32 batch_size;
} test;
struct { /* anonymous struct used by BPF_*_GET_*_ID */
diff --git a/kernel/bpf/Kconfig b/kernel/bpf/Kconfig
index c3cf0b86eeb2..d56ee177d5f8 100644
--- a/kernel/bpf/Kconfig
+++ b/kernel/bpf/Kconfig
@@ -30,6 +30,7 @@ config BPF_SYSCALL
select TASKS_TRACE_RCU
select BINARY_PRINTF
select NET_SOCK_MSG if NET
+ select PAGE_POOL if NET
default n
help
Enable the bpf() system call that allows to manipulate BPF programs
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index db402ebc5570..9beb585be5a6 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3336,7 +3336,7 @@ static int bpf_prog_query(const union bpf_attr *attr,
}
}
-#define BPF_PROG_TEST_RUN_LAST_FIELD test.cpu
+#define BPF_PROG_TEST_RUN_LAST_FIELD test.batch_size
static int bpf_prog_test_run(const union bpf_attr *attr,
union bpf_attr __user *uattr)
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index ba410b069824..25169908be4a 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -15,6 +15,7 @@
#include <net/sock.h>
#include <net/tcp.h>
#include <net/net_namespace.h>
+#include <net/page_pool.h>
#include <linux/error-injection.h>
#include <linux/smp.h>
#include <linux/sock_diag.h>
@@ -53,10 +54,11 @@ static void bpf_test_timer_leave(struct bpf_test_timer *t)
rcu_read_unlock();
}
-static bool bpf_test_timer_continue(struct bpf_test_timer *t, u32 repeat, int *err, u32 *duration)
+static bool bpf_test_timer_continue(struct bpf_test_timer *t, int iterations,
+ u32 repeat, int *err, u32 *duration)
__must_hold(rcu)
{
- t->i++;
+ t->i += iterations;
if (t->i >= repeat) {
/* We're done. */
t->time_spent += ktime_get_ns() - t->time_start;
@@ -88,6 +90,286 @@ reset:
return false;
}
+/* We put this struct at the head of each page with a context and frame
+ * initialised when the page is allocated, so we don't have to do this on each
+ * repetition of the test run.
+ */
+struct xdp_page_head {
+ struct xdp_buff orig_ctx;
+ struct xdp_buff ctx;
+ struct xdp_frame frm;
+ u8 data[];
+};
+
+struct xdp_test_data {
+ struct xdp_buff *orig_ctx;
+ struct xdp_rxq_info rxq;
+ struct net_device *dev;
+ struct page_pool *pp;
+ struct xdp_frame **frames;
+ struct sk_buff **skbs;
+ u32 batch_size;
+ u32 frame_cnt;
+};
+
+#define TEST_XDP_FRAME_SIZE (PAGE_SIZE - sizeof(struct xdp_page_head) \
+ - sizeof(struct skb_shared_info))
+#define TEST_XDP_MAX_BATCH 256
+
+static void xdp_test_run_init_page(struct page *page, void *arg)
+{
+ struct xdp_page_head *head = phys_to_virt(page_to_phys(page));
+ struct xdp_buff *new_ctx, *orig_ctx;
+ u32 headroom = XDP_PACKET_HEADROOM;
+ struct xdp_test_data *xdp = arg;
+ size_t frm_len, meta_len;
+ struct xdp_frame *frm;
+ void *data;
+
+ orig_ctx = xdp->orig_ctx;
+ frm_len = orig_ctx->data_end - orig_ctx->data_meta;
+ meta_len = orig_ctx->data - orig_ctx->data_meta;
+ headroom -= meta_len;
+
+ new_ctx = &head->ctx;
+ frm = &head->frm;
+ data = &head->data;
+ memcpy(data + headroom, orig_ctx->data_meta, frm_len);
+
+ xdp_init_buff(new_ctx, TEST_XDP_FRAME_SIZE, &xdp->rxq);
+ xdp_prepare_buff(new_ctx, data, headroom, frm_len, true);
+ new_ctx->data = new_ctx->data_meta + meta_len;
+
+ xdp_update_frame_from_buff(new_ctx, frm);
+ frm->mem = new_ctx->rxq->mem;
+
+ memcpy(&head->orig_ctx, new_ctx, sizeof(head->orig_ctx));
+}
+
+static int xdp_test_run_setup(struct xdp_test_data *xdp, struct xdp_buff *orig_ctx)
+{
+ struct xdp_mem_info mem = {};
+ struct page_pool *pp;
+ int err = -ENOMEM;
+ struct page_pool_params pp_params = {
+ .order = 0,
+ .flags = 0,
+ .pool_size = xdp->batch_size,
+ .nid = NUMA_NO_NODE,
+ .max_len = TEST_XDP_FRAME_SIZE,
+ .init_callback = xdp_test_run_init_page,
+ .init_arg = xdp,
+ };
+
+ xdp->frames = kvmalloc_array(xdp->batch_size, sizeof(void *), GFP_KERNEL);
+ if (!xdp->frames)
+ return -ENOMEM;
+
+ xdp->skbs = kvmalloc_array(xdp->batch_size, sizeof(void *), GFP_KERNEL);
+ if (!xdp->skbs)
+ goto err_skbs;
+
+ pp = page_pool_create(&pp_params);
+ if (IS_ERR(pp)) {
+ err = PTR_ERR(pp);
+ goto err_pp;
+ }
+
+ /* will copy 'mem.id' into pp->xdp_mem_id */
+ err = xdp_reg_mem_model(&mem, MEM_TYPE_PAGE_POOL, pp);
+ if (err)
+ goto err_mmodel;
+
+ xdp->pp = pp;
+
+ /* We create a 'fake' RXQ referencing the original dev, but with an
+ * xdp_mem_info pointing to our page_pool
+ */
+ xdp_rxq_info_reg(&xdp->rxq, orig_ctx->rxq->dev, 0, 0);
+ xdp->rxq.mem.type = MEM_TYPE_PAGE_POOL;
+ xdp->rxq.mem.id = pp->xdp_mem_id;
+ xdp->dev = orig_ctx->rxq->dev;
+ xdp->orig_ctx = orig_ctx;
+
+ return 0;
+
+err_mmodel:
+ page_pool_destroy(pp);
+err_pp:
+ kfree(xdp->skbs);
+err_skbs:
+ kfree(xdp->frames);
+ return err;
+}
+
+static void xdp_test_run_teardown(struct xdp_test_data *xdp)
+{
+ page_pool_destroy(xdp->pp);
+ kfree(xdp->frames);
+ kfree(xdp->skbs);
+}
+
+static bool ctx_was_changed(struct xdp_page_head *head)
+{
+ return head->orig_ctx.data != head->ctx.data ||
+ head->orig_ctx.data_meta != head->ctx.data_meta ||
+ head->orig_ctx.data_end != head->ctx.data_end;
+}
+
+static void reset_ctx(struct xdp_page_head *head)
+{
+ if (likely(!ctx_was_changed(head)))
+ return;
+
+ head->ctx.data = head->orig_ctx.data;
+ head->ctx.data_meta = head->orig_ctx.data_meta;
+ head->ctx.data_end = head->orig_ctx.data_end;
+ xdp_update_frame_from_buff(&head->ctx, &head->frm);
+}
+
+static int xdp_recv_frames(struct xdp_frame **frames, int nframes,
+ struct sk_buff **skbs,
+ struct net_device *dev)
+{
+ gfp_t gfp = __GFP_ZERO | GFP_ATOMIC;
+ int i, n;
+ LIST_HEAD(list);
+
+ n = kmem_cache_alloc_bulk(skbuff_head_cache, gfp, nframes, (void **)skbs);
+ if (unlikely(n == 0)) {
+ for (i = 0; i < nframes; i++)
+ xdp_return_frame(frames[i]);
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < nframes; i++) {
+ struct xdp_frame *xdpf = frames[i];
+ struct sk_buff *skb = skbs[i];
+
+ skb = __xdp_build_skb_from_frame(xdpf, skb, dev);
+ if (!skb) {
+ xdp_return_frame(xdpf);
+ continue;
+ }
+
+ list_add_tail(&skb->list, &list);
+ }
+ netif_receive_skb_list(&list);
+
+ return 0;
+}
+
+static int xdp_test_run_batch(struct xdp_test_data *xdp, struct bpf_prog *prog,
+ u32 repeat)
+{
+ struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
+ int err = 0, act, ret, i, nframes = 0, batch_sz;
+ struct xdp_frame **frames = xdp->frames;
+ struct xdp_page_head *head;
+ struct xdp_frame *frm;
+ bool redirect = false;
+ struct xdp_buff *ctx;
+ struct page *page;
+
+ batch_sz = min_t(u32, repeat, xdp->batch_size);
+
+ local_bh_disable();
+ xdp_set_return_frame_no_direct();
+
+ for (i = 0; i < batch_sz; i++) {
+ page = page_pool_dev_alloc_pages(xdp->pp);
+ if (!page) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ head = phys_to_virt(page_to_phys(page));
+ reset_ctx(head);
+ ctx = &head->ctx;
+ frm = &head->frm;
+ xdp->frame_cnt++;
+
+ act = bpf_prog_run_xdp(prog, ctx);
+
+ /* if program changed pkt bounds we need to update the xdp_frame */
+ if (unlikely(ctx_was_changed(head))) {
+ ret = xdp_update_frame_from_buff(ctx, frm);
+ if (ret) {
+ xdp_return_buff(ctx);
+ continue;
+ }
+ }
+
+ switch (act) {
+ case XDP_TX:
+ /* we can't do a real XDP_TX since we're not in the
+ * driver, so turn it into a REDIRECT back to the same
+ * index
+ */
+ ri->tgt_index = xdp->dev->ifindex;
+ ri->map_id = INT_MAX;
+ ri->map_type = BPF_MAP_TYPE_UNSPEC;
+ fallthrough;
+ case XDP_REDIRECT:
+ redirect = true;
+ ret = xdp_do_redirect_frame(xdp->dev, ctx, frm, prog);
+ if (ret)
+ xdp_return_buff(ctx);
+ break;
+ case XDP_PASS:
+ frames[nframes++] = frm;
+ break;
+ default:
+ bpf_warn_invalid_xdp_action(NULL, prog, act);
+ fallthrough;
+ case XDP_DROP:
+ xdp_return_buff(ctx);
+ break;
+ }
+ }
+
+out:
+ if (redirect)
+ xdp_do_flush();
+ if (nframes) {
+ ret = xdp_recv_frames(frames, nframes, xdp->skbs, xdp->dev);
+ if (ret)
+ err = ret;
+ }
+
+ xdp_clear_return_frame_no_direct();
+ local_bh_enable();
+ return err;
+}
+
+static int bpf_test_run_xdp_live(struct bpf_prog *prog, struct xdp_buff *ctx,
+ u32 repeat, u32 batch_size, u32 *time)
+
+{
+ struct xdp_test_data xdp = { .batch_size = batch_size };
+ struct bpf_test_timer t = { .mode = NO_MIGRATE };
+ int ret;
+
+ if (!repeat)
+ repeat = 1;
+
+ ret = xdp_test_run_setup(&xdp, ctx);
+ if (ret)
+ return ret;
+
+ bpf_test_timer_enter(&t);
+ do {
+ xdp.frame_cnt = 0;
+ ret = xdp_test_run_batch(&xdp, prog, repeat - t.i);
+ if (unlikely(ret < 0))
+ break;
+ } while (bpf_test_timer_continue(&t, xdp.frame_cnt, repeat, &ret, time));
+ bpf_test_timer_leave(&t);
+
+ xdp_test_run_teardown(&xdp);
+ return ret;
+}
+
static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat,
u32 *retval, u32 *time, bool xdp)
{
@@ -119,7 +401,7 @@ static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat,
*retval = bpf_prog_run_xdp(prog, ctx);
else
*retval = bpf_prog_run(prog, ctx);
- } while (bpf_test_timer_continue(&t, repeat, &ret, time));
+ } while (bpf_test_timer_continue(&t, 1, repeat, &ret, time));
bpf_reset_run_ctx(old_ctx);
bpf_test_timer_leave(&t);
@@ -446,7 +728,7 @@ int bpf_prog_test_run_tracing(struct bpf_prog *prog,
int b = 2, err = -EFAULT;
u32 retval = 0;
- if (kattr->test.flags || kattr->test.cpu)
+ if (kattr->test.flags || kattr->test.cpu || kattr->test.batch_size)
return -EINVAL;
switch (prog->expected_attach_type) {
@@ -510,7 +792,7 @@ int bpf_prog_test_run_raw_tp(struct bpf_prog *prog,
/* doesn't support data_in/out, ctx_out, duration, or repeat */
if (kattr->test.data_in || kattr->test.data_out ||
kattr->test.ctx_out || kattr->test.duration ||
- kattr->test.repeat)
+ kattr->test.repeat || kattr->test.batch_size)
return -EINVAL;
if (ctx_size_in < prog->aux->max_ctx_offset ||
@@ -741,7 +1023,7 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
void *data;
int ret;
- if (kattr->test.flags || kattr->test.cpu)
+ if (kattr->test.flags || kattr->test.cpu || kattr->test.batch_size)
return -EINVAL;
data = bpf_test_init(kattr, kattr->test.data_size_in,
@@ -922,7 +1204,9 @@ static void xdp_convert_buff_to_md(struct xdp_buff *xdp, struct xdp_md *xdp_md)
int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
union bpf_attr __user *uattr)
{
+ bool do_live = (kattr->test.flags & BPF_F_TEST_XDP_LIVE_FRAMES);
u32 tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+ u32 batch_size = kattr->test.batch_size;
u32 size = kattr->test.data_size_in;
u32 headroom = XDP_PACKET_HEADROOM;
u32 retval, duration, max_data_sz;
@@ -938,6 +1222,18 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
prog->expected_attach_type == BPF_XDP_CPUMAP)
return -EINVAL;
+ if (kattr->test.flags & ~BPF_F_TEST_XDP_LIVE_FRAMES)
+ return -EINVAL;
+
+ if (do_live) {
+ if (!batch_size)
+ batch_size = NAPI_POLL_WEIGHT;
+ else if (batch_size > TEST_XDP_MAX_BATCH)
+ return -E2BIG;
+ } else if (batch_size) {
+ return -EINVAL;
+ }
+
ctx = bpf_ctx_init(kattr, sizeof(struct xdp_md));
if (IS_ERR(ctx))
return PTR_ERR(ctx);
@@ -946,14 +1242,20 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
/* There can't be user provided data before the meta data */
if (ctx->data_meta || ctx->data_end != size ||
ctx->data > ctx->data_end ||
- unlikely(xdp_metalen_invalid(ctx->data)))
+ unlikely(xdp_metalen_invalid(ctx->data)) ||
+ (do_live && (kattr->test.data_out || kattr->test.ctx_out)))
goto free_ctx;
/* Meta data is allocated from the headroom */
headroom -= ctx->data;
}
max_data_sz = 4096 - headroom - tailroom;
- size = min_t(u32, size, max_data_sz);
+ if (size > max_data_sz) {
+ /* disallow live data mode for jumbo frames */
+ if (do_live)
+ goto free_ctx;
+ size = max_data_sz;
+ }
data = bpf_test_init(kattr, size, max_data_sz, headroom, tailroom);
if (IS_ERR(data)) {
@@ -1011,7 +1313,10 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
if (repeat > 1)
bpf_prog_change_xdp(NULL, prog);
- ret = bpf_test_run(prog, &xdp, repeat, &retval, &duration, true);
+ if (do_live)
+ ret = bpf_test_run_xdp_live(prog, &xdp, repeat, batch_size, &duration);
+ else
+ ret = bpf_test_run(prog, &xdp, repeat, &retval, &duration, true);
/* We convert the xdp_buff back to an xdp_md before checking the return
* code so the reference count of any held netdevice will be decremented
* even if the test run failed.
@@ -1073,7 +1378,7 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
if (prog->type != BPF_PROG_TYPE_FLOW_DISSECTOR)
return -EINVAL;
- if (kattr->test.flags || kattr->test.cpu)
+ if (kattr->test.flags || kattr->test.cpu || kattr->test.batch_size)
return -EINVAL;
if (size < ETH_HLEN)
@@ -1108,7 +1413,7 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
do {
retval = bpf_flow_dissect(prog, &ctx, eth->h_proto, ETH_HLEN,
size, flags);
- } while (bpf_test_timer_continue(&t, repeat, &ret, &duration));
+ } while (bpf_test_timer_continue(&t, 1, repeat, &ret, &duration));
bpf_test_timer_leave(&t);
if (ret < 0)
@@ -1140,7 +1445,7 @@ int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog, const union bpf_attr *kat
if (prog->type != BPF_PROG_TYPE_SK_LOOKUP)
return -EINVAL;
- if (kattr->test.flags || kattr->test.cpu)
+ if (kattr->test.flags || kattr->test.cpu || kattr->test.batch_size)
return -EINVAL;
if (kattr->test.data_in || kattr->test.data_size_in || kattr->test.data_out ||
@@ -1203,7 +1508,7 @@ int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog, const union bpf_attr *kat
do {
ctx.selected_sk = NULL;
retval = BPF_PROG_SK_LOOKUP_RUN_ARRAY(progs, ctx, bpf_prog_run);
- } while (bpf_test_timer_continue(&t, repeat, &ret, &duration));
+ } while (bpf_test_timer_continue(&t, 1, repeat, &ret, &duration));
bpf_test_timer_leave(&t);
if (ret < 0)
@@ -1242,7 +1547,8 @@ int bpf_prog_test_run_syscall(struct bpf_prog *prog,
/* doesn't support data_in/out, ctx_out, duration, or repeat or flags */
if (kattr->test.data_in || kattr->test.data_out ||
kattr->test.ctx_out || kattr->test.duration ||
- kattr->test.repeat || kattr->test.flags)
+ kattr->test.repeat || kattr->test.flags ||
+ kattr->test.batch_size)
return -EINVAL;
if (ctx_size_in < prog->aux->max_ctx_offset ||
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 4eebea830613..bc23020b638d 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1232,6 +1232,8 @@ enum {
/* If set, run the test on the cpu specified by bpf_attr.test.cpu */
#define BPF_F_TEST_RUN_ON_CPU (1U << 0)
+/* If set, XDP frames will be transmitted after processing */
+#define BPF_F_TEST_XDP_LIVE_FRAMES (1U << 1)
/* type for BPF_ENABLE_STATS */
enum bpf_stats_type {
@@ -1393,6 +1395,7 @@ union bpf_attr {
__aligned_u64 ctx_out;
__u32 flags;
__u32 cpu;
+ __u32 batch_size;
} test;
struct { /* anonymous struct used by BPF_*_GET_*_ID */
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 3c7c180294fa..f69ce3a01385 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -995,6 +995,7 @@ int bpf_prog_test_run_opts(int prog_fd, struct bpf_test_run_opts *opts)
memset(&attr, 0, sizeof(attr));
attr.test.prog_fd = prog_fd;
+ attr.test.batch_size = OPTS_GET(opts, batch_size, 0);
attr.test.cpu = OPTS_GET(opts, cpu, 0);
attr.test.flags = OPTS_GET(opts, flags, 0);
attr.test.repeat = OPTS_GET(opts, repeat, 0);
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 16b21757b8bf..5253cb4a4c0a 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -512,8 +512,9 @@ struct bpf_test_run_opts {
__u32 duration; /* out: average per repetition in ns */
__u32 flags;
__u32 cpu;
+ __u32 batch_size;
};
-#define bpf_test_run_opts__last_field cpu
+#define bpf_test_run_opts__last_field batch_size
LIBBPF_API int bpf_prog_test_run_opts(int prog_fd,
struct bpf_test_run_opts *opts);
diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c
index 6db1af8fdee7..2bb1f9b3841d 100644
--- a/tools/testing/selftests/bpf/network_helpers.c
+++ b/tools/testing/selftests/bpf/network_helpers.c
@@ -1,18 +1,25 @@
// SPDX-License-Identifier: GPL-2.0-only
+#define _GNU_SOURCE
+
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
+#include <sched.h>
#include <arpa/inet.h>
+#include <sys/mount.h>
+#include <sys/stat.h>
#include <linux/err.h>
#include <linux/in.h>
#include <linux/in6.h>
+#include <linux/limits.h>
#include "bpf_util.h"
#include "network_helpers.h"
+#include "test_progs.h"
#define clean_errno() (errno == 0 ? "None" : strerror(errno))
#define log_err(MSG, ...) ({ \
@@ -356,3 +363,82 @@ char *ping_command(int family)
}
return "ping";
}
+
+struct nstoken {
+ int orig_netns_fd;
+};
+
+static int setns_by_fd(int nsfd)
+{
+ int err;
+
+ err = setns(nsfd, CLONE_NEWNET);
+ close(nsfd);
+
+ if (!ASSERT_OK(err, "setns"))
+ return err;
+
+ /* Switch /sys to the new namespace so that e.g. /sys/class/net
+ * reflects the devices in the new namespace.
+ */
+ err = unshare(CLONE_NEWNS);
+ if (!ASSERT_OK(err, "unshare"))
+ return err;
+
+ /* Make our /sys mount private, so the following umount won't
+ * trigger the global umount in case it's shared.
+ */
+ err = mount("none", "/sys", NULL, MS_PRIVATE, NULL);
+ if (!ASSERT_OK(err, "remount private /sys"))
+ return err;
+
+ err = umount2("/sys", MNT_DETACH);
+ if (!ASSERT_OK(err, "umount2 /sys"))
+ return err;
+
+ err = mount("sysfs", "/sys", "sysfs", 0, NULL);
+ if (!ASSERT_OK(err, "mount /sys"))
+ return err;
+
+ err = mount("bpffs", "/sys/fs/bpf", "bpf", 0, NULL);
+ if (!ASSERT_OK(err, "mount /sys/fs/bpf"))
+ return err;
+
+ return 0;
+}
+
+struct nstoken *open_netns(const char *name)
+{
+ int nsfd;
+ char nspath[PATH_MAX];
+ int err;
+ struct nstoken *token;
+
+ token = malloc(sizeof(struct nstoken));
+ if (!ASSERT_OK_PTR(token, "malloc token"))
+ return NULL;
+
+ token->orig_netns_fd = open("/proc/self/ns/net", O_RDONLY);
+ if (!ASSERT_GE(token->orig_netns_fd, 0, "open /proc/self/ns/net"))
+ goto fail;
+
+ snprintf(nspath, sizeof(nspath), "%s/%s", "/var/run/netns", name);
+ nsfd = open(nspath, O_RDONLY | O_CLOEXEC);
+ if (!ASSERT_GE(nsfd, 0, "open netns fd"))
+ goto fail;
+
+ err = setns_by_fd(nsfd);
+ if (!ASSERT_OK(err, "setns_by_fd"))
+ goto fail;
+
+ return token;
+fail:
+ free(token);
+ return NULL;
+}
+
+void close_netns(struct nstoken *token)
+{
+ ASSERT_OK(setns_by_fd(token->orig_netns_fd), "setns_by_fd");
+ free(token);
+}
diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h
index d198181a5648..a4b3b2f9877b 100644
--- a/tools/testing/selftests/bpf/network_helpers.h
+++ b/tools/testing/selftests/bpf/network_helpers.h
@@ -55,4 +55,13 @@ int make_sockaddr(int family, const char *addr_str, __u16 port,
struct sockaddr_storage *addr, socklen_t *len);
char *ping_command(int family);
+struct nstoken;
+/**
+ * open_netns() - Switch to specified network namespace by name.
+ *
+ * Returns token with which to restore the original namespace
+ * using close_netns().
+ */
+struct nstoken *open_netns(const char *name);
+void close_netns(struct nstoken *token);
#endif
diff --git a/tools/testing/selftests/bpf/prog_tests/tc_redirect.c b/tools/testing/selftests/bpf/prog_tests/tc_redirect.c
index 2b255e28ed26..7ad66a247c02 100644
--- a/tools/testing/selftests/bpf/prog_tests/tc_redirect.c
+++ b/tools/testing/selftests/bpf/prog_tests/tc_redirect.c
@@ -10,8 +10,6 @@
* to drop unexpected traffic.
*/
-#define _GNU_SOURCE
-
#include <arpa/inet.h>
#include <linux/if.h>
#include <linux/if_tun.h>
@@ -19,10 +17,8 @@
#include <linux/sysctl.h>
#include <linux/time_types.h>
#include <linux/net_tstamp.h>
-#include <sched.h>
#include <stdbool.h>
#include <stdio.h>
-#include <sys/mount.h>
#include <sys/stat.h>
#include <unistd.h>
@@ -92,91 +88,6 @@ static int write_file(const char *path, const char *newval)
return 0;
}
-struct nstoken {
- int orig_netns_fd;
-};
-
-static int setns_by_fd(int nsfd)
-{
- int err;
-
- err = setns(nsfd, CLONE_NEWNET);
- close(nsfd);
-
- if (!ASSERT_OK(err, "setns"))
- return err;
-
- /* Switch /sys to the new namespace so that e.g. /sys/class/net
- * reflects the devices in the new namespace.
- */
- err = unshare(CLONE_NEWNS);
- if (!ASSERT_OK(err, "unshare"))
- return err;
-
- /* Make our /sys mount private, so the following umount won't
- * trigger the global umount in case it's shared.
- */
- err = mount("none", "/sys", NULL, MS_PRIVATE, NULL);
- if (!ASSERT_OK(err, "remount private /sys"))
- return err;
-
- err = umount2("/sys", MNT_DETACH);
- if (!ASSERT_OK(err, "umount2 /sys"))
- return err;
-
- err = mount("sysfs", "/sys", "sysfs", 0, NULL);
- if (!ASSERT_OK(err, "mount /sys"))
- return err;
-
- err = mount("bpffs", "/sys/fs/bpf", "bpf", 0, NULL);
- if (!ASSERT_OK(err, "mount /sys/fs/bpf"))
- return err;
-
- return 0;
-}
-
-/**
- * open_netns() - Switch to specified network namespace by name.
- *
- * Returns token with which to restore the original namespace
- * using close_netns().
- */
-static struct nstoken *open_netns(const char *name)
-{
- int nsfd;
- char nspath[PATH_MAX];
- int err;
- struct nstoken *token;
-
- token = calloc(1, sizeof(struct nstoken));
- if (!ASSERT_OK_PTR(token, "malloc token"))
- return NULL;
-
- token->orig_netns_fd = open("/proc/self/ns/net", O_RDONLY);
- if (!ASSERT_GE(token->orig_netns_fd, 0, "open /proc/self/ns/net"))
- goto fail;
-
- snprintf(nspath, sizeof(nspath), "%s/%s", "/var/run/netns", name);
- nsfd = open(nspath, O_RDONLY | O_CLOEXEC);
- if (!ASSERT_GE(nsfd, 0, "open netns fd"))
- goto fail;
-
- err = setns_by_fd(nsfd);
- if (!ASSERT_OK(err, "setns_by_fd"))
- goto fail;
-
- return token;
-fail:
- free(token);
- return NULL;
-}
-
-static void close_netns(struct nstoken *token)
-{
- ASSERT_OK(setns_by_fd(token->orig_netns_fd), "setns_by_fd");
- free(token);
-}
-
static int netns_setup_namespaces(const char *verb)
{
const char * const *ns = namespaces;
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
new file mode 100644
index 000000000000..9926b07e38c8
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
@@ -0,0 +1,177 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+#include <network_helpers.h>
+#include <net/if.h>
+#include <linux/if_ether.h>
+#include <linux/if_packet.h>
+#include <linux/ipv6.h>
+#include <linux/in6.h>
+#include <linux/udp.h>
+#include <bpf/bpf_endian.h>
+#include "test_xdp_do_redirect.skel.h"
+
+#define SYS(fmt, ...) \
+ ({ \
+ char cmd[1024]; \
+ snprintf(cmd, sizeof(cmd), fmt, ##__VA_ARGS__); \
+ if (!ASSERT_OK(system(cmd), cmd)) \
+ goto out; \
+ })
+
+struct udp_packet {
+ struct ethhdr eth;
+ struct ipv6hdr iph;
+ struct udphdr udp;
+ __u8 payload[64 - sizeof(struct udphdr)
+ - sizeof(struct ethhdr) - sizeof(struct ipv6hdr)];
+} __packed;
+
+static struct udp_packet pkt_udp = {
+ .eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
+ .eth.h_dest = {0x00, 0x11, 0x22, 0x33, 0x44, 0x55},
+ .eth.h_source = {0x66, 0x77, 0x88, 0x99, 0xaa, 0xbb},
+ .iph.version = 6,
+ .iph.nexthdr = IPPROTO_UDP,
+ .iph.payload_len = bpf_htons(sizeof(struct udp_packet)
+ - offsetof(struct udp_packet, udp)),
+ .iph.hop_limit = 2,
+ .iph.saddr.s6_addr16 = {bpf_htons(0xfc00), 0, 0, 0, 0, 0, 0, bpf_htons(1)},
+ .iph.daddr.s6_addr16 = {bpf_htons(0xfc00), 0, 0, 0, 0, 0, 0, bpf_htons(2)},
+ .udp.source = bpf_htons(1),
+ .udp.dest = bpf_htons(1),
+ .udp.len = bpf_htons(sizeof(struct udp_packet)
+ - offsetof(struct udp_packet, udp)),
+ .payload = {0x42}, /* receiver XDP program matches on this */
+};
+
+static int attach_tc_prog(struct bpf_tc_hook *hook, int fd)
+{
+ DECLARE_LIBBPF_OPTS(bpf_tc_opts, opts, .handle = 1, .priority = 1, .prog_fd = fd);
+ int ret;
+
+ ret = bpf_tc_hook_create(hook);
+ if (!ASSERT_OK(ret, "create tc hook"))
+ return ret;
+
+ ret = bpf_tc_attach(hook, &opts);
+ if (!ASSERT_OK(ret, "bpf_tc_attach")) {
+ bpf_tc_hook_destroy(hook);
+ return ret;
+ }
+
+ return 0;
+}
+
+#define NUM_PKTS 10000
+void test_xdp_do_redirect(void)
+{
+ int err, xdp_prog_fd, tc_prog_fd, ifindex_src, ifindex_dst;
+ char data[sizeof(pkt_udp) + sizeof(__u32)];
+ struct test_xdp_do_redirect *skel = NULL;
+ struct nstoken *nstoken = NULL;
+ struct bpf_link *link;
+
+ struct xdp_md ctx_in = { .data = sizeof(__u32),
+ .data_end = sizeof(data) };
+ DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts,
+ .data_in = &data,
+ .data_size_in = sizeof(data),
+ .ctx_in = &ctx_in,
+ .ctx_size_in = sizeof(ctx_in),
+ .flags = BPF_F_TEST_XDP_LIVE_FRAMES,
+ .repeat = NUM_PKTS,
+ .batch_size = 64,
+ );
+ DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook,
+ .attach_point = BPF_TC_INGRESS);
+
+ memcpy(&data[sizeof(__u32)], &pkt_udp, sizeof(pkt_udp));
+ *((__u32 *)data) = 0x42; /* metadata test value */
+
+ skel = test_xdp_do_redirect__open();
+ if (!ASSERT_OK_PTR(skel, "skel"))
+ return;
+
+ /* The XDP program we run with bpf_prog_run() will cycle through all
+ * three xmit (PASS/TX/REDIRECT) return codes starting from above, and
+ * ending up with PASS, so we should end up with two packets on the dst
+ * iface and NUM_PKTS-2 in the TC hook. We match the packets on the UDP
+ * payload.
+ */
+ SYS("ip netns add testns");
+ nstoken = open_netns("testns");
+ if (!ASSERT_OK_PTR(nstoken, "setns"))
+ goto out;
+
+ SYS("ip link add veth_src type veth peer name veth_dst");
+ SYS("ip link set dev veth_src address 00:11:22:33:44:55");
+ SYS("ip link set dev veth_dst address 66:77:88:99:aa:bb");
+ SYS("ip link set dev veth_src up");
+ SYS("ip link set dev veth_dst up");
+ SYS("ip addr add dev veth_src fc00::1/64");
+ SYS("ip addr add dev veth_dst fc00::2/64");
+ SYS("ip neigh add fc00::2 dev veth_src lladdr 66:77:88:99:aa:bb");
+
+ /* We enable forwarding in the test namespace because that will cause
+ * the packets that go through the kernel stack (with XDP_PASS) to be
+ * forwarded back out the same interface (because of the packet dst
+ * combined with the interface addresses). When this happens, the
+ * regular forwarding path will end up going through the same
+ * veth_xdp_xmit() call as the XDP_REDIRECT code, which can cause a
+ * deadlock if it happens on the same CPU. There's a local_bh_disable()
+ * in the test_run code to prevent this, but an earlier version of the
+ * code didn't have this, so we keep the test behaviour to make sure the
+ * bug doesn't resurface.
+ */
+ SYS("sysctl -qw net.ipv6.conf.all.forwarding=1");
+
+ ifindex_src = if_nametoindex("veth_src");
+ ifindex_dst = if_nametoindex("veth_dst");
+ if (!ASSERT_NEQ(ifindex_src, 0, "ifindex_src") ||
+ !ASSERT_NEQ(ifindex_dst, 0, "ifindex_dst"))
+ goto out;
+
+ memcpy(skel->rodata->expect_dst, &pkt_udp.eth.h_dest, ETH_ALEN);
+ skel->rodata->ifindex_out = ifindex_src; /* redirect back to the same iface */
+ skel->rodata->ifindex_in = ifindex_src;
+ ctx_in.ingress_ifindex = ifindex_src;
+ tc_hook.ifindex = ifindex_src;
+
+ if (!ASSERT_OK(test_xdp_do_redirect__load(skel), "load"))
+ goto out;
+
+ link = bpf_program__attach_xdp(skel->progs.xdp_count_pkts, ifindex_dst);
+ if (!ASSERT_OK_PTR(link, "prog_attach"))
+ goto out;
+ skel->links.xdp_count_pkts = link;
+
+ tc_prog_fd = bpf_program__fd(skel->progs.tc_count_pkts);
+ if (attach_tc_prog(&tc_hook, tc_prog_fd))
+ goto out;
+
+ xdp_prog_fd = bpf_program__fd(skel->progs.xdp_redirect);
+ err = bpf_prog_test_run_opts(xdp_prog_fd, &opts);
+ if (!ASSERT_OK(err, "prog_run"))
+ goto out_tc;
+
+ /* wait for the packets to be flushed */
+ kern_sync_rcu();
+
+ /* There will be one packet sent through XDP_REDIRECT and one through
+ * XDP_TX; these will show up on the XDP counting program, while the
+ * rest will be counted at the TC ingress hook (and the counting program
+ * resets the packet payload so they don't get counted twice even though
+ * they are re-xmited out the veth device
+ */
+ ASSERT_EQ(skel->bss->pkts_seen_xdp, 2, "pkt_count_xdp");
+ ASSERT_EQ(skel->bss->pkts_seen_zero, 2, "pkt_count_zero");
+ ASSERT_EQ(skel->bss->pkts_seen_tc, NUM_PKTS - 2, "pkt_count_tc");
+
+out_tc:
+ bpf_tc_hook_destroy(&tc_hook);
+out:
+ if (nstoken)
+ close_netns(nstoken);
+ system("ip netns del testns");
+ test_xdp_do_redirect__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
new file mode 100644
index 000000000000..77a123071940
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
@@ -0,0 +1,100 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+
+#define ETH_ALEN 6
+#define HDR_SZ (sizeof(struct ethhdr) + sizeof(struct ipv6hdr) + sizeof(struct udphdr))
+const volatile int ifindex_out;
+const volatile int ifindex_in;
+const volatile __u8 expect_dst[ETH_ALEN];
+volatile int pkts_seen_xdp = 0;
+volatile int pkts_seen_zero = 0;
+volatile int pkts_seen_tc = 0;
+volatile int retcode = XDP_REDIRECT;
+
+SEC("xdp")
+int xdp_redirect(struct xdp_md *xdp)
+{
+ __u32 *metadata = (void *)(long)xdp->data_meta;
+ void *data_end = (void *)(long)xdp->data_end;
+ void *data = (void *)(long)xdp->data;
+
+ __u8 *payload = data + HDR_SZ;
+ int ret = retcode;
+
+ if (payload + 1 > data_end)
+ return XDP_ABORTED;
+
+ if (xdp->ingress_ifindex != ifindex_in)
+ return XDP_ABORTED;
+
+ if (metadata + 1 > data)
+ return XDP_ABORTED;
+
+ if (*metadata != 0x42)
+ return XDP_ABORTED;
+
+ if (*payload == 0) {
+ *payload = 0x42;
+ pkts_seen_zero++;
+ }
+
+ if (bpf_xdp_adjust_meta(xdp, 4))
+ return XDP_ABORTED;
+
+ if (retcode > XDP_PASS)
+ retcode--;
+
+ if (ret == XDP_REDIRECT)
+ return bpf_redirect(ifindex_out, 0);
+
+ return ret;
+}
+
+static bool check_pkt(void *data, void *data_end)
+{
+ struct ipv6hdr *iph = data + sizeof(struct ethhdr);
+ __u8 *payload = data + HDR_SZ;
+
+ if (payload + 1 > data_end)
+ return false;
+
+ if (iph->nexthdr != IPPROTO_UDP || *payload != 0x42)
+ return false;
+
+ /* reset the payload so the same packet doesn't get counted twice when
+ * it cycles back through the kernel path and out the dst veth
+ */
+ *payload = 0;
+ return true;
+}
+
+SEC("xdp")
+int xdp_count_pkts(struct xdp_md *xdp)
+{
+ void *data = (void *)(long)xdp->data;
+ void *data_end = (void *)(long)xdp->data_end;
+
+ if (check_pkt(data, data_end))
+ pkts_seen_xdp++;
+
+ /* Return XDP_DROP to make sure the data page is recycled, like when it
+ * exits a physical NIC. Recycled pages will be counted in the
+ * pkts_seen_zero counter above.
+ */
+ return XDP_DROP;
+}
+
+SEC("tc")
+int tc_count_pkts(struct __sk_buff *skb)
+{
+ void *data = (void *)(long)skb->data;
+ void *data_end = (void *)(long)skb->data_end;
+
+ if (check_pkt(data, data_end))
+ pkts_seen_tc++;
+
+ return 0;
+}
+
+char _license[] SEC("license") = "GPL";