<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/net/netkit.c, branch v6.18.21</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-07-11T18:01:09+00:00</updated>
<entry>
<title>netkit: Remove location field in netkit_link</title>
<updated>2025-07-11T18:01:09+00:00</updated>
<author>
<name>Tao Chen</name>
<email>chen.dylane@linux.dev</email>
</author>
<published>2025-07-10T03:20:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=601a3956fead680691cdf0b0108bdb27eea5108b'/>
<id>urn:sha1:601a3956fead680691cdf0b0108bdb27eea5108b</id>
<content type='text'>
Use attach_type in bpf_link to replace the location field, and
remove location field in netkit_link.

Signed-off-by: Tao Chen &lt;chen.dylane@linux.dev&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20250710032038.888700-8-chen.dylane@linux.dev
</content>
</entry>
<entry>
<title>bpf: Add attach_type field to bpf_link</title>
<updated>2025-07-11T17:51:55+00:00</updated>
<author>
<name>Tao Chen</name>
<email>chen.dylane@linux.dev</email>
</author>
<published>2025-07-10T03:20:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b725441f02c2b31c04a95d0e9ca5420fa029a767'/>
<id>urn:sha1:b725441f02c2b31c04a95d0e9ca5420fa029a767</id>
<content type='text'>
Attach_type will be set when a link is created by user. It is better to
record attach_type in bpf_link generically and have it available
universally for all link types. So add the attach_type field in bpf_link
and move the sleepable field to avoid unnecessary gap padding.

Signed-off-by: Tao Chen &lt;chen.dylane@linux.dev&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20250710032038.888700-2-chen.dylane@linux.dev
</content>
</entry>
<entry>
<title>netkit: Remove double invocation to clear ipvs property flag</title>
<updated>2025-02-28T00:53:05+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2025-02-25T21:29:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=047e059cf21210f633bb538799107af10650a9c7'/>
<id>urn:sha1:047e059cf21210f633bb538799107af10650a9c7</id>
<content type='text'>
With ipvs_reset() now done unconditionally in skb_scrub_packet()
we would then call the former twice netkit_prep_forward(). Thus
remove the now unnecessary explicit call.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Link: https://patch.msgid.link/20250225212927.69271-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: Use link/peer netns in newlink() of rtnl_link_ops</title>
<updated>2025-02-21T23:28:02+00:00</updated>
<author>
<name>Xiao Liang</name>
<email>shaw.leon@gmail.com</email>
</author>
<published>2025-02-19T12:50:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cf517ac16ad96f3953d65ea198c0b310a1ffa14f'/>
<id>urn:sha1:cf517ac16ad96f3953d65ea198c0b310a1ffa14f</id>
<content type='text'>
Add two helper functions - rtnl_newlink_link_net() and
rtnl_newlink_peer_net() for netns fallback logic. Peer netns falls back
to link netns, and link netns falls back to source netns.

Convert the use of params-&gt;net in netdevice drivers to one of the helper
functions for clarity.

Signed-off-by: Xiao Liang &lt;shaw.leon@gmail.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Link: https://patch.msgid.link/20250219125039.18024-4-shaw.leon@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>rtnetlink: Pack newlink() params into struct</title>
<updated>2025-02-21T23:28:02+00:00</updated>
<author>
<name>Xiao Liang</name>
<email>shaw.leon@gmail.com</email>
</author>
<published>2025-02-19T12:50:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=69c7be1b903fca2835e80ec506bd1d75ce84fb4d'/>
<id>urn:sha1:69c7be1b903fca2835e80ec506bd1d75ce84fb4d</id>
<content type='text'>
There are 4 net namespaces involved when creating links:

 - source netns - where the netlink socket resides,
 - target netns - where to put the device being created,
 - link netns - netns associated with the device (backend),
 - peer netns - netns of peer device.

Currently, two nets are passed to newlink() callback - "src_net"
parameter and "dev_net" (implicitly in net_device). They are set as
follows, depending on netlink attributes in the request.

 +------------+-------------------+---------+---------+
 | peer netns | IFLA_LINK_NETNSID | src_net | dev_net |
 +------------+-------------------+---------+---------+
 |            | absent            | source  | target  |
 | absent     +-------------------+---------+---------+
 |            | present           | link    | link    |
 +------------+-------------------+---------+---------+
 |            | absent            | peer    | target  |
 | present    +-------------------+---------+---------+
 |            | present           | peer    | link    |
 +------------+-------------------+---------+---------+

When IFLA_LINK_NETNSID is present, the device is created in link netns
first and then moved to target netns. This has some side effects,
including extra ifindex allocation, ifname validation and link events.
These could be avoided if we create it in target netns from
the beginning.

On the other hand, the meaning of src_net parameter is ambiguous. It
varies depending on how parameters are passed. It is the effective
link (or peer netns) by design, but some drivers ignore it and use
dev_net instead.

To provide more netns context for drivers, this patch packs existing
newlink() parameters, along with the source netns, link netns and peer
netns, into a struct. The old "src_net" is renamed to "net" to avoid
confusion with real source netns, and will be deprecated later. The use
of src_net are converted to params-&gt;net trivially.

Signed-off-by: Xiao Liang &lt;shaw.leon@gmail.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Link: https://patch.msgid.link/20250219125039.18024-3-shaw.leon@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>netkit: Allow for configuring needed_{head,tail}room</title>
<updated>2025-01-06T08:48:49+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2024-12-20T23:46:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b9ed315d3c4c0c294a4348edb6874d489bac47fa'/>
<id>urn:sha1:b9ed315d3c4c0c294a4348edb6874d489bac47fa</id>
<content type='text'>
Allow the user to configure needed_{head,tail}room for both netkit
devices. The idea is similar to 163e529200af ("veth: implement
ndo_set_rx_headroom") with the difference that the two parameters
can be specified upon device creation. By default the current behavior
stays as is which is needed_{head,tail}room is 0.

In case of Cilium, for example, the netkit devices are not enslaved
into a bridge or openvswitch device (rather, BPF-based redirection
is used out of tcx), and as such these parameters are not propagated
into the Pod's netns via peer device.

Given Cilium can run in vxlan/geneve tunneling mode (needed_headroom)
and/or be used in combination with WireGuard (needed_{head,tail}room),
allow the Cilium CNI plugin to specify these two upon netkit device
creation.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Acked-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Link: https://lore.kernel.org/bpf/20241220234658.490686-1-daniel@iogearbox.net
</content>
</entry>
<entry>
<title>rtnetlink: fix double call of rtnl_link_get_net_ifla()</title>
<updated>2024-12-03T10:29:29+00:00</updated>
<author>
<name>Cong Wang</name>
<email>cong.wang@bytedance.com</email>
</author>
<published>2024-11-29T21:25:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=48327566769a6ff2e873b6bf075392bd756625ca'/>
<id>urn:sha1:48327566769a6ff2e873b6bf075392bd756625ca</id>
<content type='text'>
Currently rtnl_link_get_net_ifla() gets called twice when we create
peer devices, once in rtnl_add_peer_net() and once in each -&gt;newlink()
implementation.

This looks safer, however, it leads to a classic Time-of-Check to
Time-of-Use (TOCTOU) bug since IFLA_NET_NS_PID is very dynamic. And
because of the lack of checking error pointer of the second call, it
also leads to a kernel crash as reported by syzbot.

Fix this by getting rid of the second call, which already becomes
redudant after Kuniyuki's work. We have to propagate the result of the
first rtnl_link_get_net_ifla() down to each -&gt;newlink().

Reported-by: syzbot+21ba4d5adff0b6a7cfc6@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=21ba4d5adff0b6a7cfc6
Fixes: 0eb87b02a705 ("veth: Set VETH_INFO_PEER to veth_link_ops.peer_type.")
Fixes: 6b84e558e95d ("vxcan: Set VXCAN_INFO_PEER to vxcan_link_ops.peer_type.")
Fixes: fefd5d082172 ("netkit: Set IFLA_NETKIT_PEER_INFO to netkit_link_ops.peer_type.")
Cc: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Signed-off-by: Cong Wang &lt;cong.wang@bytedance.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Link: https://patch.msgid.link/20241129212519.825567-1-xiyou.wangcong@gmail.com
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;

</content>
</entry>
<entry>
<title>netkit: Set IFLA_NETKIT_PEER_INFO to netkit_link_ops.peer_type.</title>
<updated>2024-11-12T01:26:52+00:00</updated>
<author>
<name>Kuniyuki Iwashima</name>
<email>kuniyu@amazon.com</email>
</author>
<published>2024-11-08T00:48:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fefd5d08217284a8894502eb1148ff88bc8510c0'/>
<id>urn:sha1:fefd5d08217284a8894502eb1148ff88bc8510c0</id>
<content type='text'>
For per-netns RTNL, we need to prefetch the peer device's netns.

Let's set rtnl_link_ops.peer_type and accordingly remove duplicated
validation in -&gt;newlink().

Signed-off-by: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Link: https://patch.msgid.link/20241108004823.29419-9-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>netkit: Simplify netkit mode over to use NLA_POLICY_MAX</title>
<updated>2024-10-08T00:12:37+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2024-10-04T10:13:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0ebe224ffce83b3c2b331295d473296220d9fc36'/>
<id>urn:sha1:0ebe224ffce83b3c2b331295d473296220d9fc36</id>
<content type='text'>
Jakub suggested to rely on netlink policy validation via NLA_POLICY_MAX()
instead of open-coding it. netkit_check_mode() is a candidate which can
be simplified through this as well aside from the netkit scrubbing one.

Suggested-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Acked-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Link: https://lore.kernel.org/r/20241004101335.117711-2-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau &lt;martin.lau@kernel.org&gt;
</content>
</entry>
<entry>
<title>netkit: Add option for scrubbing skb meta data</title>
<updated>2024-10-08T00:12:37+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2024-10-04T10:13:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=83134ef4609388f6b9ca31a384f531155196c2a7'/>
<id>urn:sha1:83134ef4609388f6b9ca31a384f531155196c2a7</id>
<content type='text'>
Jordan reported that when running Cilium with netkit in per-endpoint-routes
mode, network policy misclassifies traffic. In this direct routing mode
of Cilium which is used in case of GKE/EKS/AKS, the Pod's BPF program to
enforce policy sits on the netkit primary device's egress side.

The issue here is that in case of netkit's netkit_prep_forward(), it will
clear meta data such as skb-&gt;mark and skb-&gt;priority before executing the
BPF program. Thus, identity data stored in there from earlier BPF programs
(e.g. from tcx ingress on the physical device) gets cleared instead of
being made available for the primary's program to process. While for traffic
egressing the Pod via the peer device this might be desired, this is
different for the primary one where compared to tcx egress on the host
veth this information would be available.

To address this, add a new parameter for the device orchestration to
allow control of skb-&gt;mark and skb-&gt;priority scrubbing, to make the two
accessible from BPF (and eventually leave it up to the program to scrub).
By default, the current behavior is retained. For netkit peer this also
enables the use case where applications could cooperate/signal intent to
the BPF program.

Note that struct netkit has a 4 byte hole between policy and bundle which
is used here, in other words, struct netkit's first cacheline content used
in fast-path does not get moved around.

Fixes: 35dfaad7188c ("netkit, bpf: Add bpf programmable net device")
Reported-by: Jordan Rife &lt;jrife@google.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Link: https://github.com/cilium/cilium/issues/34042
Acked-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Acked-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Link: https://lore.kernel.org/r/20241004101335.117711-1-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau &lt;martin.lau@kernel.org&gt;
</content>
</entry>
</feed>
