<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/net/openvswitch/actions.c, branch v7.1</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.1</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.1'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-29T18:21:23+00:00</updated>
<entry>
<title>net: convert remaining ipv6_stub users to direct function calls</title>
<updated>2026-03-29T18:21:23+00:00</updated>
<author>
<name>Fernando Fernandez Mancera</name>
<email>fmancera@suse.de</email>
</author>
<published>2026-03-25T12:08:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d76f6b170a10414a9c674b6db4bd0473b1b30050'/>
<id>urn:sha1:d76f6b170a10414a9c674b6db4bd0473b1b30050</id>
<content type='text'>
As IPv6 is built-in only, the ipv6_stub infrastructure is no longer
necessary.

Convert remaining ipv6_stub users to make direct function calls. The
fallback functions introduced previously will prevent linkage errors
when CONFIG_IPV6 is disabled.

Signed-off-by: Fernando Fernandez Mancera &lt;fmancera@suse.de&gt;
Tested-by: Ricardo B. Marlière &lt;rbm@suse.com&gt;
Link: https://patch.msgid.link/20260325120928.15848-9-fmancera@suse.de
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: openvswitch: remove never-working support for setting nsh fields</title>
<updated>2025-11-15T02:13:24+00:00</updated>
<author>
<name>Ilya Maximets</name>
<email>i.maximets@ovn.org</email>
</author>
<published>2025-11-12T11:14:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dfe28c4167a9259fc0c372d9f9473e1ac95cff67'/>
<id>urn:sha1:dfe28c4167a9259fc0c372d9f9473e1ac95cff67</id>
<content type='text'>
The validation of the set(nsh(...)) action is completely wrong.
It runs through the nsh_key_put_from_nlattr() function that is the
same function that validates NSH keys for the flow match and the
push_nsh() action.  However, the set(nsh(...)) has a very different
memory layout.  Nested attributes in there are doubled in size in
case of the masked set().  That makes proper validation impossible.

There is also confusion in the code between the 'masked' flag, which
says that the nested attributes are doubled in size containing both
the value and the mask, and the 'is_mask' flag, which says that the
value we're parsing is the mask.  This causes a kernel crash when trying
to write into the mask part of the match with SW_FLOW_KEY_PUT() during
validation, while validate_nsh() doesn't allocate any memory for it:

  BUG: kernel NULL pointer dereference, address: 0000000000000018
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 1c2383067 P4D 1c2383067 PUD 20b703067 PMD 0
  Oops: Oops: 0000 [#1] SMP NOPTI
  CPU: 8 UID: 0 Kdump: loaded Not tainted 6.17.0-rc4+ #107 PREEMPT(voluntary)
  RIP: 0010:nsh_key_put_from_nlattr+0x19d/0x610 [openvswitch]
  Call Trace:
   &lt;TASK&gt;
   validate_nsh+0x60/0x90 [openvswitch]
   validate_set.constprop.0+0x270/0x3c0 [openvswitch]
   __ovs_nla_copy_actions+0x477/0x860 [openvswitch]
   ovs_nla_copy_actions+0x8d/0x100 [openvswitch]
   ovs_packet_cmd_execute+0x1cc/0x310 [openvswitch]
   genl_family_rcv_msg_doit+0xdb/0x130
   genl_family_rcv_msg+0x14b/0x220
   genl_rcv_msg+0x47/0xa0
   netlink_rcv_skb+0x53/0x100
   genl_rcv+0x24/0x40
   netlink_unicast+0x280/0x3b0
   netlink_sendmsg+0x1f7/0x430
   ____sys_sendmsg+0x36b/0x3a0
   ___sys_sendmsg+0x87/0xd0
   __sys_sendmsg+0x6d/0xd0
   do_syscall_64+0x7b/0x2c0
   entry_SYSCALL_64_after_hwframe+0x76/0x7e

The third issue with this process is that while trying to convert
the non-masked set into masked one, validate_set() copies and doubles
the size of the OVS_KEY_ATTR_NSH as if it didn't have any nested
attributes.  It should be copying each nested attribute and doubling
them in size independently.  And the process must be properly reversed
during the conversion back from masked to a non-masked variant during
the flow dump.
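
As a rough illustration only (not part of the patch), the difference
between doubling the whole OVS_KEY_ATTR_NSH payload and doubling each
nested attribute can be sketched in userspace Python.  The attribute
types, payload sizes and the all-ones mask below are assumptions made
for the example, and nla_put() and to_masked() are hypothetical helpers,
not kernel functions:

```python
import struct

NLA_HDRLEN = 4  # struct nlattr: u16 nla_len followed by u16 nla_type


def nla_align(length):
    """Round up to the 4-byte Netlink attribute alignment."""
    return (length + 3) // 4 * 4


def nla_put(attr_type, payload):
    """Encode one attribute: header, payload, zero padding."""
    length = NLA_HDRLEN + len(payload)
    pad = b"\x00" * (nla_align(length) - length)
    return struct.pack("=HH", length, attr_type) + payload + pad


def to_masked(payload):
    """Masked form of a payload: the value followed by an all-ones mask."""
    return payload + b"\xff" * len(payload)


# Hypothetical nested attribute types and payload sizes.
nested = [(1, b"\x01" * 8), (2, b"\x02" * 16)]

plain = b"".join(nla_put(t, p) for t, p in nested)

# Flat doubling: treat the nested blob as one payload and append a mask.
flat_doubled = to_masked(plain)

# Per-attribute doubling: every nested attribute carries value and mask.
per_attr_doubled = b"".join(nla_put(t, to_masked(p)) for t, p in nested)

print(len(plain), len(flat_doubled), len(per_attr_doubled))
```

With flat doubling the first nested header still announces the original
length, so a parser walking the nested attributes never sees the masks
where it expects them.  With per-attribute doubling each header covers
its own value and mask.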

In the end, the only two outcomes of trying to use this action are
either validation failure or a kernel crash.  And if somehow someone
manages to install a flow with such an action, it will most definitely
not do what it is supposed to, since all the keys and the masks are
mixed up.

Fixing all the issues is a complex task as it requires re-writing
most of the validation code.

Given that and the fact that this functionality never worked since
introduction, let's just remove it altogether.  It's better to
re-introduce it later with a proper implementation instead of trying
to fix it in stable releases.

Fixes: b2d0f5d5dc53 ("openvswitch: enable NSH support")
Reported-by: Junvy Yang &lt;zhuque@tencent.com&gt;
Signed-off-by: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Acked-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Reviewed-by: Aaron Conole &lt;aconole@redhat.com&gt;
Link: https://patch.msgid.link/20251112112246.95064-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: openvswitch: allow providing upcall pid for the 'execute' command</title>
<updated>2025-07-07T21:30:39+00:00</updated>
<author>
<name>Ilya Maximets</name>
<email>i.maximets@ovn.org</email>
</author>
<published>2025-07-02T15:50:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=59f44c9ccc3bb68aa3b062b8e57ce0e1ee2fca75'/>
<id>urn:sha1:59f44c9ccc3bb68aa3b062b8e57ce0e1ee2fca75</id>
<content type='text'>
When a packet enters the OVS datapath and there is no flow to handle it,
the packet goes to userspace through a MISS upcall.  With the per-CPU
upcall dispatch mechanism, we're using the current CPU id to select the
Netlink PID on which to send this packet.  This allows us to send
packets from the same traffic flow through the same handler.

The handler will process the packet, install required flow into the
kernel and re-inject the original packet via OVS_PACKET_CMD_EXECUTE.

While handling OVS_PACKET_CMD_EXECUTE, however, we may hit a
recirculation action that will pass the (likely modified) packet
through the flow lookup again.  And if the flow is not found, the
packet will be sent to userspace again through another MISS upcall.

However, the handler thread in userspace is likely running on a
different CPU core, and the OVS_PACKET_CMD_EXECUTE request is handled
in the syscall context of that thread.  So, when the time comes to
send the packet through another upcall, the per-CPU dispatch will
choose a different Netlink PID, and this packet will end up processed
by a different handler thread on a different CPU.

The process continues as long as there are new recirculations: each
time, the packet goes to a different handler thread before it is sent
out of the OVS datapath to the destination port.  In real setups the
number of recirculations can go up to 4 or 5, sometimes more.

There is always a chance to re-order packets while processing upcalls,
because userspace will first install the flow and then re-inject the
original packet.  So, there is a race window when the flow is already
installed and the second packet can match it and be forwarded to the
destination before the first packet is re-injected.  But the fact that
packets are going through multiple upcalls handled by different
userspace threads makes the reordering noticeably more likely, because
we not only have a race between the kernel and a userspace handler
(which is hard to avoid), but also between multiple userspace handlers.

For example, let's assume that 10 packets got enqueued through a MISS
upcall for handler-1.  It will start processing them, install the flow
into the kernel and start re-injecting packets back, from where they
will go through another MISS to handler-2.  Handler-2 will install the
flow into the kernel and start re-injecting the packets.  While
handler-1 continues to re-inject the last of the 10 packets, they will
hit the flow installed by handler-2 and be forwarded without going to
handler-2, while handler-2 still re-injects the first of these 10
packets.  Given multiple recirculations and misses, these 10 packets
may end up completely mixed up on the output from the datapath.

Let's allow userspace to specify on which Netlink PID the packets
should be upcalled while processing OVS_PACKET_CMD_EXECUTE.
This makes it possible to ensure that all the packets are processed
by the same handler thread in the userspace even with them being
upcalled multiple times in the process.  Packets will remain in order
since they will be enqueued to the same socket and re-injected in the
same order.  This doesn't eliminate the re-ordering described above,
since we still have a race between the kernel and the userspace thread,
but it does eliminate races between multiple userspace threads.
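
A toy model of the two dispatch modes, as a sketch under stated
assumptions rather than the kernel implementation: the PID values, the
handler-to-CPU mapping and the simple modulo selection are all made up
for the example, and per_cpu_pid(), upcall_pid() and trace() are
hypothetical names.

```python
def per_cpu_pid(pids, cpu):
    """Per-CPU dispatch: pick the Netlink PID from the current CPU id."""
    return pids[cpu % len(pids)]


def upcall_pid(pids, cpu, requested_pid=None):
    """Prefer the PID supplied with the execute request, if there is one."""
    if requested_pid is not None:
        return requested_pid
    return per_cpu_pid(pids, cpu)


def trace(pids, handler_cpu, first_cpu, misses, pin):
    """Return the PIDs that see one packet across `misses` MISS upcalls."""
    seen = []
    pid = per_cpu_pid(pids, first_cpu)
    for _ in range(misses):
        seen.append(pid)
        # The execute request runs in the syscall context of the handler
        # thread, so the next miss happens on that thread's CPU.
        cpu = handler_cpu[pid]
        pid = upcall_pid(pids, cpu, pid if pin else None)
    return seen


pids = [100, 101, 102, 103]                      # assumed: one PID per CPU
handler_cpu = {100: 2, 101: 3, 102: 1, 103: 0}   # assumed thread placement

print(trace(pids, handler_cpu, 0, 3, pin=False))  # [100, 102, 101]
print(trace(pids, handler_cpu, 0, 3, pin=True))   # [100, 100, 100]
```

Without a requested PID the packet visits a different handler on every
miss.  With the PID pinned to the socket of the original upcall it stays
on one handler, which is the property the change relies on.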

Userspace knows the PID of the socket on which the original upcall is
received, so there is no need to send it up from the kernel.

The solution requires storing the value somewhere for the duration of
the packet processing.  There are two potential places for this: our skb
extension or the per-CPU storage.  It's not clear which is better,
so just follow the currently used scheme of storing this kind of data
along with the skb.  We still have a decent amount of space in the cb.

Signed-off-by: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Acked-by: Flavio Leitner &lt;fbl@sysclose.org&gt;
Acked-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Acked-by: Aaron Conole &lt;aconole@redhat.com&gt;
Link: https://patch.msgid.link/20250702155043.2331772-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>openvswitch: Allocate struct ovs_pcpu_storage dynamically</title>
<updated>2025-06-17T12:47:46+00:00</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2025-06-13T12:36:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7b4ac12cc929e281cf7edc22203e0533790ebc2b'/>
<id>urn:sha1:7b4ac12cc929e281cf7edc22203e0533790ebc2b</id>
<content type='text'>
PERCPU_MODULE_RESERVE defines the maximum size that can be used for the
per-CPU data of modules. This is 8KiB.

Commit 035fcdc4d240c ("openvswitch: Merge three per-CPU structures into
one") restructured the per-CPU memory allocation for the module and
moved the separate alloc_percpu() invocations at module init time to a
static per-CPU variable which is allocated by the module loader.

The size of the per-CPU data section for openvswitch is 6488 bytes which
is ~80% of the available per-CPU memory. Together with a few other
modules it is easy to exhaust the available 8KiB of memory.
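
The share quoted above can be checked with a few lines of Python. Both
numbers are taken from this message; the variable names are only
illustrative:

```python
PERCPU_MODULE_RESERVE = 8 * 1024  # 8 KiB, as stated in the message
OVS_PCPU_SECTION = 6488           # bytes, as stated in the message

used = OVS_PCPU_SECTION / PERCPU_MODULE_RESERVE
left = PERCPU_MODULE_RESERVE - OVS_PCPU_SECTION
print(f"{used:.1%} used, {left} bytes left")
```

That leaves 1704 bytes of the reserve for the per-CPU data of every
other loaded module.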

Allocate ovs_pcpu_storage dynamically at module init time.

Reported-by: Gal Pressman &lt;gal@nvidia.com&gt;
Closes: https://lore.kernel.org/all/c401e017-f8db-4f57-a1cd-89beb979a277@nvidia.com
Fixes: 035fcdc4d240c ("openvswitch: Merge three per-CPU structures into one")
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Aaron Conole &lt;aconole@redhat.com&gt;
Link: https://patch.msgid.link/20250613123629.-XSoQTCu@linutronix.de
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;

</content>
</entry>
<entry>
<title>openvswitch: Move ovs_frag_data_storage into the struct ovs_pcpu_storage</title>
<updated>2025-05-15T13:23:31+00:00</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2025-05-12T09:27:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3af4cdd67f32529c177b885d4ca491710e961928'/>
<id>urn:sha1:3af4cdd67f32529c177b885d4ca491710e961928</id>
<content type='text'>
ovs_frag_data_storage is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
this data structure requires explicit locking.

Move ovs_frag_data_storage into the struct ovs_pcpu_storage which already
provides locking for the structure.

Cc: Aaron Conole &lt;aconole@redhat.com&gt;
Cc: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Cc: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Cc: dev@openvswitch.org
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Aaron Conole &lt;aconole@redhat.com&gt;
Link: https://patch.msgid.link/20250512092736.229935-10-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;

</content>
</entry>
<entry>
<title>openvswitch: Use nested-BH locking for ovs_pcpu_storage</title>
<updated>2025-05-15T13:23:31+00:00</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2025-05-12T09:27:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=672318331b44753ab7bd8545558939c38b4c1132'/>
<id>urn:sha1:672318331b44753ab7bd8545558939c38b4c1132</id>
<content type='text'>
ovs_pcpu_storage is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
this data structure requires explicit locking.
The data structure can be referenced recursively and there is a recursion
counter to avoid too many recursions.

Add a local_lock_t to the data structure and use
local_lock_nested_bh() for locking. Add an owner of the struct which is
the current task and acquire the lock only if the structure is not owned
by the current task.
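
The owner check can be pictured with a small userspace analogy. This is
only a sketch: threading.Lock stands in for the local_lock_t, the thread
identifier stands in for the current task, and PcpuStorage and
run_nested() are hypothetical names that do not exist in the kernel.

```python
import threading


class PcpuStorage:
    """Toy stand-in for a per-CPU structure guarded by an owner check."""

    def __init__(self):
        self.lock = threading.Lock()  # plays the role of the local_lock_t
        self.owner = None             # task currently holding the lock
        self.level = 0                # recursion counter


def run_nested(storage, depth, log):
    """Enter the storage `depth` times from the same task."""
    me = threading.get_ident()
    # Take the lock only if this task does not already own the structure.
    took_lock = storage.owner != me
    if took_lock:
        storage.lock.acquire()
        storage.owner = me
    storage.level += 1
    log.append((storage.level, took_lock))
    if depth != 1:
        run_nested(storage, depth - 1, log)
    storage.level -= 1
    if took_lock:
        storage.owner = None
        storage.lock.release()


storage = PcpuStorage()
log = []
run_nested(storage, 3, log)
print(log)
```

Only the outermost entry takes the lock. The nested entries see
themselves as the owner and skip it, which is what keeps a non-recursive
lock from deadlocking on recursion.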

Cc: Aaron Conole &lt;aconole@redhat.com&gt;
Cc: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Cc: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Cc: dev@openvswitch.org
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Aaron Conole &lt;aconole@redhat.com&gt;
Link: https://patch.msgid.link/20250512092736.229935-9-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;

</content>
</entry>
<entry>
<title>openvswitch: Merge three per-CPU structures into one</title>
<updated>2025-05-15T13:23:31+00:00</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2025-05-12T09:27:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=035fcdc4d240c873c89b76b752dd9921bc88c1ba'/>
<id>urn:sha1:035fcdc4d240c873c89b76b752dd9921bc88c1ba</id>
<content type='text'>
exec_actions_level is a per-CPU integer allocated at compile time.
action_fifos and flow_keys are per-CPU pointers and have their data
allocated at module init time.
There is no gain in splitting them: once the module is allocated, the
structures are allocated.

Merge the three per-CPU variables into ovs_pcpu_storage, adapt callers.

Cc: Aaron Conole &lt;aconole@redhat.com&gt;
Cc: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Cc: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Cc: dev@openvswitch.org
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Aaron Conole &lt;aconole@redhat.com&gt;
Link: https://patch.msgid.link/20250512092736.229935-8-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;

</content>
</entry>
<entry>
<title>openvswitch: Fix unsafe attribute parsing in output_userspace()</title>
<updated>2025-05-07T23:51:02+00:00</updated>
<author>
<name>Eelco Chaudron</name>
<email>echaudro@redhat.com</email>
</author>
<published>2025-05-06T14:28:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6beb6835c1fbb3f676aebb51a5fee6b77fed9308'/>
<id>urn:sha1:6beb6835c1fbb3f676aebb51a5fee6b77fed9308</id>
<content type='text'>
This patch replaces the manual Netlink attribute iteration in
output_userspace() with nla_for_each_nested(), which ensures that only
well-formed attributes are processed.
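
For illustration, the kind of bounds checking a safe attribute walk
needs can be sketched in userspace Python. This is not the kernel macro:
iter_attrs() is a hypothetical helper, and the attribute types and
payloads are made up for the example.

```python
import struct

NLA_HDRLEN = 4  # struct nlattr: u16 nla_len followed by u16 nla_type


def nla_align(length):
    """Round up to the 4-byte Netlink attribute alignment."""
    return (length + 3) // 4 * 4


def iter_attrs(buf):
    """Yield (type, payload) for each well-formed attribute in buf."""
    offset = 0
    while True:
        remaining = len(buf) - offset
        # Stop when not even a full header is left.
        if remaining not in range(NLA_HDRLEN, len(buf) + 1):
            return
        nla_len, nla_type = struct.unpack_from("=HH", buf, offset)
        # Stop on a length shorter than the header or past the buffer end.
        if nla_len not in range(NLA_HDRLEN, remaining + 1):
            return
        yield nla_type, buf[offset + NLA_HDRLEN:offset + nla_len]
        offset += nla_align(nla_len)


good = struct.pack("=HH", 8, 1) + b"abcd" + struct.pack("=HH", 6, 2) + b"xy\x00\x00"
bad = struct.pack("=HH", 8, 1) + b"abcd" + struct.pack("=HH", 64, 2) + b"xy\x00\x00"

print(list(iter_attrs(good)))
print(list(iter_attrs(bad)))
```

The second buffer announces a 64-byte attribute in an 8-byte tail, so
the walk stops after the first attribute instead of reading past the
end.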

Fixes: ccb1352e76cf ("net: Add Open vSwitch kernel components.")
Signed-off-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Acked-by: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Acked-by: Aaron Conole &lt;aconole@redhat.com&gt;
Link: https://patch.msgid.link/0bd65949df61591d9171c0dc13e42cea8941da10.1746541734.git.echaudro@redhat.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>tunnels: Accept PACKET_HOST in skb_tunnel_check_pmtu().</title>
<updated>2025-04-03T00:22:39+00:00</updated>
<author>
<name>Guillaume Nault</name>
<email>gnault@redhat.com</email>
</author>
<published>2025-03-29T00:33:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8930424777e43257f5bf6f0f0f53defd0d30415c'/>
<id>urn:sha1:8930424777e43257f5bf6f0f0f53defd0d30415c</id>
<content type='text'>
Because skb_tunnel_check_pmtu() doesn't handle PACKET_HOST packets,
commit 30a92c9e3d6b ("openvswitch: Set the skbuff pkt_type for proper
pmtud support.") forced skb-&gt;pkt_type to PACKET_OUTGOING for
openvswitch packets that are sent using the OVS_ACTION_ATTR_OUTPUT
action. This allowed such packets to invoke the
iptunnel_pmtud_check_icmp() or iptunnel_pmtud_check_icmpv6() helpers
and thus trigger PMTU update on the input device.

However, this also broke other parts of PMTU discovery. Since these
packets don't have the PACKET_HOST type anymore, they won't trigger the
sending of ICMP Fragmentation Needed or Packet Too Big messages to
remote hosts when oversized (see the skb_in-&gt;pkt_type condition in
__icmp_send() for example).

These two skb-&gt;pkt_type checks are therefore incompatible as one
requires skb-&gt;pkt_type to be PACKET_HOST, while the other requires it
to be anything but PACKET_HOST.

It makes sense to not trigger ICMP messages for non-PACKET_HOST packets
as these messages should be generated only for incoming l2-unicast
packets. However there doesn't seem to be any reason for
skb_tunnel_check_pmtu() to ignore PACKET_HOST packets.

Allow both cases to work by allowing skb_tunnel_check_pmtu() to work on
PACKET_HOST packets and not overriding skb-&gt;pkt_type in openvswitch
anymore.

Fixes: 30a92c9e3d6b ("openvswitch: Set the skbuff pkt_type for proper pmtud support.")
Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets")
Signed-off-by: Guillaume Nault &lt;gnault@redhat.com&gt;
Reviewed-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: Aaron Conole &lt;aconole@redhat.com&gt;
Tested-by: Aaron Conole &lt;aconole@redhat.com&gt;
Link: https://patch.msgid.link/eac941652b86fddf8909df9b3bf0d97bc9444793.1743208264.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>openvswitch: fix lockup on tx to unregistering netdev with carrier</title>
<updated>2025-01-11T02:20:49+00:00</updated>
<author>
<name>Ilya Maximets</name>
<email>i.maximets@ovn.org</email>
</author>
<published>2025-01-09T12:21:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=47e55e4b410f7d552e43011baa5be1aab4093990'/>
<id>urn:sha1:47e55e4b410f7d552e43011baa5be1aab4093990</id>
<content type='text'>
The commit in the Fixes tag attempted to fix the issue in the following
sequence of calls:

    do_output
    -&gt; ovs_vport_send
       -&gt; dev_queue_xmit
          -&gt; __dev_queue_xmit
             -&gt; netdev_core_pick_tx
                -&gt; skb_tx_hash

When the device is unregistering, the 'dev-&gt;real_num_tx_queues' goes to
zero and the 'while (unlikely(hash &gt;= qcount))' loop inside the
'skb_tx_hash' becomes infinite, locking up the core forever.
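
A minimal userspace sketch of why that loop cannot terminate with a
queue count of zero.  reduce_hash() is a hypothetical stand-in written
for this example, and the step limit exists only so the demonstration
returns:

```python
def reduce_hash(hash_value, qcount, max_steps=1000):
    """Mimic the subtraction loop, giving up after max_steps iterations."""
    steps = 0
    while hash_value not in range(qcount):  # i.e. hash is not below qcount
        if steps == max_steps:
            return None                     # the real loop would keep spinning
        hash_value -= qcount                # subtracting zero changes nothing
        steps += 1
    return hash_value


print(reduce_hash(5, 4))  # 1
print(reduce_hash(5, 0))  # None
```

With a non-zero queue count the value drops below it after a few
subtractions.  With zero queues nothing ever changes, so the condition
stays true forever.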

But unfortunately, checking just the carrier status is not enough to
fix the issue, because some devices may still be in unregistering
state while reporting carrier status OK.

One example of such a device is net/dummy.  It sets carrier ON
on start, but it doesn't implement .ndo_stop to set the carrier off.
And it makes sense, because dummy doesn't really have a carrier.
Therefore, while this device is unregistering, it's still easy to hit
the infinite loop in the skb_tx_hash() from the OVS datapath.  There
might be other drivers that do the same, but dummy by itself is
important for the OVS ecosystem, because it is frequently used as a
packet sink for tcpdump while debugging OVS deployments.  And when the
issue is hit, the only way to recover is to reboot.

Fix that by also checking if the device is running.  The running
state is handled by the net core during unregistering, so it covers
unregistering case better, and we don't really need to send packets
to devices that are not running anyway.

While only checking the running state might be enough, the carrier
check is preserved.  The running and the carrier states seem disjoint
throughout the code and different drivers.  And other core functions
like __dev_direct_xmit() check both before attempting to transmit
a packet.  So, it seems safer to check both flags in OVS as well.

Fixes: 066b86787fa3 ("net: openvswitch: fix race on port output")
Reported-by: Friedrich Weber &lt;f.weber@proxmox.com&gt;
Closes: https://mail.openvswitch.org/pipermail/ovs-discuss/2025-January/053423.html
Signed-off-by: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Tested-by: Friedrich Weber &lt;f.weber@proxmox.com&gt;
Reviewed-by: Aaron Conole &lt;aconole@redhat.com&gt;
Link: https://patch.msgid.link/20250109122225.4034688-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
</feed>
