<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/net/ip_fib.h, branch v6.12.80</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-13T16:20:43+00:00</updated>
<entry>
<title>net: ipv4: fix ARM64 alignment fault in multipath hash seed</title>
<updated>2026-03-13T16:20:43+00:00</updated>
<author>
<name>Yung Chih Su</name>
<email>yuuchihsu@gmail.com</email>
</author>
<published>2026-03-02T06:02:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4bdc94d45d5459f0149085dfc1efe733c8e14f11'/>
<id>urn:sha1:4bdc94d45d5459f0149085dfc1efe733c8e14f11</id>
<content type='text'>
[ Upstream commit 4ee7fa6cf78ff26d783d39e2949d14c4c1cd5e7f ]

`struct sysctl_fib_multipath_hash_seed` contains two u32 fields
(user_seed and mp_seed), making it an 8-byte structure with a 4-byte
alignment requirement.

In `fib_multipath_hash_from_keys()`, the code evaluates the entire
struct atomically via `READ_ONCE()`:

    mp_seed = READ_ONCE(net-&gt;ipv4.sysctl_fib_multipath_hash_seed).mp_seed;

While this silently works on GCC by falling back to unaligned regular
loads which the ARM64 kernel tolerates, it causes a fatal kernel panic
when compiled with Clang and LTO enabled.

Commit e35123d83ee3 ("arm64: lto: Strengthen READ_ONCE() to acquire
when CONFIG_LTO=y") strengthens `READ_ONCE()` to use Load-Acquire
instructions (`ldar` / `ldapr`) to prevent compiler reordering bugs
under Clang LTO. Since the macro evaluates the full 8-byte struct,
Clang emits a 64-bit `ldar` instruction. ARM64 architecture strictly
requires `ldar` to be naturally aligned, thus executing it on a 4-byte
aligned address triggers a strict Alignment Fault (FSC = 0x21).

Fix the read side by moving the `READ_ONCE()` directly to the `u32`
member, which emits a safe 32-bit `ldar Wn`.

Furthermore, Eric Dumazet pointed out that `WRITE_ONCE()` on the entire
struct in `proc_fib_multipath_hash_set_seed()` is also flawed. Analysis
shows that Clang splits this 8-byte write into two separate 32-bit
`str` instructions. While this avoids an alignment fault, it destroys
atomicity and exposes a tear-write vulnerability. Fix this by
explicitly splitting the write into two 32-bit `WRITE_ONCE()`
operations.

Finally, add the missing `READ_ONCE()` when reading `user_seed` in
`proc_fib_multipath_hash_seed()` to ensure proper pairing and
concurrency safety.

Fixes: 4ee2a8cace3f ("net: ipv4: Add a sysctl to set multipath hash seed")
Signed-off-by: Yung Chih Su &lt;yuuchihsu@gmail.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://patch.msgid.link/20260302060247.7066-1-yuuchihsu@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>ipv4: Fix user space build failure due to header change</title>
<updated>2024-09-04T23:40:33+00:00</updated>
<author>
<name>Ido Schimmel</name>
<email>idosch@nvidia.com</email>
</author>
<published>2024-09-03T13:35:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1083d733eb2624837753046924a9248555a1bfbf'/>
<id>urn:sha1:1083d733eb2624837753046924a9248555a1bfbf</id>
<content type='text'>
RT_TOS() from include/uapi/linux/in_route.h is defined using
IPTOS_TOS_MASK from include/uapi/linux/ip.h. This is problematic for
files such as include/net/ip_fib.h that want to use RT_TOS() as without
including both header files kernel compilation fails:

In file included from ./include/net/ip_fib.h:25,
                 from ./include/net/route.h:27,
                 from ./include/net/lwtunnel.h:9,
                 from net/core/dst.c:24:
./include/net/ip_fib.h: In function ‘fib_dscp_masked_match’:
./include/uapi/linux/in_route.h:31:32: error: ‘IPTOS_TOS_MASK’ undeclared (first use in this function)
   31 | #define RT_TOS(tos)     ((tos)&amp;IPTOS_TOS_MASK)
      |                                ^~~~~~~~~~~~~~
./include/net/ip_fib.h:440:45: note: in expansion of macro ‘RT_TOS’
  440 |         return dscp == inet_dsfield_to_dscp(RT_TOS(fl4-&gt;flowi4_tos));

Therefore, cited commit changed linux/in_route.h to include linux/ip.h.
However, as reported by David, this breaks iproute2 compilation due
overlapping definitions between linux/ip.h and
/usr/include/netinet/ip.h:

In file included from ../include/uapi/linux/in_route.h:5,
                 from iproute.c:19:
../include/uapi/linux/ip.h:25:9: warning: "IPTOS_TOS" redefined
   25 | #define IPTOS_TOS(tos)          ((tos)&amp;IPTOS_TOS_MASK)
      |         ^~~~~~~~~
In file included from iproute.c:17:
/usr/include/netinet/ip.h:222:9: note: this is the location of the previous definition
  222 | #define IPTOS_TOS(tos)          ((tos) &amp; IPTOS_TOS_MASK)

Fix by changing include/net/ip_fib.h to include linux/ip.h. Note that
usage of RT_TOS() should not spread further in the kernel due to recent
work in this area.

Fixes: 1fa3314c14c6 ("ipv4: Centralize TOS matching")
Reported-by: David Ahern &lt;dsahern@kernel.org&gt;
Closes: https://lore.kernel.org/netdev/2f5146ff-507d-4cab-a195-b28c0c9e654e@kernel.org/
Signed-off-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Reviewed-by: Guillaume Nault &lt;gnault@redhat.com&gt;
Link: https://patch.msgid.link/20240903133554.2807343-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>ipv4: Centralize TOS matching</title>
<updated>2024-08-20T12:57:08+00:00</updated>
<author>
<name>Ido Schimmel</name>
<email>idosch@nvidia.com</email>
</author>
<published>2024-08-14T12:52:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1fa3314c14c6a20d098991a0a6980f9b18b2f930'/>
<id>urn:sha1:1fa3314c14c6a20d098991a0a6980f9b18b2f930</id>
<content type='text'>
The TOS field in the IPv4 flow information structure ('flowi4_tos') is
matched by the kernel against the TOS selector in IPv4 rules and routes.
The field is initialized differently by different call sites. Some treat
it as DSCP (RFC 2474) and initialize all six DSCP bits, some treat it as
RFC 1349 TOS and initialize it using RT_TOS() and some treat it as RFC
791 TOS and initialize it using IPTOS_RT_MASK.

What is common to all these call sites is that they all initialize the
lower three DSCP bits, which fits the TOS definition in the initial IPv4
specification (RFC 791).

Therefore, the kernel only allows configuring IPv4 FIB rules that match
on the lower three DSCP bits which are always guaranteed to be
initialized by all call sites:

 # ip -4 rule add tos 0x1c table 100
 # ip -4 rule add tos 0x3c table 100
 Error: Invalid tos.

While this works, it is unlikely to be very useful. RFC 791 that
initially defined the TOS and IP precedence fields was updated by RFC
2474 over twenty five years ago where these fields were replaced by a
single six bits DSCP field.

Extending FIB rules to match on DSCP can be done by adding a new DSCP
selector while maintaining the existing semantics of the TOS selector
for applications that rely on that.

A prerequisite for allowing FIB rules to match on DSCP is to adjust all
the call sites to initialize the high order DSCP bits and remove their
masking along the path to the core where the field is matched on.

However, making this change alone will result in a behavior change. For
example, a forwarded IPv4 packet with a DS field of 0xfc will no longer
match a FIB rule that was configured with 'tos 0x1c'.

This behavior change can be avoided by masking the upper three DSCP bits
in 'flowi4_tos' before comparing it against the TOS selectors in FIB
rules and routes.

Implement the above by adding a new function that checks whether a given
DSCP value matches the one specified in the IPv4 flow information
structure and invoke it from the three places that currently match on
'flowi4_tos'.

Use RT_TOS() for the masking of 'flowi4_tos' instead of IPTOS_RT_MASK
since the latter is not uAPI and we should be able to remove it at some
point.

Include &lt;linux/ip.h&gt; in &lt;linux/in_route.h&gt; since the former defines
IPTOS_TOS_MASK which is used in the definition of RT_TOS() in
&lt;linux/in_route.h&gt;.

No regressions in FIB tests:

 # ./fib_tests.sh
 [...]
 Tests passed: 218
 Tests failed:   0

And FIB rule tests:

 # ./fib_rule_tests.sh
 [...]
 Tests passed: 116
 Tests failed:   0

Signed-off-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;

</content>
</entry>
<entry>
<title>ipv4: Fix incorrect TOS in route get reply</title>
<updated>2024-07-18T09:11:02+00:00</updated>
<author>
<name>Ido Schimmel</name>
<email>idosch@nvidia.com</email>
</author>
<published>2024-07-15T14:23:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=338bb57e4c2a1c2c6fc92f9c0bd35be7587adca7'/>
<id>urn:sha1:338bb57e4c2a1c2c6fc92f9c0bd35be7587adca7</id>
<content type='text'>
The TOS value that is returned to user space in the route get reply is
the one with which the lookup was performed ('fl4-&gt;flowi4_tos'). This is
fine when the matched route is configured with a TOS as it would not
match if its TOS value did not match the one with which the lookup was
performed.

However, matching on TOS is only performed when the route's TOS is not
zero. It is therefore possible to have the kernel incorrectly return a
non-zero TOS:

 # ip link add name dummy1 up type dummy
 # ip address add 192.0.2.1/24 dev dummy1
 # ip route get 192.0.2.2 tos 0xfc
 192.0.2.2 tos 0x1c dev dummy1 src 192.0.2.1 uid 0
     cache

Fix by adding a DSCP field to the FIB result structure (inside an
existing 4 bytes hole), populating it in the route lookup and using it
when filling the route get reply.

Output after the patch:

 # ip link add name dummy1 up type dummy
 # ip address add 192.0.2.1/24 dev dummy1
 # ip route get 192.0.2.2 tos 0xfc
 192.0.2.2 dev dummy1 src 192.0.2.1 uid 0
     cache

Fixes: 1a00fee4ffb2 ("ipv4: Remove rt_key_{src,dst,tos} from struct rtable.")
Signed-off-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Reviewed-by: Guillaume Nault &lt;gnault@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;

</content>
</entry>
<entry>
<title>net: ipv4: Add a sysctl to set multipath hash seed</title>
<updated>2024-06-12T23:42:11+00:00</updated>
<author>
<name>Petr Machata</name>
<email>petrm@nvidia.com</email>
</author>
<published>2024-06-07T15:13:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4ee2a8cace3fb9a34aea6a56426f89d26dd514f3'/>
<id>urn:sha1:4ee2a8cace3fb9a34aea6a56426f89d26dd514f3</id>
<content type='text'>
When calculating hashes for the purpose of multipath forwarding, both IPv4
and IPv6 code currently fall back on flow_hash_from_keys(). That uses a
randomly-generated seed. That's a fine choice by default, but unfortunately
some deployments may need a tighter control over the seed used.

In this patch, make the seed configurable by adding a new sysctl key,
net.ipv4.fib_multipath_hash_seed to control the seed. This seed is used
specifically for multipath forwarding and not for the other concerns that
flow_hash_from_keys() is used for, such as queue selection. Expose the knob
as sysctl because other such settings, such as headers to hash, are also
handled that way. Like those, the multipath hash seed is a per-netns
variable.

Despite being placed in the net.ipv4 namespace, the multipath seed sysctl
is used for both IPv4 and IPv6, similarly to e.g. a number of TCP
variables.

The seed used by flow_hash_from_keys() is a 128-bit quantity. However it
seems that usually the seed is a much more modest value. 32 bits seem
typical (Cisco, Cumulus), some systems go even lower. For that reason, and
to decouple the user interface from implementation details, go with a
32-bit quantity, which is then quadruplicated to form the siphash key.

Signed-off-by: Petr Machata &lt;petrm@nvidia.com&gt;
Reviewed-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Reviewed-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Link: https://lore.kernel.org/r/20240607151357.421181-3-petrm@nvidia.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: ipv4,ipv6: Pass multipath hash computation through a helper</title>
<updated>2024-06-12T23:42:11+00:00</updated>
<author>
<name>Petr Machata</name>
<email>petrm@nvidia.com</email>
</author>
<published>2024-06-07T15:13:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3e453ca122d483eb519f934b6624215f0536301c'/>
<id>urn:sha1:3e453ca122d483eb519f934b6624215f0536301c</id>
<content type='text'>
The following patches will add a sysctl to control multipath hash
seed. In order to centralize the hash computation, add a helper,
fib_multipath_hash_from_keys(), and have all IPv4 and IPv6 route.c
invocations of flow_hash_from_keys() go through this helper instead.

Signed-off-by: Petr Machata &lt;petrm@nvidia.com&gt;
Reviewed-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Reviewed-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Link: https://lore.kernel.org/r/20240607151357.421181-2-petrm@nvidia.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>inet: allow ip_valid_fib_dump_req() to be called with RTNL or RCU</title>
<updated>2024-02-26T11:46:12+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2024-02-22T10:50:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=22e36ea9f5d7707ae3d64c497d172f4ef735c353'/>
<id>urn:sha1:22e36ea9f5d7707ae3d64c497d172f4ef735c353</id>
<content type='text'>
Add a new field into struct fib_dump_filter, to let callers
tell if they use RTNL locking or RCU.

This is used in the following patch, when inet_dump_fib()
no longer holds RTNL.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Donald Hunter &lt;donald.hunter@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2023-10-05T20:16:47+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-10-05T20:16:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2606cf059c56bfb86d5d6bd0f41bd7eedefc8b0a'/>
<id>urn:sha1:2606cf059c56bfb86d5d6bd0f41bd7eedefc8b0a</id>
<content type='text'>
Cross-merge networking fixes after downstream PR.

No conflicts (or adjacent changes of note).

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>ipv4/fib: send notify when delete source address routes</title>
<updated>2023-10-03T07:00:40+00:00</updated>
<author>
<name>Hangbin Liu</name>
<email>liuhangbin@gmail.com</email>
</author>
<published>2023-09-22T07:55:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4b2b606075e50cdae62ab2356b0a1e206947c354'/>
<id>urn:sha1:4b2b606075e50cdae62ab2356b0a1e206947c354</id>
<content type='text'>
After deleting an interface address in fib_del_ifaddr(), the function
scans the fib_info list for stray entries and calls fib_flush() and
fib_table_flush(). Then the stray entries will be deleted silently and no
RTM_DELROUTE notification will be sent.

This lack of notification can make routing daemons, or monitor like
`ip monitor route` miss the routing changes. e.g.

+ ip link add dummy1 type dummy
+ ip link add dummy2 type dummy
+ ip link set dummy1 up
+ ip link set dummy2 up
+ ip addr add 192.168.5.5/24 dev dummy1
+ ip route add 7.7.7.0/24 dev dummy2 src 192.168.5.5
+ ip -4 route
7.7.7.0/24 dev dummy2 scope link src 192.168.5.5
192.168.5.0/24 dev dummy1 proto kernel scope link src 192.168.5.5
+ ip monitor route
+ ip addr del 192.168.5.5/24 dev dummy1
Deleted 192.168.5.0/24 dev dummy1 proto kernel scope link src 192.168.5.5
Deleted broadcast 192.168.5.255 dev dummy1 table local proto kernel scope link src 192.168.5.5
Deleted local 192.168.5.5 dev dummy1 table local proto kernel scope host src 192.168.5.5

As Ido reminded, fib_table_flush() isn't only called when an address is
deleted, but also when an interface is deleted or put down. The lack of
notification in these cases is deliberate. And commit 7c6bb7d2faaf
("net/ipv6: Add knob to skip DELROUTE message on device down") introduced
a sysctl to make IPv6 behave like IPv4 in this regard. So we can't send
the route delete notify blindly in fib_table_flush().

To fix this issue, let's add a new flag in "struct fib_info" to track the
deleted prefer source address routes, and only send notify for them.

After update:
+ ip monitor route
+ ip addr del 192.168.5.5/24 dev dummy1
Deleted 192.168.5.0/24 dev dummy1 proto kernel scope link src 192.168.5.5
Deleted broadcast 192.168.5.255 dev dummy1 table local proto kernel scope link src 192.168.5.5
Deleted local 192.168.5.5 dev dummy1 table local proto kernel scope host src 192.168.5.5
Deleted 7.7.7.0/24 dev dummy2 scope link src 192.168.5.5

Suggested-by: Thomas Haller &lt;thaller@redhat.com&gt;
Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Acked-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Link: https://lore.kernel.org/r/20230922075508.848925-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
</entry>
<entry>
<title>ipv4: Annotate struct fib_info with __counted_by</title>
<updated>2023-10-02T18:24:51+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2023-09-22T17:28:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5b98fd5dc1e3b8446f98bc8c6e71d2585f9dcdce'/>
<id>urn:sha1:5b98fd5dc1e3b8446f98bc8c6e71d2585f9dcdce</id>
<content type='text'>
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct fib_info.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: David Ahern &lt;dsahern@kernel.org&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
Link: https://lore.kernel.org/r/20230922172858.3822653-1-keescook@chromium.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
</feed>
