<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/net/ip.h, branch v3.14.19</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v3.14.19</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v3.14.19'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2014-08-14T01:38:23+00:00</updated>
<entry>
<title>ip: make IP identifiers less predictable</title>
<updated>2014-08-14T01:38:23+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2014-07-26T06:58:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b733ea82c2a264c244fea37edda2268c44d76074'/>
<id>urn:sha1:b733ea82c2a264c244fea37edda2268c44d76074</id>
<content type='text'>
[ Upstream commit 04ca6973f7c1a0d8537f2d9906a0cf8e69886d75 ]

In "Counting Packets Sent Between Arbitrary Internet Hosts", Jeffrey and
Jedidiah describe ways exploiting linux IP identifier generation to
infer whether two machines are exchanging packets.

With commit 73f156a6e8c1 ("inetpeer: get rid of ip_id_count"), we
changed IP id generation, but this does not really prevent this
side-channel technique.

This patch adds a random amount of perturbation so that IP identifiers
for a given destination [1] are no longer monotonically increasing after
an idle period.

Note that prandom_u32_max(1) returns 0, so if generator is used at most
once per jiffy, this patch inserts no hole in the ID suite and do not
increase collision probability.

This is jiffies based, so in the worst case (HZ=1000), the id can
rollover after ~65 seconds of idle time, which should be fine.

We also change the hash used in __ip_select_ident() to not only hash
on daddr, but also saddr and protocol, so that ICMP probes can not be
used to infer information for other protocols.

For IPv6, adds saddr into the hash as well, but not nexthdr.

If I ping the patched target, we can see ID are now hard to predict.

21:57:11.008086 IP (...)
    A &gt; target: ICMP echo request, seq 1, length 64
21:57:11.010752 IP (... id 2081 ...)
    target &gt; A: ICMP echo reply, seq 1, length 64

21:57:12.013133 IP (...)
    A &gt; target: ICMP echo request, seq 2, length 64
21:57:12.015737 IP (... id 3039 ...)
    target &gt; A: ICMP echo reply, seq 2, length 64

21:57:13.016580 IP (...)
    A &gt; target: ICMP echo request, seq 3, length 64
21:57:13.019251 IP (... id 3437 ...)
    target &gt; A: ICMP echo reply, seq 3, length 64

[1] TCP sessions uses a per flow ID generator not changed by this patch.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Jeffrey Knockel &lt;jeffk@cs.unm.edu&gt;
Reported-by: Jedidiah R. Crandall &lt;crandall@cs.unm.edu&gt;
Cc: Willy Tarreau &lt;w@1wt.eu&gt;
Cc: Hannes Frederic Sowa &lt;hannes@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>inetpeer: get rid of ip_id_count</title>
<updated>2014-08-14T01:38:23+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2014-06-02T12:26:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=265459c3d15b39826d09b3380eef7572fb7649a2'/>
<id>urn:sha1:265459c3d15b39826d09b3380eef7572fb7649a2</id>
<content type='text'>
[ Upstream commit 73f156a6e8c1074ac6327e0abd1169e95eb66463 ]

Ideally, we would need to generate IP ID using a per destination IP
generator.

linux kernels used inet_peer cache for this purpose, but this had a huge
cost on servers disabling MTU discovery.

1) each inet_peer struct consumes 192 bytes

2) inetpeer cache uses a binary tree of inet_peer structs,
   with a nominal size of ~66000 elements under load.

3) lookups in this tree are hitting a lot of cache lines, as tree depth
   is about 20.

4) If server deals with many tcp flows, we have a high probability of
   not finding the inet_peer, allocating a fresh one, inserting it in
   the tree with same initial ip_id_count, (cf secure_ip_id())

5) We garbage collect inet_peer aggressively.

IP ID generation do not have to be 'perfect'

Goal is trying to avoid duplicates in a short period of time,
so that reassembly units have a chance to complete reassembly of
fragments belonging to one message before receiving other fragments
with a recycled ID.

We simply use an array of generators, and a Jenkin hash using the dst IP
as a key.

ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
belongs (it is only used from this file)

secure_ip_id() and secure_ipv6_id() no longer are needed.

Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
unnecessary decrement/increment of the number of segments.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ipv6: make IPV6_RECVPKTINFO work for ipv4 datagrams</title>
<updated>2014-01-20T03:53:18+00:00</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2014-01-20T02:43:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4b261c75a99f29c93a0b6babfc180cdf566bd654'/>
<id>urn:sha1:4b261c75a99f29c93a0b6babfc180cdf566bd654</id>
<content type='text'>
We currently don't report IPV6_RECVPKTINFO in cmsg access ancillary data
for IPv4 datagrams on IPv6 sockets.

This patch splits the ip6_datagram_recv_ctl into two functions, one
which handles both protocol families, AF_INET and AF_INET6, while the
ip6_datagram_recv_specific_ctl only handles IPv6 cmsg data.

ip6_datagram_recv_*_ctl never reported back any errors, so we can make
them return void. Also provide a helper for protocols which don't offer dual
personality to further use ip6_datagram_recv_ctl, which is exported to
modules.

I needed to shuffle the code for ping around a bit to make it easier to
implement dual personality for ping ipv6 sockets in future.

Reported-by: Gert Doering &lt;gert@space.net&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: register igmp_notifier even when !CONFIG_PROC_FS</title>
<updated>2014-01-14T23:03:33+00:00</updated>
<author>
<name>WANG Cong</name>
<email>xiyou.wangcong@gmail.com</email>
</author>
<published>2014-01-11T00:09:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=72c1d3bdd5bf10a789608336ba0d61f1e44e4350'/>
<id>urn:sha1:72c1d3bdd5bf10a789608336ba0d61f1e44e4350</id>
<content type='text'>
We still need this notifier even when we don't config
PROC_FS.

It should be rare to have a kernel without PROC_FS,
so just for completeness.

Cc: Stephen Hemminger &lt;stephen@networkplumber.org&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing</title>
<updated>2014-01-13T19:22:54+00:00</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2014-01-09T09:01:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f87c10a8aa1e82498c42d0335524d6ae7cf5a52b'/>
<id>urn:sha1:f87c10a8aa1e82498c42d0335524d6ae7cf5a52b</id>
<content type='text'>
While forwarding we should not use the protocol path mtu to calculate
the mtu for a forwarded packet but instead use the interface mtu.

We mark forwarded skbs in ip_forward with IPSKB_FORWARDED, which was
introduced for multicast forwarding. But as it does not conflict with
our usage in unicast code path it is perfect for reuse.

I moved the functions ip_sk_accept_pmtu, ip_sk_use_pmtu and ip_skb_dst_mtu
along with the new ip_dst_mtu_maybe_forward to net/ip.h to fix circular
dependencies because of IPSKB_FORWARDED.

Because someone might have written a software which does probe
destinations manually and expects the kernel to honour those path mtus
I introduced a new per-namespace "ip_forward_use_pmtu" knob so someone
can disable this new behaviour. We also still use mtus which are locked on a
route for forwarding.

The reason for this change is, that path mtus information can be injected
into the kernel via e.g. icmp_err protocol handler without verification
of local sockets. As such, this could cause the IPv4 forwarding path to
wrongfully emit fragmentation needed notifications or start to fragment
packets along a path.

Tunnel and ipsec output paths clear IPCB again, thus IPSKB_FORWARDED
won't be set and further fragmentation logic will use the path mtu to
determine the fragmentation size. They also recheck packet size with
help of path mtu discovery and report appropriate errors.

Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: John Heffner &lt;johnwheffner@gmail.com&gt;
Cc: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>inet: make no_pmtu_disc per namespace and kill ipv4_config</title>
<updated>2013-12-18T21:58:20+00:00</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2013-12-14T04:13:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=974eda11c54290a1be8f8b155edae7d791e5ce57'/>
<id>urn:sha1:974eda11c54290a1be8f8b155edae7d791e5ce57</id>
<content type='text'>
The other field in ipv4_config, log_martians, was converted to a
per-interface setting, so we can just remove the whole structure.

Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>inet: fix addr_len/msg-&gt;msg_namelen assignment in recv_error and rxpmtu functions</title>
<updated>2013-11-23T22:46:23+00:00</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2013-11-22T23:46:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=85fbaa75037d0b6b786ff18658ddf0b4014ce2a4'/>
<id>urn:sha1:85fbaa75037d0b6b786ff18658ddf0b4014ce2a4</id>
<content type='text'>
Commit bceaa90240b6019ed73b49965eac7d167610be69 ("inet: prevent leakage
of uninitialized memory to user in recv syscalls") conditionally updated
addr_len if the msg_name is written to. The recv_error and rxpmtu
functions relied on the recvmsg functions to set up addr_len before.

As this does not happen any more we have to pass addr_len to those
functions as well and set it to the size of the corresponding sockaddr
length.

This broke traceroute and such.

Fixes: bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls")
Reported-by: Brad Spengler &lt;spender@grsecurity.net&gt;
Reported-by: Tom Labanowski
Cc: mpb &lt;mpb.mail@gmail.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv6: make lookups simpler and faster</title>
<updated>2013-10-09T04:01:25+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-10-03T22:42:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=efe4208f47f907b86f528788da711e8ab9dea44d'/>
<id>urn:sha1:efe4208f47f907b86f528788da711e8ab9dea44d</id>
<content type='text'>
TCP listener refactoring, part 4 :

To speed up inet lookups, we moved IPv4 addresses from inet to struct
sock_common

Now is time to do the same for IPv6, because it permits us to have fast
lookups for all kind of sockets, including upcoming SYN_RECV.

Getting IPv6 addresses in TCP lookups currently requires two extra cache
lines, plus a dereference (and memory stall).

inet6_sk(sk) does the dereference of inet_sk(__sk)-&gt;pinet6

This patch is way bigger than its IPv4 counter part, because for IPv4,
we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6,
it's not doable easily.

inet6_sk(sk)-&gt;daddr becomes sk-&gt;sk_v6_daddr
inet6_sk(sk)-&gt;rcv_saddr becomes sk-&gt;sk_v6_rcv_saddr

And timewait socket also have tw-&gt;tw_v6_daddr &amp; tw-&gt;tw_v6_rcv_saddr
at the same offset.

We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic
macro.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: ipv4 only populate IP_PKTINFO when needed</title>
<updated>2013-10-08T20:27:33+00:00</updated>
<author>
<name>Shawn Bohrer</name>
<email>sbohrer@rgmadvisors.com</email>
</author>
<published>2013-10-07T16:01:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fbf8866d65d5de84f75563eb0edd7fc27dbe9a90'/>
<id>urn:sha1:fbf8866d65d5de84f75563eb0edd7fc27dbe9a90</id>
<content type='text'>
The since the removal of the routing cache computing
fib_compute_spec_dst() does a fib_table lookup for each UDP multicast
packet received.  This has introduced a performance regression for some
UDP workloads.

This change skips populating the packet info for sockets that do not have
IP_PKTINFO set.

Benchmark results from a netperf UDP_RR test:
Before 89789.68 transactions/s
After  90587.62 transactions/s

Benchmark results from a fio 1 byte UDP multicast pingpong test
(Multicast one way unicast response):
Before 12.63us RTT
After  12.48us RTT

Signed-off-by: Shawn Bohrer &lt;sbohrer@rgmadvisors.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net ipv4: Convert ipv4.ip_local_port_range to be per netns v3</title>
<updated>2013-10-01T04:59:38+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2013-09-28T21:10:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0bbf87d852d243680ed7074110ccc1dea003b61a'/>
<id>urn:sha1:0bbf87d852d243680ed7074110ccc1dea003b61a</id>
<content type='text'>
- Move sysctl_local_ports from a global variable into struct netns_ipv4.
- Modify inet_get_local_port_range to take a struct net, and update all
  of the callers.
- Move the initialization of sysctl_local_ports into
   sysctl_net_ipv4.c:ipv4_sysctl_init_net from inet_connection_sock.c

v2:
- Ensure indentation used tabs
- Fixed ip.h so it applies cleanly to todays net-next

v3:
- Compile fixes of strange callers of inet_get_local_port_range.
  This patch now successfully passes an allmodconfig build.
  Removed manual inlining of inet_get_local_port_range in ipv4_local_port_range

Originally-by: Samya &lt;samya@twitter.com&gt;
Acked-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
