<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/net/tcp.h, branch v4.14.149</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.14.149</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.14.149'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2019-09-06T08:20:49+00:00</updated>
<entry>
<title>tcp: fix tcp_rtx_queue_tail in case of empty retransmit queue</title>
<updated>2019-09-06T08:20:49+00:00</updated>
<author>
<name>Tim Froidcoeur</name>
<email>tim.froidcoeur@tessares.net</email>
</author>
<published>2019-08-24T06:03:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e5df4baea324d3c21dc48f86ad24e5d9aed67c65'/>
<id>urn:sha1:e5df4baea324d3c21dc48f86ad24e5d9aed67c65</id>
<content type='text'>
Commit 8c3088f895a0 ("tcp: be more careful in tcp_fragment()")
triggers following stack trace:

[25244.848046] kernel BUG at ./include/linux/skbuff.h:1406!
[25244.859335] RIP: 0010:skb_queue_prev+0x9/0xc
[25244.888167] Call Trace:
[25244.889182]  &lt;IRQ&gt;
[25244.890001]  tcp_fragment+0x9c/0x2cf
[25244.891295]  tcp_write_xmit+0x68f/0x988
[25244.892732]  __tcp_push_pending_frames+0x3b/0xa0
[25244.894347]  tcp_data_snd_check+0x2a/0xc8
[25244.895775]  tcp_rcv_established+0x2a8/0x30d
[25244.897282]  tcp_v4_do_rcv+0xb2/0x158
[25244.898666]  tcp_v4_rcv+0x692/0x956
[25244.899959]  ip_local_deliver_finish+0xeb/0x169
[25244.901547]  __netif_receive_skb_core+0x51c/0x582
[25244.903193]  ? inet_gro_receive+0x239/0x247
[25244.904756]  netif_receive_skb_internal+0xab/0xc6
[25244.906395]  napi_gro_receive+0x8a/0xc0
[25244.907760]  receive_buf+0x9a1/0x9cd
[25244.909160]  ? load_balance+0x17a/0x7b7
[25244.910536]  ? vring_unmap_one+0x18/0x61
[25244.911932]  ? detach_buf+0x60/0xfa
[25244.913234]  virtnet_poll+0x128/0x1e1
[25244.914607]  net_rx_action+0x12a/0x2b1
[25244.915953]  __do_softirq+0x11c/0x26b
[25244.917269]  ? handle_irq_event+0x44/0x56
[25244.918695]  irq_exit+0x61/0xa0
[25244.919947]  do_IRQ+0x9d/0xbb
[25244.921065]  common_interrupt+0x85/0x85
[25244.922479]  &lt;/IRQ&gt;

tcp_rtx_queue_tail() (called by tcp_fragment()) can call
tcp_write_queue_prev() on the first packet in the queue, which will trigger
the BUG in tcp_write_queue_prev(), because there is no previous packet.

This happens when the retransmit queue is empty, for example in case of a
zero window.

Commit 8c3088f895a0 ("tcp: be more careful in tcp_fragment()") was not a
simple cherry-pick of the original one from master (b617158dc096)
because there is a specific TCP rtx queue only since v4.15. For more
details, please see the commit message of b617158dc096 ("tcp: be more
careful in tcp_fragment()").

The BUG() is hit due to the specific code added to versions older than
v4.15. The comment in skb_queue_prev() (include/linux/skbuff.h:1406),
just before the BUG_ON() somehow suggests to add a check before using
it, what Tim did.

In master, this code path causing the issue will not be taken because
the implementation of tcp_rtx_queue_tail() is different:

    tcp_fragment() → tcp_rtx_queue_tail() → tcp_write_queue_prev() →
skb_queue_prev() → BUG_ON()

Fixes: 8c3088f895a0 ("tcp: be more careful in tcp_fragment()")
Signed-off-by: Tim Froidcoeur &lt;tim.froidcoeur@tessares.net&gt;
Signed-off-by: Matthieu Baerts &lt;matthieu.baerts@tessares.net&gt;
Reviewed-by: Christoph Paasch &lt;cpaasch@apple.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Revert "tcp: Clear sk_send_head after purging the write queue"</title>
<updated>2019-08-25T08:50:23+00:00</updated>
<author>
<name>Sasha Levin</name>
<email>sashal@kernel.org</email>
</author>
<published>2019-08-20T03:17:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=480d6d2f396e76bb9d77a180d32f2308fa8fb2d9'/>
<id>urn:sha1:480d6d2f396e76bb9d77a180d32f2308fa8fb2d9</id>
<content type='text'>
This reverts commit e99e7745d03fc50ba7c5b7c91c17294fee2d5991.

Ben Hutchings writes:

&gt;Sorry, this is the same issue that was already fixed by "tcp: reset
&gt;sk_send_head in tcp_write_queue_purge".  You can drop my version from
&gt;the queue for 4.4 and 4.9 and revert it for 4.14.

Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>tcp: Clear sk_send_head after purging the write queue</title>
<updated>2019-08-16T08:13:47+00:00</updated>
<author>
<name>Ben Hutchings</name>
<email>ben@decadent.org.uk</email>
</author>
<published>2019-08-13T11:53:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e99e7745d03fc50ba7c5b7c91c17294fee2d5991'/>
<id>urn:sha1:e99e7745d03fc50ba7c5b7c91c17294fee2d5991</id>
<content type='text'>
Denis Andzakovic discovered a potential use-after-free in older kernel
versions, using syzkaller.  tcp_write_queue_purge() frees all skbs in
the TCP write queue and can leave sk-&gt;sk_send_head pointing to freed
memory.  tcp_disconnect() clears that pointer after calling
tcp_write_queue_purge(), but tcp_connect() does not.  It is
(surprisingly) possible to add to the write queue between
disconnection and reconnection, so this needs to be done in both
places.

This bug was introduced by backports of commit 7f582b248d0a ("tcp:
purge write queue in tcp_connect_init()") and does not exist upstream
because of earlier changes in commit 75c119afe14f ("tcp: implement
rb-tree based retransmit queue").  The latter is a major change that's
not suitable for stable.

Reported-by: Denis Andzakovic &lt;denis.andzakovic@pulsesecurity.co.nz&gt;
Bisected-by: Salvatore Bonaccorso &lt;carnil@debian.org&gt;
Fixes: 7f582b248d0a ("tcp: purge write queue in tcp_connect_init()")
Cc: &lt;stable@vger.kernel.org&gt; # before 4.15
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tcp: be more careful in tcp_fragment()</title>
<updated>2019-08-09T15:53:32+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2019-08-06T15:09:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8c3088f895a0cfdf630973a9cd32f5e085eb973c'/>
<id>urn:sha1:8c3088f895a0cfdf630973a9cd32f5e085eb973c</id>
<content type='text'>
commit b617158dc096709d8600c53b6052144d12b89fab upstream.

Some applications set tiny SO_SNDBUF values and expect
TCP to just work. Recent patches to address CVE-2019-11478
broke them in case of losses, since retransmits might
be prevented.

We should allow these flows to make progress.

This patch allows the first and last skb in retransmit queue
to be split even if memory limits are hit.

It also adds the some room due to the fact that tcp_sendmsg()
and tcp_sendpage() might overshoot sk_wmem_queued by about one full
TSO skb (64KB size). Note this allowance was already present
in stable backports for kernels &lt; 4.15

Note for &lt; 4.15 backports :
 tcp_rtx_queue_tail() will probably look like :

static inline struct sk_buff *tcp_rtx_queue_tail(const struct sock *sk)
{
	struct sk_buff *skb = tcp_send_head(sk);

	return skb ? tcp_write_queue_prev(sk, skb) : tcp_write_queue_tail(sk);
}

Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Andrew Prout &lt;aprout@ll.mit.edu&gt;
Tested-by: Andrew Prout &lt;aprout@ll.mit.edu&gt;
Tested-by: Jonathan Lemon &lt;jonathan.lemon@gmail.com&gt;
Tested-by: Michal Kubecek &lt;mkubecek@suse.cz&gt;
Acked-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Acked-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Acked-by: Christoph Paasch &lt;cpaasch@apple.com&gt;
Cc: Jonathan Looney &lt;jtl@netflix.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Matthieu Baerts &lt;matthieu.baerts@tessares.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>tcp: fix tcp_set_congestion_control() use from bpf hook</title>
<updated>2019-07-31T05:28:46+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2019-07-19T02:28:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=adcac7370d7d9c6c8f58c590fff117d95c65fb61'/>
<id>urn:sha1:adcac7370d7d9c6c8f58c590fff117d95c65fb61</id>
<content type='text'>
[ Upstream commit 8d650cdedaabb33e85e9b7c517c0c71fcecc1de9 ]

Neal reported incorrect use of ns_capable() from bpf hook.

bpf_setsockopt(...TCP_CONGESTION...)
  -&gt; tcp_set_congestion_control()
   -&gt; ns_capable(sock_net(sk)-&gt;user_ns, CAP_NET_ADMIN)
    -&gt; ns_capable_common()
     -&gt; current_cred()
      -&gt; rcu_dereference_protected(current-&gt;cred, 1)

Accessing 'current' in bpf context makes no sense, since packets
are processed from softirq context.

As Neal stated : The capability check in tcp_set_congestion_control()
was written assuming a system call context, and then was reused from
a BPF call site.

The fix is to add a new parameter to tcp_set_congestion_control(),
so that the ns_capable() call is only performed under the right
context.

Fixes: 91b5b21c7c16 ("bpf: Add support for changing congestion control")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Lawrence Brakmo &lt;brakmo@fb.com&gt;
Reported-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Acked-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Acked-by: Lawrence Brakmo &lt;brakmo@fb.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tcp: limit payload size of sacked skbs</title>
<updated>2019-06-17T17:52:43+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2019-06-16T00:31:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d632920554c5aec81d8a79c23dac07efcbabbd54'/>
<id>urn:sha1:d632920554c5aec81d8a79c23dac07efcbabbd54</id>
<content type='text'>
commit 3b4929f65b0d8249f19a50245cd88ed1a2f78cff upstream.

Jonathan Looney reported that TCP can trigger the following crash
in tcp_shifted_skb() :

	BUG_ON(tcp_skb_pcount(skb) &lt; pcount);

This can happen if the remote peer has advertized the smallest
MSS that linux TCP accepts : 48

An skb can hold 17 fragments, and each fragment can hold 32KB
on x86, or 64KB on PowerPC.

This means that the 16bit witdh of TCP_SKB_CB(skb)-&gt;tcp_gso_segs
can overflow.

Note that tcp_sendmsg() builds skbs with less than 64KB
of payload, so this problem needs SACK to be enabled.
SACK blocks allow TCP to coalesce multiple skbs in the retransmit
queue, thus filling the 17 fragments to maximal capacity.

CVE-2019-11477 -- u16 overflow of TCP_SKB_CB(skb)-&gt;tcp_gso_segs

Backport notes, provided by Joao Martins &lt;joao.m.martins@oracle.com&gt;

v4.15 or since commit 737ff314563 ("tcp: use sequence distance to
detect reordering") had switched from the packet-based FACK tracking and
switched to sequence-based.

v4.14 and older still have the old logic and hence on
tcp_skb_shift_data() needs to retain its original logic and have
@fack_count in sync. In other words, we keep the increment of pcount with
tcp_skb_pcount(skb) to later used that to update fack_count. To make it
more explicit we track the new skb that gets incremented to pcount in
@next_pcount, and we get to avoid the constant invocation of
tcp_skb_pcount(skb) all together.

Fixes: 832d11c5cd07 ("tcp: Try to restore large SKBs while SACK processing")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Jonathan Looney &lt;jtl@netflix.com&gt;
Acked-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Reviewed-by: Tyler Hicks &lt;tyhicks@canonical.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Cc: Bruce Curtis &lt;brucec@netflix.com&gt;
Cc: Jonathan Lemon &lt;jonathan.lemon@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tcp: clear icsk_backoff in tcp_write_queue_purge()</title>
<updated>2019-02-23T08:06:43+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2019-02-15T21:36:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3a493b762f31fc0c571051cdfb0d80f49498e5fd'/>
<id>urn:sha1:3a493b762f31fc0c571051cdfb0d80f49498e5fd</id>
<content type='text'>
[ Upstream commit 04c03114be82194d4a4858d41dba8e286ad1787c ]

soukjin bae reported a crash in tcp_v4_err() handling
ICMP_DEST_UNREACH after tcp_write_queue_head(sk)
returned a NULL pointer.

Current logic should have prevented this :

  if (seq != tp-&gt;snd_una  || !icsk-&gt;icsk_retransmits ||
      !icsk-&gt;icsk_backoff || fastopen)
      break;

Problem is the write queue might have been purged
and icsk_backoff has not been cleared.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: soukjin bae &lt;soukjin.bae@samsung.com&gt;
Acked-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tcp, ulp: add alias for all ulp modules</title>
<updated>2018-09-15T07:45:29+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2018-08-16T19:49:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0c02e0c3fd13f9b6eb4b5f7ce3b739e7c348ae2b'/>
<id>urn:sha1:0c02e0c3fd13f9b6eb4b5f7ce3b739e7c348ae2b</id>
<content type='text'>
[ Upstream commit 037b0b86ecf5646f8eae777d8b52ff8b401692ec ]

Lets not turn the TCP ULP lookup into an arbitrary module loader as
we only intend to load ULP modules through this mechanism, not other
unrelated kernel modules:

  [root@bar]# cat foo.c
  #include &lt;sys/types.h&gt;
  #include &lt;sys/socket.h&gt;
  #include &lt;linux/tcp.h&gt;
  #include &lt;linux/in.h&gt;

  int main(void)
  {
      int sock = socket(PF_INET, SOCK_STREAM, 0);
      setsockopt(sock, IPPROTO_TCP, TCP_ULP, "sctp", sizeof("sctp"));
      return 0;
  }

  [root@bar]# gcc foo.c -O2 -Wall
  [root@bar]# lsmod | grep sctp
  [root@bar]# ./a.out
  [root@bar]# lsmod | grep sctp
  sctp                 1077248  4
  libcrc32c              16384  3 nf_conntrack,nf_nat,sctp
  [root@bar]#

Fix it by adding module alias to TCP ULP modules, so probing module
via request_module() will be limited to tcp-ulp-[name]. The existing
modules like kTLS will load fine given tcp-ulp-tls alias, but others
will fail to load:

  [root@bar]# lsmod | grep sctp
  [root@bar]# ./a.out
  [root@bar]# lsmod | grep sctp
  [root@bar]#

Sockmap is not affected from this since it's either built-in or not.

Fixes: 734942cc4ea6 ("tcp: ULP infrastructure")
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Acked-by: Song Liu &lt;songliubraving@fb.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tcp: remove DELAYED ACK events in DCTCP</title>
<updated>2018-08-24T11:09:17+00:00</updated>
<author>
<name>Yuchung Cheng</name>
<email>ycheng@google.com</email>
</author>
<published>2018-07-12T13:04:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2d2eacd650c6a01ef90c34c7d99f82854eb7fd96'/>
<id>urn:sha1:2d2eacd650c6a01ef90c34c7d99f82854eb7fd96</id>
<content type='text'>
[ Upstream commit a69258f7aa2623e0930212f09c586fd06674ad79 ]

After fixing the way DCTCP tracking delayed ACKs, the delayed-ACK
related callbacks are no longer needed

Signed-off-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Acked-by: Lawrence Brakmo &lt;brakmo@fb.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tcp: add max_quickacks param to tcp_incr_quickack and tcp_enter_quickack_mode</title>
<updated>2018-08-03T05:50:45+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2018-05-21T22:08:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1c005489fa9876713d8f1f626947017dd45a9bfc'/>
<id>urn:sha1:1c005489fa9876713d8f1f626947017dd45a9bfc</id>
<content type='text'>
[ Upstream commit 9a9c9b51e54618861420093ae6e9b50a961914c5 ]

We want to add finer control of the number of ACK packets sent after
ECN events.

This patch is not changing current behavior, it only enables following
change.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Soheil Hassas Yeganeh &lt;soheil@google.com&gt;
Acked-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
