author		David S. Miller <davem@davemloft.net>	2023-08-13 14:21:38 +0300
committer	David S. Miller <davem@davemloft.net>	2023-08-13 14:21:38 +0300
commit		86f03776f6d58558912bc05158fa75add1886aca (patch)
tree		7a98005eb4c3cc91e5bfe987db7f8b6da28e072d /net/ipv4/tcp_input.c
parent		3e6860ec3a2252249e310b0e6e88e2258171b3d0 (diff)
parent		031c44b7527aec2f22ddaae4bcd8b085ff810ec4 (diff)
download	linux-86f03776f6d58558912bc05158fa75add1886aca.tar.xz
Merge branch 'tcp-oom-probe'
Menglong Dong says:
====================
net: tcp: support probing OOM
In this series, we make some small changes so that TCP retransmissions
become zero-window probes when the receiver drops the skb because of
memory pressure.

In the 1st patch, we reply with a zero-window ACK when the skb is
dropped because of out-of-memory, instead of dropping the skb silently.

In the 2nd patch, we allow a zero-window ACK to update the window.

In the 3rd patch, we fix an unexpected socket death when snd_wnd is 0
in tcp_retransmit_timer().

In the 4th patch, we refactor the debug message in
tcp_retransmit_timer() to make it more accurate.

After these changes, TCP can probe the OOM state of the receiver
indefinitely.
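The effect of the 2nd patch can be modeled outside the kernel. The sketch below is a simplified, hypothetical userspace model of the tcp_may_update_window() predicate (not the kernel code itself); seq_after() mirrors the kernel's after() macro, and the final clause shows how a duplicate segment announcing a zero window is now accepted as a window update:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe "a comes after b" for 32-bit sequence numbers,
 * in the style of the kernel's after() macro. */
static bool seq_after(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/* Simplified model of the patched predicate: accept a window
 * update if the ACK advances snd_una, the segment is newer than
 * the last window update (snd_wl1), or it is a duplicate that
 * either grows the window or -- new in this series -- announces
 * a zero window, so OOM probing keeps working. */
static bool may_update_window(uint32_t ack, uint32_t ack_seq,
			      uint32_t snd_una, uint32_t snd_wl1,
			      uint32_t snd_wnd, uint32_t nwin)
{
	return seq_after(ack, snd_una) ||
	       seq_after(ack_seq, snd_wl1) ||
	       (ack_seq == snd_wl1 && (nwin > snd_wnd || !nwin));
}
```

Before the patch, the zero-window duplicate case fell through all three clauses and the shrink-to-zero announcement was ignored, so the sender kept retransmitting into a window the receiver could not honor.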
Changes since v3:
- make the timeout "2 * TCP_RTO_MAX" in the 3rd patch
- tp->retrans_stamp is not based on jiffies and can't be compared with
icsk->icsk_timeout in the 3rd patch. Fix it.
- introduce the 4th patch
Changes since v2:
- refactor the code to avoid code duplication in the 1st patch
- use after() instead of max() in tcp_rtx_probe0_timed_out()
Changes since v1:
- send 0 rwin ACK for the receive queue empty case when necessary in the
1st patch
- send the ACK immediately by using the ICSK_ACK_NOW flag in the 1st
patch
- consider the case of the connection restarting from idle, as Neal
  commented, in the 3rd patch
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'net/ipv4/tcp_input.c')
 net/ipv4/tcp_input.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 8e96ebe373d7..d34d52fdfdb1 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3525,7 +3525,7 @@ static inline bool tcp_may_update_window(const struct tcp_sock *tp,
 {
 	return	after(ack, tp->snd_una) ||
 		after(ack_seq, tp->snd_wl1) ||
-		(ack_seq == tp->snd_wl1 && nwin > tp->snd_wnd);
+		(ack_seq == tp->snd_wl1 && (nwin > tp->snd_wnd || !nwin));
 }
 
 /* If we update tp->snd_una, also update tp->bytes_acked */
@@ -5059,13 +5059,19 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 		}
 
 		/* Ok. In sequence. In window. */
 queue_and_out:
-		if (skb_queue_len(&sk->sk_receive_queue) == 0)
-			sk_forced_mem_schedule(sk, skb->truesize);
-		else if (tcp_try_rmem_schedule(sk, skb, skb->truesize)) {
-			reason = SKB_DROP_REASON_PROTO_MEM;
-			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVQDROP);
+		if (tcp_try_rmem_schedule(sk, skb, skb->truesize)) {
+			/* TODO: maybe ratelimit these WIN 0 ACK ? */
+			inet_csk(sk)->icsk_ack.pending |=
+					(ICSK_ACK_NOMEM | ICSK_ACK_NOW);
+			inet_csk_schedule_ack(sk);
 			sk->sk_data_ready(sk);
-			goto drop;
+
+			if (skb_queue_len(&sk->sk_receive_queue)) {
+				reason = SKB_DROP_REASON_PROTO_MEM;
+				NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVQDROP);
+				goto drop;
+			}
+			sk_forced_mem_schedule(sk, skb->truesize);
 		}
 		eaten = tcp_queue_rcv(sk, skb, &fragstolen);
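The v2 changelog notes using after() instead of max() in tcp_rtx_probe0_timed_out() (part of the 3rd patch, which is outside this diffstat). The point is that jiffies-style tick counters wrap, so deadlines must be compared with wrap-safe signed arithmetic rather than plain ordering. A minimal userspace sketch of that idea, with hypothetical names (ticks_after, probe0_timed_out), not the kernel implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe time comparison in the style of the kernel's
 * after()/time_after(): "a is later than b" remains correct
 * even when the 32-bit tick counter has wrapped around. */
static bool ticks_after(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/* Hypothetical probe0 timeout check: the connection is only
 * declared dead once "now" is strictly past the deadline.
 * A plain max()/>= comparison would misfire near a counter
 * wrap, killing a socket that is still probing a zero window. */
static bool probe0_timed_out(uint32_t now, uint32_t deadline)
{
	return ticks_after(now, deadline);
}
```

With this shape, a deadline computed as "now + 2 * TCP_RTO_MAX worth of ticks" stays comparable even if the addition overflows the 32-bit counter.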