diff options
| author | Simon Baatz <gmbnomis@gmail.com> | 2026-03-09 11:02:26 +0300 |
|---|---|---|
| committer | Jakub Kicinski <kuba@kernel.org> | 2026-03-14 18:01:49 +0300 |
| commit | 0e24d17bd9668f9dad78ede6a0e8f13dab176682 (patch) | |
| tree | d640cc6fe2ee0fb37baed68699a4f95b423e39b7 /tools/testing | |
| parent | 9089c5f3c444ad6e9eb172e9375615ed0b0bc31c (diff) | |
| download | linux-0e24d17bd9668f9dad78ede6a0e8f13dab176682.tar.xz | |
tcp: implement RFC 7323 window retraction receiver requirements
By default, the Linux TCP implementation does not shrink the
advertised window (RFC 7323 calls this "window retraction") with the
following exceptions:
- When an incoming segment cannot be added due to the receive buffer
running out of memory. Since commit 8c670bdfa58e ("tcp: correct
handling of extreme memory squeeze") a zero window will be
advertised in this case. It turns out that reaching the required
memory pressure is easy when window scaling is in use. In the
simplest case, sending a sufficient number of segments smaller than
the scale factor to a receiver that does not read data is enough.
- Commit b650d953cd39 ("tcp: enforce receive buffer memory limits by
allowing the tcp window to shrink") addressed the "eating memory"
problem by introducing a sysctl knob that allows shrinking the
window before running out of memory.
However, RFC 7323 does not only state that shrinking the window is
necessary in some cases, it also formulates requirements for TCP
implementations when doing so (Section 2.4).
This commit addresses the receiver-side requirements: After retracting
the window, the peer may have a snd_nxt that lies within a previously
advertised window but is now beyond the retracted window. This means
that all incoming segments (including pure ACKs) will be rejected
until the application happens to read enough data to let the peer's
snd_nxt be in window again (which may be never).
To comply with RFC 7323, the receiver MUST honor any segment that
would have been in window for any ACK sent by the receiver and, when
window scaling is in effect, SHOULD track the maximum window sequence
number it has advertised. This patch tracks that maximum window
sequence number rcv_mwnd_seq throughout the connection and uses it in
tcp_sequence() when deciding whether a segment is acceptable.
rcv_mwnd_seq is updated together with rcv_wup and rcv_wnd in
tcp_select_window(). If we count tcp_sequence() as fast path, it is
read in the fast path. Therefore, rcv_mwnd_seq is put into rcv_wnd's
cacheline group.
The logic for handling received data in tcp_data_queue() is already
sufficient and does not need to be updated.
Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260309-tcp_rfc7323_retract_wnd_rfc-v3-1-4c7f96b1ec69@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Diffstat (limited to 'tools/testing')
| -rw-r--r-- | tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt index 6c0f32c40f19..12882be10f2e 100644 --- a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt +++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt @@ -36,7 +36,7 @@ +0 read(4, ..., 100000) = 4000 -// If queue is empty, accept a packet even if its end_seq is above wup + rcv_wnd +// If queue is empty, accept a packet even if its end_seq is above rcv_mwnd_seq +0 < P. 4001:54001(50000) ack 1 win 257 * > . 1:1(0) ack 54001 win 0 |
