From: Ilpo Järvinen Date: Wed, 12 Mar 2008 00:55:27 +0000 (-0700) Subject: [TCP]: Prevent sending past receiver window with TSO (at last skb) X-Git-Tag: v2.6.25-rc6~31^2~2 X-Git-Url: https://err.no/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=5ea3a7480606cef06321cd85bc5113c72d2c7c68;p=linux-2.6 [TCP]: Prevent sending past receiver window with TSO (at last skb) With TSO it was possible to send past the receiver window when the skb to be sent was the last in the write queue while the receiver window is the limiting factor. One can notice that there's a loophole in the tcp_mss_split_point that lacked a receiver window check for the tcp_write_queue_tail() if also cwnd was smaller than the full skb. Noticed by Thomas Gleixner in form of "Treason uncloaked! Peer ... shrinks window .... Repaired." messages (the peer didn't actually shrink its window as the message suggests, we had just sent something past it without a permission to do so). Signed-off-by: Ilpo Järvinen Tested-by: Thomas Gleixner Signed-off-by: David S. Miller --- diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index ed750f9ceb..01578f544a 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1035,6 +1035,13 @@ static void tcp_cwnd_validate(struct sock *sk) * introducing MSS oddities to segment boundaries. In rare cases where * mss_now != mss_cache, we will request caller to create a small skb * per input skb which could be mostly avoided here (if desired). + * + * We explicitly want to create a request for splitting write queue tail + * to a small skb for Nagle purposes while avoiding unnecessary modulos, + * thus all the complexity (cwnd_len is always MSS multiple which we + * return whenever allowed by the other factors). Basically we need the + * modulo only when the receiver window alone is the limiting factor or + * when we would be allowed to send the split-due-to-Nagle skb fully. */ static unsigned int tcp_mss_split_point(struct sock *sk, struct sk_buff *skb, unsigned int mss_now, unsigned int cwnd) @@ -1048,10 +1055,11 @@ static unsigned int tcp_mss_split_point(struct sock *sk, struct sk_buff *skb, if (likely(cwnd_len <= window && skb != tcp_write_queue_tail(sk))) return cwnd_len; - if (skb == tcp_write_queue_tail(sk) && cwnd_len <= skb->len) + needed = min(skb->len, window); + + if (skb == tcp_write_queue_tail(sk) && cwnd_len <= needed) return cwnd_len; - needed = min(skb->len, window); return needed - needed % mss_now; }