tcp: batch calls to sk_flush_backlog()
author Eric Dumazet <edumazet@google.com>
Fri, 9 Aug 2019 12:04:47 +0000 (05:04 -0700)
committer David S. Miller <davem@davemloft.net>
Fri, 9 Aug 2019 18:03:27 +0000 (11:03 -0700)
Starting from commit d41a69f1d390 ("tcp: make tcp_sendmsg() aware of
socket backlog"), loopback flows got hurt: for each skb sent, the socket
receives an immediate ACK, and sk_flush_backlog() causes extra work.

The intent was to not let the backlog grow too much, but we went a bit too far.

We can check the backlog every 16 skbs (about 1MB chunks)
to increase TCP over loopback performance by about 15%.

Note that the call to sk_flush_backlog() handles a single ACK,
thanks to the coalescing done on the backlog, but cleans the 16 skbs
found in the rtx rb-tree.
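
The effect of the batching can be illustrated with a small userspace
sketch. This is not kernel code: count_backlog_checks(),
bytes_between_checks() and ASSUMED_SKB_BYTES are hypothetical names, and
the 64 KB skb size is an assumption chosen to match the "about 1MB"
figure above. The loop only mirrors the counter logic of the patch
(test the counter before allocating, increment it after).

```c
#include <assert.h>

/* Assumption for illustration: each skb carries about 64 KB. */
#define ASSUMED_SKB_BYTES 65536UL

/*
 * Count how often the backlog would be checked while queueing nr_skbs
 * new skbs, when a check happens only once the counter reaches `batch`.
 * batch == 16 models the patched loop; batch == 1 approximates the
 * previous behavior of checking before every skb after the first.
 */
static unsigned int count_backlog_checks(unsigned int nr_skbs,
					 unsigned int batch)
{
	unsigned int process_backlog = 0;
	unsigned int checks = 0;
	unsigned int i;

	assert(batch > 0);

	for (i = 0; i < nr_skbs; i++) {
		if (process_backlog >= batch) {
			process_backlog = 0;
			checks++;	/* stands in for sk_flush_backlog() */
		}
		process_backlog++;	/* one more skb queued */
	}
	return checks;
}

/* Bytes queued between two checks, under the assumed skb size. */
static unsigned long bytes_between_checks(unsigned int batch)
{
	return (unsigned long)batch * ASSUMED_SKB_BYTES;
}
```

Under these assumptions, queueing 64 skbs triggers 3 checks with a batch
of 16, versus 63 checks with a batch of 1, and a batch of 16 corresponds
to 1048576 bytes (1 MB) between checks.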

Reported-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/ipv4/tcp.c

index a0a66321c0ee99918b2080219dbaefcf3c398e13..f8fa1686f7f3e64f5d4ea8163e7f87538cc0d672 100644
@@ -1162,7 +1162,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
        struct sockcm_cookie sockc;
        int flags, err, copied = 0;
        int mss_now = 0, size_goal, copied_syn = 0;
-       bool process_backlog = false;
+       int process_backlog = 0;
        bool zc = false;
        long timeo;
 
@@ -1254,9 +1254,10 @@ new_segment:
                        if (!sk_stream_memory_free(sk))
                                goto wait_for_sndbuf;
 
-                       if (process_backlog && sk_flush_backlog(sk)) {
-                               process_backlog = false;
-                               goto restart;
+                       if (unlikely(process_backlog >= 16)) {
+                               process_backlog = 0;
+                               if (sk_flush_backlog(sk))
+                                       goto restart;
                        }
                        first_skb = tcp_rtx_and_write_queues_empty(sk);
                        skb = sk_stream_alloc_skb(sk, 0, sk->sk_allocation,
@@ -1264,7 +1265,7 @@ new_segment:
                        if (!skb)
                                goto wait_for_memory;
 
-                       process_backlog = true;
+                       process_backlog++;
                        skb->ip_summed = CHECKSUM_PARTIAL;
 
                        skb_entail(sk, skb);