Merge branch 'udp-scalability-improvements'

Paolo Abeni says:

====================
udp: scalability improvements

This patch series implements an idea suggested by Eric Dumazet to
reduce the contention of the udp sk_receive_queue lock when the socket is
under flood.

An ancillary queue is added to the udp socket, and the socket always
tries first to read packets from that queue. If it's empty, we splice
the content of sk_receive_queue into the ancillary queue.

The first patch introduces some helpers to keep the udp code small, and the
following two implement the ancillary queue strategy. The code is split
to hopefully help the reviewing process.

The measured overall gain under udp flood is up to 30%, depending on
the numa layout and the number of ingress queues used by the relevant nic.

The performance numbers have been gathered using pktgen as sender, with 64-byte
packets and random src port, on a host b2b connected via a 10Gb/s link with the dut.

The receiver used the udp_sink program by Jesper [1] and an h/w l4 rx hash on
the ingress nic, so that the number of ingress nic rx queues hit by the udp
traffic could be controlled via ethtool -L.

The udp_sink program was bound to the first idle cpu, to get more
stable numbers.

On a single numa node receiver:

nic rx queues           vanilla                 patched kernel
1                       1820 kpps               1900 kpps
2                       1950 kpps               2500 kpps
16                      1670 kpps               2120 kpps

When using a single nic rx queue, busy polling was also enabled. Otherwise,
in that scenario, the bh processing becomes the bottleneck and this
produces large artifacts in the measured performance (e.g. improving the udp
sink run time decreases the overall tput, since more action from the
scheduler comes into play).

[1] https://github.com/netoptimizer/network-testing/blob/master/src/udp_sink.c

v1 -> v2:
  Patches 1/3 and 2/3 are unchanged; in patch 3/3 the rx_queue_lock_held
  param of udp_rmem_release() is now a bool.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
author: David S. Miller <davem@davemloft.net> 2017-05-16 22:41:31 +0300
committer: David S. Miller <davem@davemloft.net> 2017-05-16 22:41:31 +0300
commit: 8dfedc5343401512f80060628263ee0f52937c86 (patch)
tree: d69a117f59aeb3a96f42f6b452b471becfa9d803 /include/linux
parent: 9dca599b7fa44f01f3635380767da1a5782e4c65 (diff)
parent: 6dfb4367cd911d2b03878fffa045d545ba4507f6 (diff)
download: linux-8dfedc5343401512f80060628263ee0f52937c86.tar.xz
2 files changed, 10 insertions, 0 deletions
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a098d95b3d84..bfc7892f6c33 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3056,6 +3056,13 @@ static inline void skb_frag_list_init(struct sk_buff *skb)
 
 int __skb_wait_for_more_packets(struct sock *sk, int *err, long *timeo_p,
 				const struct sk_buff *skb);
+struct sk_buff *__skb_try_recv_from_queue(struct sock *sk,
+					  struct sk_buff_head *queue,
+					  unsigned int flags,
+					  void (*destructor)(struct sock *sk,
+							   struct sk_buff *skb),
+					  int *peeked, int *off, int *err,
+					  struct sk_buff **last);
 struct sk_buff *__skb_try_recv_datagram(struct sock *sk, unsigned flags,
 					void (*destructor)(struct sock *sk,
 							   struct sk_buff *skb),
diff --git a/include/linux/udp.h b/include/linux/udp.h
index 6cb4061a720d..eaea63bc79bb 100644
--- a/include/linux/udp.h
+++ b/include/linux/udp.h
@@ -80,6 +80,9 @@ struct udp_sock {
 						struct sk_buff *skb,
 						int nhoff);
 
+	/* udp_recvmsg try to use this before splicing sk_receive_queue */
+	struct sk_buff_head	reader_queue ____cacheline_aligned_in_smp;
+
 	/* This field is dirtied by udp_recvmsg() */
 	int		forward_deficit;
 };
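
The two-queue read path described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel implementation: `struct pkt`, `struct pkt_queue`, `struct model_sock`, `model_recv()` and the queue helpers are hypothetical names, the spinlock is modelled as a counter of acquisitions, and any locking of the private queue is left out. It shows that the reader touches the shared queue's lock once per batch of packets rather than once per packet.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct pkt {
	struct pkt *next;
	int id;
};

struct pkt_queue {
	struct pkt *head;
	struct pkt *tail;
	unsigned long lock_acquisitions;	/* stands in for a spinlock */
};

struct model_sock {
	struct pkt_queue receive_queue;	/* shared with the producer side */
	struct pkt_queue reader_queue;	/* used by the reader only */
};

static void queue_init(struct pkt_queue *q)
{
	q->head = q->tail = NULL;
	q->lock_acquisitions = 0;
}

/* Producer side: append one packet under the (modelled) queue lock. */
static void queue_push(struct pkt_queue *q, struct pkt *p)
{
	q->lock_acquisitions++;
	p->next = NULL;
	if (q->tail)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
}

/* Remove the first packet; the modelled lock is not touched. */
static struct pkt *queue_pop_nolock(struct pkt_queue *q)
{
	struct pkt *p = q->head;

	if (!p)
		return NULL;
	q->head = p->next;
	if (!q->head)
		q->tail = NULL;
	p->next = NULL;
	return p;
}

/* Move every packet from src to the tail of dst, locking src once. */
static void queue_splice(struct pkt_queue *src, struct pkt_queue *dst)
{
	src->lock_acquisitions++;
	if (!src->head)
		return;
	if (dst->tail)
		dst->tail->next = src->head;
	else
		dst->head = src->head;
	dst->tail = src->tail;
	src->head = src->tail = NULL;
}

/* Reader side: try the private queue first, splice only when it is empty. */
static struct pkt *model_recv(struct model_sock *sk)
{
	struct pkt *p = queue_pop_nolock(&sk->reader_queue);

	if (p)
		return p;
	queue_splice(&sk->receive_queue, &sk->reader_queue);
	return queue_pop_nolock(&sk->reader_queue);
}
```

In this model, pushing four packets and then calling `model_recv()` four times acquires the shared queue's lock only once on the reader side; the remaining three reads are served from the private queue. Packet order is preserved because the splice appends to the tail.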