<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/net/ip_vs.h, branch linux-5.8.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-5.8.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-5.8.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2020-08-19T06:26:31+00:00</updated>
<entry>
<title>ipvs: allow connection reuse for unconfirmed conntrack</title>
<updated>2020-08-19T06:26:31+00:00</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2020-07-01T15:17:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=101df88a583759bd151bc770c7fc3390f6e317df'/>
<id>urn:sha1:101df88a583759bd151bc770c7fc3390f6e317df</id>
<content type='text'>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ]

YangYuxi is reporting that connection reuse
is causing one-second delay when SYN hits
existing connection in TIME_WAIT state.
Such delay was added to give time to expire
both the IPVS connection and the corresponding
conntrack. This was considered a rare case
at that time but it is causing problem for
some environments such as Kubernetes.

As nf_conntrack_tcp_packet() can decide to
release the conntrack in TIME_WAIT state and
to replace it with a fresh NEW conntrack, we
can use this to allow rescheduling just by
tuning our check: if the conntrack is
confirmed we can not schedule it to different
real server and the one-second delay still
applies but if new conntrack was created,
we are free to select new real server without
any delays.

YangYuxi lists some of the problem reports:

- One second connection delay in masquerading mode:
https://marc.info/?t=151683118100004&amp;r=1&amp;w=2

- IPVS low throughput #70747
https://github.com/kubernetes/kubernetes/issues/70747

- Apache Bench can fill up ipvs service proxy in seconds #544
https://github.com/cloudnativelabs/kube-router/issues/544

- Additional 1s latency in `host -&gt; service IP -&gt; pod`
https://github.com/kubernetes/kubernetes/issues/90854

Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack")
Co-developed-by: YangYuxi &lt;yx.atom1@gmail.com&gt;
Signed-off-by: YangYuxi &lt;yx.atom1@gmail.com&gt;
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Reviewed-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2019-11-02T20:54:56+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2019-11-02T20:12:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d31e95585ca697fb31440c6fe30113adc85ecfbd'/>
<id>urn:sha1:d31e95585ca697fb31440c6fe30113adc85ecfbd</id>
<content type='text'>
The only slightly tricky merge conflict was the netdevsim because the
mutex locking fix overlapped a lot of driver reload reorganization.

The rest were (relatively) trivial in nature.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipvs: move old_secure_tcp into struct netns_ipvs</title>
<updated>2019-10-24T09:56:02+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2019-10-23T16:53:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c24b75e0f9239e78105f81c5f03a751641eb07ef'/>
<id>urn:sha1:c24b75e0f9239e78105f81c5f03a751641eb07ef</id>
<content type='text'>
syzbot reported the following issue :

BUG: KCSAN: data-race in update_defense_level / update_defense_level

read to 0xffffffff861a6260 of 4 bytes by task 3006 on cpu 1:
 update_defense_level+0x621/0xb30 net/netfilter/ipvs/ip_vs_ctl.c:177
 defense_work_handler+0x3d/0xd0 net/netfilter/ipvs/ip_vs_ctl.c:225
 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
 worker_thread+0xa0/0x800 kernel/workqueue.c:2415
 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352

write to 0xffffffff861a6260 of 4 bytes by task 7333 on cpu 0:
 update_defense_level+0xa62/0xb30 net/netfilter/ipvs/ip_vs_ctl.c:205
 defense_work_handler+0x3d/0xd0 net/netfilter/ipvs/ip_vs_ctl.c:225
 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
 worker_thread+0xa0/0x800 kernel/workqueue.c:2415
 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 7333 Comm: kworker/0:5 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events defense_work_handler

Indeed, old_secure_tcp is currently a static variable, while it
needs to be a per netns variable.

Fixes: a0840e2e165a ("IPVS: netns, ip_vs_ctl local vars moved to ipvs struct.")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>ipvs: batch __ip_vs_cleanup</title>
<updated>2019-10-08T09:28:33+00:00</updated>
<author>
<name>Haishuang Yan</name>
<email>yanhaishuang@cmss.chinamobile.com</email>
</author>
<published>2019-09-27T04:54:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5d5a0815f854a5b0e21d97e16cfadad69ce5fb04'/>
<id>urn:sha1:5d5a0815f854a5b0e21d97e16cfadad69ce5fb04</id>
<content type='text'>
It's better to batch __ip_vs_cleanup to speedup ipvs
connections dismantle.

Signed-off-by: Haishuang Yan &lt;yanhaishuang@cmss.chinamobile.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2019-07-09T02:48:57+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2019-07-09T02:48:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=af144a983402f7fd324ce556d9f9011a8b3e01fe'/>
<id>urn:sha1:af144a983402f7fd324ce556d9f9011a8b3e01fe</id>
<content type='text'>
Two cases of overlapping changes, nothing fancy.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipvs: fix tinfo memory leak in start_sync_thread</title>
<updated>2019-06-25T00:32:27+00:00</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2019-06-18T20:07:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5db7c8b9f9fc2aeec671ae3ca6375752c162e0e7'/>
<id>urn:sha1:5db7c8b9f9fc2aeec671ae3ca6375752c162e0e7</id>
<content type='text'>
syzkaller reports for memory leak in start_sync_thread [1]

As Eric points out, kthread may start and stop before the
threadfn function is called, so there is no chance the
data (tinfo in our case) to be released in thread.

Fix this by releasing tinfo in the controlling code instead.

[1]
BUG: memory leak
unreferenced object 0xffff8881206bf700 (size 32):
 comm "syz-executor761", pid 7268, jiffies 4294943441 (age 20.470s)
 hex dump (first 32 bytes):
   00 40 7c 09 81 88 ff ff 80 45 b8 21 81 88 ff ff  .@|......E.!....
   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 backtrace:
   [&lt;0000000057619e23&gt;] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
   [&lt;0000000057619e23&gt;] slab_post_alloc_hook mm/slab.h:439 [inline]
   [&lt;0000000057619e23&gt;] slab_alloc mm/slab.c:3326 [inline]
   [&lt;0000000057619e23&gt;] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
   [&lt;0000000086ce5479&gt;] kmalloc include/linux/slab.h:547 [inline]
   [&lt;0000000086ce5479&gt;] start_sync_thread+0x5d2/0xe10 net/netfilter/ipvs/ip_vs_sync.c:1862
   [&lt;000000001a9229cc&gt;] do_ip_vs_set_ctl+0x4c5/0x780 net/netfilter/ipvs/ip_vs_ctl.c:2402
   [&lt;00000000ece457c8&gt;] nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
   [&lt;00000000ece457c8&gt;] nf_setsockopt+0x4c/0x80 net/netfilter/nf_sockopt.c:115
   [&lt;00000000942f62d4&gt;] ip_setsockopt net/ipv4/ip_sockglue.c:1258 [inline]
   [&lt;00000000942f62d4&gt;] ip_setsockopt+0x9b/0xb0 net/ipv4/ip_sockglue.c:1238
   [&lt;00000000a56a8ffd&gt;] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
   [&lt;00000000fa895401&gt;] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3130
   [&lt;0000000095eef4cf&gt;] __sys_setsockopt+0x98/0x120 net/socket.c:2078
   [&lt;000000009747cf88&gt;] __do_sys_setsockopt net/socket.c:2089 [inline]
   [&lt;000000009747cf88&gt;] __se_sys_setsockopt net/socket.c:2086 [inline]
   [&lt;000000009747cf88&gt;] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
   [&lt;00000000ded8ba80&gt;] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
   [&lt;00000000893b4ac8&gt;] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported-by: syzbot+7e2e50c8adfccd2e5041@syzkaller.appspotmail.com
Suggested-by: Eric Biggers &lt;ebiggers@kernel.org&gt;
Fixes: 998e7a76804b ("ipvs: Use kthread_run() instead of doing a double-fork via kernel_thread()")
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>ipvs: add checksum support for gue encapsulation</title>
<updated>2019-05-31T16:23:52+00:00</updated>
<author>
<name>Jacky Hu</name>
<email>hengqing.hu@gmail.com</email>
</author>
<published>2019-05-30T00:16:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=29930e314da3833437a2ddc7b17f6a954f38d8fb'/>
<id>urn:sha1:29930e314da3833437a2ddc7b17f6a954f38d8fb</id>
<content type='text'>
Add checksum support for gue encapsulation with the tun_flags parameter,
which could be one of the values below:
IP_VS_TUNNEL_ENCAP_FLAG_NOCSUM
IP_VS_TUNNEL_ENCAP_FLAG_CSUM
IP_VS_TUNNEL_ENCAP_FLAG_REMCSUM

Signed-off-by: Jacky Hu &lt;hengqing.hu@gmail.com&gt;
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>ipvs: add function to find tunnels</title>
<updated>2019-05-31T15:48:09+00:00</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2019-05-05T12:14:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2aa3c9f48bc28ca0effd9877e010ad54c8a630e5'/>
<id>urn:sha1:2aa3c9f48bc28ca0effd9877e010ad54c8a630e5</id>
<content type='text'>
Add ip_vs_find_tunnel() to match tunnel headers
by family, address and optional port. Use it to
properly find the tunnel real server used in
received ICMP errors.

Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>ipvs: allow rs_table to contain different real server types</title>
<updated>2019-05-31T15:47:57+00:00</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2019-05-05T12:14:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1da40ab6caf924633116582c4c86939c486f20db'/>
<id>urn:sha1:1da40ab6caf924633116582c4c86939c486f20db</id>
<content type='text'>
Before now rs_table was used only for NAT real servers.
Change it to allow TUN real severs from different types,
possibly hashed with different port key.

Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>ipvs: allow tunneling with gue encapsulation</title>
<updated>2019-04-08T20:57:59+00:00</updated>
<author>
<name>Jacky Hu</name>
<email>hengqing.hu@gmail.com</email>
</author>
<published>2019-03-26T10:31:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=84c0d5e96f3ae20344fb3a79161eab18905dae56'/>
<id>urn:sha1:84c0d5e96f3ae20344fb3a79161eab18905dae56</id>
<content type='text'>
ipip packets are blocked in some public cloud environments, this patch
allows gue encapsulation with the tunneling method, which would make
tunneling working in those environments.

Signed-off-by: Jacky Hu &lt;hengqing.hu@gmail.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
</feed>
