Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit 9382fe71c0058465e942a633869629929102843d ]
IPv4 and IPv6 packets may arrive with lower-layer padding that is not
included in the L3 length. For example, a short IPv4 packet may have
up to 6 bytes of padding following the IP payload when received on an
Ethernet device with a minimum packet length of 64 bytes.
Higher-layer processing functions in netfilter (e.g. nf_ip_checksum(),
and help() in nf_conntrack_ftp) assume skb->len reflects the length of
the L3 header and payload, rather than referring back to
ip_hdr->tot_len or ipv6_hdr->payload_len, and get confused by
lower-layer padding.
In the normal IPv4 receive path, ip_rcv() trims the packet to
ip_hdr->tot_len before invoking netfilter hooks. In the IPv6 receive
path, ip6_rcv() does the same using ipv6_hdr->payload_len. Similarly
in the br_netfilter receive path, br_validate_ipv4() and
br_validate_ipv6() trim the packet to the L3 length before invoking
netfilter hooks.
Currently in the OVS conntrack receive path, ovs_ct_execute() pulls
the skb to the L3 header but does not trim it to the L3 length before
calling nf_conntrack_in(NF_INET_PRE_ROUTING). When
nf_conntrack_proto_tcp encounters a packet with lower-layer padding,
nf_ip_checksum() fails causing a "nf_ct_tcp: bad TCP checksum" log
message. While extra zero bytes don't affect the checksum, the length
in the IP pseudoheader does. That length is based on skb->len, and
without trimming, it doesn't match the length the sender used when
computing the checksum.
In ovs_ct_execute(), trim the skb to the L3 length before higher-layer
processing.
Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit cf5d70918877c6a6655dc1e92e2ebb661ce904fd ]
Conntrack helpers do not check for a potentially clashing conntrack
entry when creating a new expectation. Also, nf_conntrack_in() will
check expectations (via init_conntrack()) only if a conntrack entry
can not be found. The expectation for a packet which also matches an
existing conntrack entry will not be removed by conntrack, and is
currently handled inconsistently by OVS, as OVS expects the
expectation to be removed when the connection tracking entry matching
that expectation is confirmed.
It should be noted that normally an IP stack would not allow reuse of
a 5-tuple of an old (possibly lingering) connection for a new data
connection, so this is somewhat unlikely corner case. However, it is
possible that a misbehaving source could cause conntrack entries be
created that could then interfere with new related connections.
Fix this in the OVS module by deleting the clashing conntrack entry
after an expectation has been matched. This causes the following
nf_conntrack_in() call also find the expectation and remove it when
creating the new conntrack entry, as well as the forthcoming reply
direction packets to match the new related connection instead of the
old clashing conntrack entry.
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Reported-by: Yang Song <yangsong@vmware.com>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 69ec932e364b1ba9c3a2085fe96b76c8a3f71e7c ]
Before the 'type' is validated, we shouldn't use it to fetch the
ovs_ct_attr_lens's minlen and maxlen, else, out of bound access
may happen.
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 48cac18ecf1de82f76259a54402c3adb7839ad01 ]
Andrey reported a use-after-free in IPv6 stack.
Issue here is that we free the socket while it still has skb
in TX path and in some queues.
It happens here because IPv6 reassembly unit messes skb->truesize,
breaking skb_set_owner_w() badly.
We fixed a similar issue for IPV4 in commit 8282f27449bf ("inet: frag:
Always orphan skbs inside ip_defrag()")
Acked-by: Joe Stringer <joe@ovn.org>
==================================================================
BUG: KASAN: use-after-free in sock_wfree+0x118/0x120
Read of size 8 at addr ffff880062da0060 by task a.out/4140
page:ffffea00018b6800 count:1 mapcount:0 mapping: (null)
index:0x0 compound_mapcount: 0
flags: 0x100000000008100(slab|head)
raw: 0100000000008100 0000000000000000 0000000000000000 0000000180130013
raw: dead000000000100 dead000000000200 ffff88006741f140 0000000000000000
page dumped because: kasan: bad access detected
CPU: 0 PID: 4140 Comm: a.out Not tainted 4.10.0-rc3+ #59
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:15
dump_stack+0x292/0x398 lib/dump_stack.c:51
describe_address mm/kasan/report.c:262
kasan_report_error+0x121/0x560 mm/kasan/report.c:370
kasan_report mm/kasan/report.c:392
__asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:413
sock_flag ./arch/x86/include/asm/bitops.h:324
sock_wfree+0x118/0x120 net/core/sock.c:1631
skb_release_head_state+0xfc/0x250 net/core/skbuff.c:655
skb_release_all+0x15/0x60 net/core/skbuff.c:668
__kfree_skb+0x15/0x20 net/core/skbuff.c:684
kfree_skb+0x16e/0x4e0 net/core/skbuff.c:705
inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
inet_frag_put ./include/net/inet_frag.h:133
nf_ct_frag6_gather+0x1125/0x38b0 net/ipv6/netfilter/nf_conntrack_reasm.c:617
ipv6_defrag+0x21b/0x350 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
nf_hook_entry_hookfn ./include/linux/netfilter.h:102
nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
nf_hook ./include/linux/netfilter.h:212
__ip6_local_out+0x52c/0xaf0 net/ipv6/output_core.c:160
ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
rawv6_push_pending_frames net/ipv6/raw.c:613
rawv6_sendmsg+0x2cff/0x4130 net/ipv6/raw.c:927
inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
sock_sendmsg_nosec net/socket.c:635
sock_sendmsg+0xca/0x110 net/socket.c:645
sock_write_iter+0x326/0x620 net/socket.c:848
new_sync_write fs/read_write.c:499
__vfs_write+0x483/0x760 fs/read_write.c:512
vfs_write+0x187/0x530 fs/read_write.c:560
SYSC_write fs/read_write.c:607
SyS_write+0xfb/0x230 fs/read_write.c:599
entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
RIP: 0033:0x7ff26e6f5b79
RSP: 002b:00007ff268e0ed98 EFLAGS: 00000206 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007ff268e0f9c0 RCX: 00007ff26e6f5b79
RDX: 0000000000000010 RSI: 0000000020f50fe1 RDI: 0000000000000003
RBP: 00007ff26ebc1220 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
R13: 00007ff268e0f9c0 R14: 00007ff26efec040 R15: 0000000000000003
The buggy address belongs to the object at ffff880062da0000
which belongs to the cache RAWv6 of size 1504
The buggy address ffff880062da0060 is located 96 bytes inside
of 1504-byte region [ffff880062da0000, ffff880062da05e0)
Freed by task 4113:
save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
save_stack+0x43/0xd0 mm/kasan/kasan.c:502
set_track mm/kasan/kasan.c:514
kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
slab_free_hook mm/slub.c:1352
slab_free_freelist_hook mm/slub.c:1374
slab_free mm/slub.c:2951
kmem_cache_free+0xb2/0x2c0 mm/slub.c:2973
sk_prot_free net/core/sock.c:1377
__sk_destruct+0x49c/0x6e0 net/core/sock.c:1452
sk_destruct+0x47/0x80 net/core/sock.c:1460
__sk_free+0x57/0x230 net/core/sock.c:1468
sk_free+0x23/0x30 net/core/sock.c:1479
sock_put ./include/net/sock.h:1638
sk_common_release+0x31e/0x4e0 net/core/sock.c:2782
rawv6_close+0x54/0x80 net/ipv6/raw.c:1214
inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
inet6_release+0x50/0x70 net/ipv6/af_inet6.c:431
sock_release+0x8d/0x1e0 net/socket.c:599
sock_close+0x16/0x20 net/socket.c:1063
__fput+0x332/0x7f0 fs/file_table.c:208
____fput+0x15/0x20 fs/file_table.c:244
task_work_run+0x19b/0x270 kernel/task_work.c:116
exit_task_work ./include/linux/task_work.h:21
do_exit+0x186b/0x2800 kernel/exit.c:839
do_group_exit+0x149/0x420 kernel/exit.c:943
SYSC_exit_group kernel/exit.c:954
SyS_exit_group+0x1d/0x20 kernel/exit.c:952
entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
Allocated by task 4115:
save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
save_stack+0x43/0xd0 mm/kasan/kasan.c:502
set_track mm/kasan/kasan.c:514
kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:544
slab_post_alloc_hook mm/slab.h:432
slab_alloc_node mm/slub.c:2708
slab_alloc mm/slub.c:2716
kmem_cache_alloc+0x1af/0x250 mm/slub.c:2721
sk_prot_alloc+0x65/0x2a0 net/core/sock.c:1334
sk_alloc+0x105/0x1010 net/core/sock.c:1396
inet6_create+0x44d/0x1150 net/ipv6/af_inet6.c:183
__sock_create+0x4f6/0x880 net/socket.c:1199
sock_create net/socket.c:1239
SYSC_socket net/socket.c:1269
SyS_socket+0xf9/0x230 net/socket.c:1249
entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
Memory state around the buggy address:
ffff880062d9ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff880062d9ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff880062da0000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff880062da0080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff880062da0100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 75f01a4c9cc291ff5cb28ca1216adb163b7a20ee ]
When executing conntrack actions on skbuffs with checksum mode
CHECKSUM_COMPLETE, the checksum must be updated to account for
header pushes and pulls. Otherwise we get "hw csum failure"
logs similar to this (ICMP packet received on geneve tunnel
via ixgbe NIC):
[ 405.740065] genev_sys_6081: hw csum failure
[ 405.740106] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G I 4.10.0-rc3+ #1
[ 405.740108] Call Trace:
[ 405.740110] <IRQ>
[ 405.740113] dump_stack+0x63/0x87
[ 405.740116] netdev_rx_csum_fault+0x3a/0x40
[ 405.740118] __skb_checksum_complete+0xcf/0xe0
[ 405.740120] nf_ip_checksum+0xc8/0xf0
[ 405.740124] icmp_error+0x1de/0x351 [nf_conntrack_ipv4]
[ 405.740132] nf_conntrack_in+0xe1/0x550 [nf_conntrack]
[ 405.740137] ? find_bucket.isra.2+0x62/0x70 [openvswitch]
[ 405.740143] __ovs_ct_lookup+0x95/0x980 [openvswitch]
[ 405.740145] ? netif_rx_internal+0x44/0x110
[ 405.740149] ovs_ct_execute+0x147/0x4b0 [openvswitch]
[ 405.740153] do_execute_actions+0x22e/0xa70 [openvswitch]
[ 405.740157] ovs_execute_actions+0x40/0x120 [openvswitch]
[ 405.740161] ovs_dp_process_packet+0x84/0x120 [openvswitch]
[ 405.740166] ovs_vport_receive+0x73/0xd0 [openvswitch]
[ 405.740168] ? udp_rcv+0x1a/0x20
[ 405.740170] ? ip_local_deliver_finish+0x93/0x1e0
[ 405.740172] ? ip_local_deliver+0x6f/0xe0
[ 405.740174] ? ip_rcv_finish+0x3a0/0x3a0
[ 405.740176] ? ip_rcv_finish+0xdb/0x3a0
[ 405.740177] ? ip_rcv+0x2a7/0x400
[ 405.740180] ? __netif_receive_skb_core+0x970/0xa00
[ 405.740185] netdev_frame_hook+0xd3/0x160 [openvswitch]
[ 405.740187] __netif_receive_skb_core+0x1dc/0xa00
[ 405.740194] ? ixgbe_clean_rx_irq+0x46d/0xa20 [ixgbe]
[ 405.740197] __netif_receive_skb+0x18/0x60
[ 405.740199] netif_receive_skb_internal+0x40/0xb0
[ 405.740201] napi_gro_receive+0xcd/0x120
[ 405.740204] gro_cell_poll+0x57/0x80 [geneve]
[ 405.740206] net_rx_action+0x260/0x3c0
[ 405.740209] __do_softirq+0xc9/0x28c
[ 405.740211] irq_exit+0xd9/0xf0
[ 405.740213] do_IRQ+0x51/0xd0
[ 405.740215] common_interrupt+0x93/0x93
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
If nf_ct_frag6_gather() returns an error other than -EINPROGRESS, it
means that we still have a reference to the skb. We should free it
before returning from handle_fragments, as stated in the comment above.
Fixes: daaa7d647f81 ("netfilter: ipv6: avoid nf_iterate recursion")
CC: Florian Westphal <fw@strlen.de>
CC: Pravin B Shelar <pshelar@ovn.org>
CC: Joe Stringer <joe@ovn.org>
Signed-off-by: Daniele Di Proietto <diproiettod@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When an error occurs during conntrack template creation as part of
actions validation, we need to free the template. Previously we've been
using nf_ct_put() to do this, but nf_ct_tmpl_free() is more appropriate.
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ovs_ct_find_existing() issues a warning if an existing conntrack entry
classified as IP_CT_NEW is found, with the premise that this should
not happen. However, a newly confirmed, non-expected conntrack entry
remains IP_CT_NEW as long as no reply direction traffic is seen. This
has resulted into somewhat confusing kernel log messages. This patch
removes this check and warning.
Fixes: 289f2253 ("openvswitch: Find existing conntrack entry after upcall.")
Suggested-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The conntrack label extension is currently variable-sized, e.g. if
only 2 labels are used by iptables rules then the labels->bits[] array
will only contain one element.
We track size of each label storage area in the 'words' member.
But in nftables and openvswitch we always have to ask for worst-case
since we don't know what bit will be used at configuration time.
As most arches are 64bit we need to allocate 24 bytes in this case:
struct nf_conn_labels {
u8 words; /* 0 1 */
/* XXX 7 bytes hole, try to pack */
long unsigned bits[2]; /* 8 24 */
Make bits a fixed size and drop the words member, it simplifies
the code and only increases memory requirements on x86 when
less than 64bit labels are required.
We still only allocate the extension if its needed.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Several cases of overlapping changes, except the packet scheduler
conflicts which deal with the addition of the free list parameter
to qdisc_enqueue().
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Only the first and last netlink message for a particular conntrack are
actually sent. The first message is sent through nf_conntrack_confirm when
the conntrack is committed. The last one is sent when the conntrack is
destroyed on timeout. The other conntrack state change messages are not
advertised.
When the conntrack subsystem is used from netfilter, nf_conntrack_confirm
is called for each packet, from the postrouting hook, which in turn calls
nf_ct_deliver_cached_events to send the state change netlink messages.
This commit fixes the problem by calling nf_ct_deliver_cached_events in the
non-commit case as well.
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
CC: Joe Stringer <joestringer@nicira.com>
CC: Justin Pettit <jpettit@nicira.com>
CC: Andy Zhou <azhou@nicira.com>
CC: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Samuel Gauthier <samuel.gauthier@6wind.com>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Only set conntrack mark or labels when the commit flag is specified.
This makes sure we can not set them before the connection has been
persisted, as in that case the mark and labels would be lost in an
event of an userspace upcall.
OVS userspace already requires the commit flag to accept setting
ct_mark and/or ct_labels. Validate for this in the kernel API.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Set conntrack mark and labels right before committing so that
the initial conntrack NEW event has the mark and labels.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The nf_conntrack_core.c fix in 'net' is not relevant in 'net-next'
because we no longer have a per-netns conntrack hash.
The ip_gre.c conflict as well as the iwlwifi ones were cases of
overlapping changes.
Conflicts:
drivers/net/wireless/intel/iwlwifi/mvm/tx.c
net/ipv4/ip_gre.c
net/netfilter/nf_conntrack_core.c
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When using conntrack helpers from OVS, a common configuration is to
perform a lookup without specifying a helper, then go through a
firewalling policy, only to decide to attach a helper afterwards.
In this case, the initial lookup will cause a ct entry to be attached to
the skb, then the later commit with helper should attach the helper and
confirm the connection. However, the helper attachment has been missing.
If the user has enabled automatic helper attachment, then this issue
will be masked as it will be applied in init_conntrack(). It is also
masked if the action is executed from ovs_packet_cmd_execute() as that
will construct a fresh skb.
This patch fixes the issue by making an explicit call to try to assign
the helper if there is a discrepancy between the action's helper and the
current skb->nfct.
Fixes: cae3a2627520 ("openvswitch: Allow attaching helpers to ct action")
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the protocol is not natively supported, this assigns generic protocol
tracker so we can always assume a valid pointer after these calls.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Joe Stringer <joe@ovn.org>
|
|
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for your net-next
tree, mostly from Florian Westphal to sort out the lack of sufficient
validation in x_tables and connlabel preparation patches to add
nf_tables support. They are:
1) Ensure we don't go over the ruleset blob boundaries in
mark_source_chains().
2) Validate that target jumps land on an existing xt_entry. This extra
sanitization comes with a performance penalty when loading the ruleset.
3) Introduce xt_check_entry_offsets() and use it from {arp,ip,ip6}tables.
4) Get rid of the smallish check_entry() functions in {arp,ip,ip6}tables.
5) Make sure the minimal possible target size in x_tables.
6) Similar to #3, add xt_compat_check_entry_offsets() for compat code.
7) Check that standard target size is valid.
8) More sanitization to ensure that the target_offset field is correct.
9) Add xt_check_entry_match() to validate that matches are well-formed.
10-12) Three patch to reduce the number of parameters in
translate_compat_table() for {arp,ip,ip6}tables by using a container
structure.
13) No need to return value from xt_compat_match_from_user(), so make
it void.
14) Consolidate translate_table() so it can be used by compat code too.
15) Remove obsolete check for compat code, so we keep consistent with
what was already removed in the native layout code (back in 2007).
16) Get rid of target jump validation from mark_source_chains(),
obsoleted by #2.
17) Introduce xt_copy_counters_from_user() to consolidate counter
copying, and use it from {arp,ip,ip6}tables.
18,22) Get rid of unnecessary explicit inlining in ctnetlink for dump
functions.
19) Move nf_connlabel_match() to xt_connlabel.
20) Skip event notification if connlabel did not change.
21) Update of nf_connlabels_get() to make the upcoming nft connlabel
support easier.
23) Remove spinlock to read protocol state field in conntrack.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is the IPv6 counterpart to commit 8282f27449bf ("inet: frag: Always
orphan skbs inside ip_defrag()").
Prior to commit 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free
clone operations"), ipv6 fragments sent to nf_ct_frag6_gather() would be
cloned (implicitly orphaning) prior to queueing for reassembly. As such,
when the IPv6 message is eventually reassembled, the skb->sk for all
fragments would be NULL. After that commit was introduced, rather than
cloning, the original skbs were queued directly without orphaning. The
end result is that all frags except for the first and last may have a
socket attached.
This commit explicitly orphans such skbs during nf_ct_frag6_gather() to
prevent BUG_ON(skb->sk) during a later call to ip6_fragment().
kernel BUG at net/ipv6/ip6_output.c:631!
[...]
Call Trace:
<IRQ>
[<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
[<ffffffffa042c7c0>] ? do_output.isra.28+0x1b0/0x1b0 [openvswitch]
[<ffffffff810bb8a2>] ? __lock_is_held+0x52/0x70
[<ffffffffa042c587>] ovs_fragment+0x1f7/0x280 [openvswitch]
[<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
[<ffffffff817be416>] ? _raw_spin_unlock_irqrestore+0x36/0x50
[<ffffffff81697ea0>] ? dst_discard_out+0x20/0x20
[<ffffffff81697e80>] ? dst_ifdown+0x80/0x80
[<ffffffffa042c703>] do_output.isra.28+0xf3/0x1b0 [openvswitch]
[<ffffffffa042d279>] do_execute_actions+0x709/0x12c0 [openvswitch]
[<ffffffffa04340a4>] ? ovs_flow_stats_update+0x74/0x1e0 [openvswitch]
[<ffffffffa04340d1>] ? ovs_flow_stats_update+0xa1/0x1e0 [openvswitch]
[<ffffffff817be387>] ? _raw_spin_unlock+0x27/0x40
[<ffffffffa042de75>] ovs_execute_actions+0x45/0x120 [openvswitch]
[<ffffffffa0432d65>] ovs_dp_process_packet+0x85/0x150 [openvswitch]
[<ffffffff817be387>] ? _raw_spin_unlock+0x27/0x40
[<ffffffffa042def4>] ovs_execute_actions+0xc4/0x120 [openvswitch]
[<ffffffffa0432d65>] ovs_dp_process_packet+0x85/0x150 [openvswitch]
[<ffffffffa04337f2>] ? key_extract+0x442/0xc10 [openvswitch]
[<ffffffffa043b26d>] ovs_vport_receive+0x5d/0xb0 [openvswitch]
[<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
[<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
[<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0
[<ffffffff817be416>] ? _raw_spin_unlock_irqrestore+0x36/0x50
[<ffffffffa043c11d>] internal_dev_xmit+0x6d/0x150 [openvswitch]
[<ffffffffa043c0b5>] ? internal_dev_xmit+0x5/0x150 [openvswitch]
[<ffffffff8168fb5f>] dev_hard_start_xmit+0x2df/0x660
[<ffffffff8168f5ea>] ? validate_xmit_skb.isra.105.part.106+0x1a/0x2b0
[<ffffffff81690925>] __dev_queue_xmit+0x8f5/0x950
[<ffffffff81690080>] ? __dev_queue_xmit+0x50/0x950
[<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
[<ffffffff81690990>] dev_queue_xmit+0x10/0x20
[<ffffffff8169a418>] neigh_resolve_output+0x178/0x220
[<ffffffff81752759>] ? ip6_finish_output2+0x219/0x7b0
[<ffffffff81752759>] ip6_finish_output2+0x219/0x7b0
[<ffffffff817525a5>] ? ip6_finish_output2+0x65/0x7b0
[<ffffffff816cde2b>] ? ip_idents_reserve+0x6b/0x80
[<ffffffff8175488f>] ? ip6_fragment+0x93f/0xc50
[<ffffffff81754af1>] ip6_fragment+0xba1/0xc50
[<ffffffff81752540>] ? ip6_flush_pending_frames+0x40/0x40
[<ffffffff81754c6b>] ip6_finish_output+0xcb/0x1d0
[<ffffffff81754dcf>] ip6_output+0x5f/0x1a0
[<ffffffff81754ba0>] ? ip6_fragment+0xc50/0xc50
[<ffffffff81797fbd>] ip6_local_out+0x3d/0x80
[<ffffffff817554df>] ip6_send_skb+0x2f/0xc0
[<ffffffff817555bd>] ip6_push_pending_frames+0x4d/0x50
[<ffffffff817796cc>] icmpv6_push_pending_frames+0xac/0xe0
[<ffffffff8177a4be>] icmpv6_echo_reply+0x42e/0x500
[<ffffffff8177acbf>] icmpv6_rcv+0x4cf/0x580
[<ffffffff81755ac7>] ip6_input_finish+0x1a7/0x690
[<ffffffff81755925>] ? ip6_input_finish+0x5/0x690
[<ffffffff817567a0>] ip6_input+0x30/0xa0
[<ffffffff81755920>] ? ip6_rcv_finish+0x1a0/0x1a0
[<ffffffff817557ce>] ip6_rcv_finish+0x4e/0x1a0
[<ffffffff8175640f>] ipv6_rcv+0x45f/0x7c0
[<ffffffff81755fe6>] ? ipv6_rcv+0x36/0x7c0
[<ffffffff81755780>] ? ip6_make_skb+0x1c0/0x1c0
[<ffffffff8168b649>] __netif_receive_skb_core+0x229/0xb80
[<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
[<ffffffff8168c07f>] ? process_backlog+0x6f/0x230
[<ffffffff8168bfb6>] __netif_receive_skb+0x16/0x70
[<ffffffff8168c088>] process_backlog+0x78/0x230
[<ffffffff8168c0ed>] ? process_backlog+0xdd/0x230
[<ffffffff8168db43>] net_rx_action+0x203/0x480
[<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0
[<ffffffff817c156e>] __do_softirq+0xde/0x49f
[<ffffffff81752768>] ? ip6_finish_output2+0x228/0x7b0
[<ffffffff817c070c>] do_softirq_own_stack+0x1c/0x30
<EOI>
[<ffffffff8106f88b>] do_softirq.part.18+0x3b/0x40
[<ffffffff8106f946>] __local_bh_enable_ip+0xb6/0xc0
[<ffffffff81752791>] ip6_finish_output2+0x251/0x7b0
[<ffffffff81754af1>] ? ip6_fragment+0xba1/0xc50
[<ffffffff816cde2b>] ? ip_idents_reserve+0x6b/0x80
[<ffffffff8175488f>] ? ip6_fragment+0x93f/0xc50
[<ffffffff81754af1>] ip6_fragment+0xba1/0xc50
[<ffffffff81752540>] ? ip6_flush_pending_frames+0x40/0x40
[<ffffffff81754c6b>] ip6_finish_output+0xcb/0x1d0
[<ffffffff81754dcf>] ip6_output+0x5f/0x1a0
[<ffffffff81754ba0>] ? ip6_fragment+0xc50/0xc50
[<ffffffff81797fbd>] ip6_local_out+0x3d/0x80
[<ffffffff817554df>] ip6_send_skb+0x2f/0xc0
[<ffffffff817555bd>] ip6_push_pending_frames+0x4d/0x50
[<ffffffff81778558>] rawv6_sendmsg+0xa28/0xe30
[<ffffffff81719097>] ? inet_sendmsg+0xc7/0x1d0
[<ffffffff817190d6>] inet_sendmsg+0x106/0x1d0
[<ffffffff81718fd5>] ? inet_sendmsg+0x5/0x1d0
[<ffffffff8166d078>] sock_sendmsg+0x38/0x50
[<ffffffff8166d4d6>] SYSC_sendto+0xf6/0x170
[<ffffffff8100201b>] ? trace_hardirqs_on_thunk+0x1b/0x1d
[<ffffffff8166e38e>] SyS_sendto+0xe/0x10
[<ffffffff817bebe5>] entry_SYSCALL_64_fastpath+0x18/0xa8
Code: 06 48 83 3f 00 75 26 48 8b 87 d8 00 00 00 2b 87 d0 00 00 00 48 39 d0 72 14 8b 87 e4 00 00 00 83 f8 01 75 09 48 83 7f 18 00 74 9a <0f> 0b 41 8b 86 cc 00 00 00 49 8#
RIP [<ffffffff8175468a>] ip6_fragment+0x73a/0xc50
RSP <ffff880072803120>
Fixes: 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free clone
operations")
Reported-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
nf_connlabel_set() takes the bit number that we would like to set.
nf_connlabels_get() however took the number of bits that we want to
support.
So e.g. nf_connlabels_get(32) support bits 0 to 31, but not 32.
This changes nf_connlabels_get() to take the highest bit that we want
to set.
Callers then don't have to cope with a potential integer wrap
when using nf_connlabels_get(bit + 1) anymore.
Current callers are fine, this change is only to make folloup
nft ct label set support simpler.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for you net tree,
they are:
1) There was a race condition between parallel save/swap and delete,
which resulted a kernel crash due to the increase ref for save, swap,
wrong ref decrease operations. Reported and fixed by Vishwanath Pai.
2) OVS should call into CT NAT for packets of new expected connections only
when the conntrack state is persisted with the 'commit' option to the
OVS CT action. From Jarno Rajahalme.
3) Resolve kconfig dependencies with new OVS NAT support. From Arnd Bergmann.
4) Early validation of entry->target_offset to make sure it doesn't take us
out from the blob, from Florian Westphal.
5) Again early validation of entry->next_offset to make sure it doesn't take
out from the blob, also from Florian.
6) Check that entry->target_offset is always of of sizeof(struct xt_entry)
for unconditional entries, when checking both from check_underflow()
and when checking for loops in mark_source_chains(), again from
Florian.
7) Fix inconsistent behaviour in nfnetlink_queue when
NFQA_CFG_F_FAIL_OPEN is set and netlink_unicast() fails due to buffer
overrun, we have to reinject the packet as the user expects.
8) Enforce nul-terminated table names from getsockopt GET_ENTRIES
requests.
9) Don't assume skb->sk is set from nft_bridge_reject and synproxy,
this fixes a recent update of the code to namespaceify
ip_default_ttl, patch from Liping Zhang.
This batch comes with four patches to validate x_tables blobs coming
from userspace. CONFIG_USERNS exposes the x_tables interface to
unpriviledged users and to be honest this interface never received the
attention for this move away from the CAP_NET_ADMIN domain. Florian is
working on another round with more patches with more sanity checks, so
expect a bit more Netfilter fixes in this development cycle than usual.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The openvswitch code has gained support for calling into the
nf-nat-ipv4/ipv6 modules, however those can be loadable modules
in a configuration in which openvswitch is built-in, leading
to link errors:
net/built-in.o: In function `__ovs_ct_lookup':
:(.text+0x2cc2c8): undefined reference to `nf_nat_icmp_reply_translation'
:(.text+0x2cc66c): undefined reference to `nf_nat_icmpv6_reply_translation'
The dependency on (!NF_NAT || NF_NAT) prevents similar issues,
but NF_NAT is set to 'y' if any of the symbols selecting
it are built-in, but the link error happens when any of them
are modular.
A second issue is that even if CONFIG_NF_NAT_IPV6 is built-in,
CONFIG_NF_NAT_IPV4 might be completely disabled. This is unlikely
to be useful in practice, but the driver currently only handles
IPv6 being optional.
This patch improves the Kconfig dependency so that openvswitch
cannot be built-in if either of the two other symbols are set
to 'm', and it replaces the incorrect #ifdef in ovs_ct_nat_execute()
with two "if (IS_ENABLED())" checks that should catch all corner
cases also make the code more readable.
The same #ifdef exists ovs_ct_nat_to_attr(), where it does not
cause a link error, but for consistency I'm changing it the same
way.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 05752523e565 ("openvswitch: Interface with NAT.")
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
OVS should call into CT NAT for packets of new expected connections only
when the conntrack state is persisted with the 'commit' option to the
OVS CT action. The test for this condition is doubly wrong, as the CT
status field is ANDed with the bit number (IPS_EXPECTED_BIT) rather
than the mask (IPS_EXPECTED), and due to the wrong assumption that the
expected bit would apply only for the first (i.e., 'new') packet of a
connection, while in fact the expected bit remains on for the lifetime of
an expected connection. The 'ctinfo' value IP_CT_RELATED derived from
the ct status can be used instead, as it is only ever applicable to
the 'new' packets of the expected connection.
Fixes: 05752523e565 ('openvswitch: Interface with NAT.')
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
For the input parameter count, it's better to use the size
of destination buffer size, as nla_memcpy would take into
account the length of the source netlink attribute when
a data is copied from an attribute.
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Extend OVS conntrack interface to cover NAT. New nested
OVS_CT_ATTR_NAT attribute may be used to include NAT with a CT action.
A bare OVS_CT_ATTR_NAT only mangles existing and expected connections.
If OVS_NAT_ATTR_SRC or OVS_NAT_ATTR_DST is included within the nested
attributes, new (non-committed/non-confirmed) connections are mangled
according to the rest of the nested attributes.
The corresponding OVS userspace patch series includes test cases (in
tests/system-traffic.at) that also serve as example uses.
This work extends on a branch by Thomas Graf at
https://github.com/tgraf/ovs/tree/nat.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
There is no need to help connections that are not confirmed, so we can
delay helping new connections to the time when they are confirmed.
This change is needed for NAT support, and having this as a separate
patch will make the following NAT patch a bit easier to review.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Repeat the nf_conntrack_in() call when it returns NF_REPEAT. This
avoids dropping a SYN packet re-opening an existing TCP connection.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Add a new function ovs_ct_find_existing() to find an existing
conntrack entry for which this packet was already applied to. This is
only to be called when there is evidence that the packet was already
tracked and committed, but we lost the ct reference due to an
userspace upcall.
ovs_ct_find_existing() is called from skb_nfct_cached(), which can now
hide the fact that the ct reference may have been lost due to an
upcall. This allows ovs_ct_commit() to be simplified.
This patch is needed by later "openvswitch: Interface with NAT" patch,
as we need to be able to pass the packet through NAT using the
original ct reference also after the reference is lost after an
upcall.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Only a successful nf_conntrack_in() call can effect a connection state
change, so it suffices to update the key only after the
nf_conntrack_in() returns.
This change is needed for the later NAT patches.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This makes the code easier to understand and the following patches
more focused.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Remove the definition of IP_CT_NEW_REPLY from the kernel as it does
not make sense. This allows the definition of IP_CT_NUMBER to be
simplified as well.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
Commit 5b48bb8506c5 ("openvswitch: Fix helper reference leak") fixed a
reference leak on helper objects, but inadvertently introduced a leak on
the ct template.
Previously, ct_info.ct->general.use was initialized to 0 by
nf_ct_tmpl_alloc() and only incremented when ovs_ct_copy_action()
returned successful. If an error occurred while adding the helper or
adding the action to the actions buffer, the __ovs_ct_free_action()
cleanup would use nf_ct_put() to free the entry; However, this relies on
atomic_dec_and_test(ct_info.ct->general.use). This reference must be
incremented first, or nf_ct_put() will never free it.
Fix the issue by acquiring a reference to the template immediately after
allocation.
Fixes: cae3a2627520 ("openvswitch: Allow attaching helpers to ct action")
Fixes: 5b48bb8506c5 ("openvswitch: Fix helper reference leak")
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains the first batch of Netfilter updates for
the upcoming 4.5 kernel. This batch contains userspace netfilter header
compilation fixes, support for packet mangling in nf_tables, the new
tracing infrastructure for nf_tables and cgroup2 support for iptables.
More specifically, they are:
1) Two patches to include dependencies in our netfilter userspace
headers to resolve compilation problems, from Mikko Rapeli.
2) Four comestic cleanup patches for the ebtables codebase, from Ian Morris.
3) Remove duplicate include in the netfilter reject infrastructure,
from Stephen Hemminger.
4) Two patches to simplify the netfilter defragmentation code for IPv6,
patch from Florian Westphal.
5) Fix root ownership of /proc/net netfilter for unpriviledged net
namespaces, from Philip Whineray.
6) Get rid of unused fields in struct nft_pktinfo, from Florian Westphal.
7) Add mangling support to our nf_tables payload expression, from
Patrick McHardy.
8) Introduce a new netlink-based tracing infrastructure for nf_tables,
from Florian Westphal.
9) Change setter functions in nfnetlink_log to be void, from
Rami Rosen.
10) Add netns support to the cttimeout infrastructure.
11) Add cgroup2 support to iptables, from Tejun Heo.
12) Introduce nfnl_dereference_protected() in nfnetlink, from Florian.
13) Add support for mangling pkttype in the nf_tables meta expression,
also from Florian.
BTW, I need that you pull net into net-next, I have another batch that
requires changes that I don't yet see in net.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If userspace executes ct(zone=1), and the connection tracker determines
that the packet is invalid, then the ct_zone flow key field is populated
with the default zone rather than the zone that was specified. Even
though connection tracking failed, this field should be updated with the
value that the action specified. Fix the issue.
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the actions (re)allocation fails, or the actions list is larger than the
maximum size, and the conntrack action is the last action when these
problems are hit, then references to helper modules may be leaked. Fix
the issue.
Fixes: cae3a2627520 ("openvswitch: Allow attaching helpers to ct action")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The previous patch changed nf_ct_frag6_gather() to morph reassembled skb
with the previous one.
This means that the return value is always NULL or the skb argument.
So change it to an err value.
Instead of invoking NF_HOOK recursively with threshold to skip already-called hooks
we can now just return NF_ACCEPT to move on to the next hook except for
-EINPROGRESS (which means skb has been queued for reassembly), in which case we
return NF_STOLEN.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
commit 6aafeef03b9d9ecf
("netfilter: push reasm skb through instead of original frag skbs")
changed ipv6 defrag to not use the original skbs anymore.
So rather than keeping the original skbs around just to discard them
afterwards just use the original skbs directly for the fraglist of
the newly assembled skb and remove the extra clone/free operations.
The skb that completes the fragment queue is morphed into a the
reassembled one instead, just like ipv4 defrag.
openvswitch doesn't need any additional skb_morph magic anymore to deal
with this situation so just remove that.
A followup patch can then also remove the NF_HOOK (re)invocation in
the ipv6 netfilter defrag hook.
Cc: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
|
|
nf_ct_frag6_gather() makes a clone of each skb passed to it, and if the
reassembly is successful, expects the caller to free all of the original
skbs using nf_ct_frag6_consume_orig(). This call was previously missing,
meaning that the original fragments were never freed (with the exception
of the last fragment to arrive).
Fix this by ensuring that all original fragments except for the last
fragment are freed via nf_ct_frag6_consume_orig(). The last fragment
will be morphed into the head, so it must not be freed yet. Furthermore,
retain the ->next pointer for the head after skb_morph().
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If ip_defrag() returns an error other than -EINPROGRESS, then the skb is
freed. When handle_fragments() passes this back up to
do_execute_actions(), it will be freed again. Prevent this double free
by never freeing the skb in do_execute_actions() for errors returned by
ovs_ct_execute. Always free it in ovs_ct_execute() error paths instead.
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Conflicts:
net/ipv6/xfrm6_output.c
net/openvswitch/flow_netlink.c
net/openvswitch/vport-gre.c
net/openvswitch/vport-vxlan.c
net/openvswitch/vport.c
net/openvswitch/vport.h
The openvswitch conflicts were overlapping changes. One was
the egress tunnel info fix in 'net' and the other was the
vport ->send() op simplification in 'net-next'.
The xfrm6_output.c conflicts was also a simplification
overlapping a bug fix.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If userspace provides a ct action with no nested mark or label, then the
storage for these fields is zeroed. Later when actions are requested,
such zeroed fields are serialized even though userspace didn't
originally specify them. Fix the behaviour by ensuring that no action is
serialized in this case, and reject actions where userspace attempts to
set these fields with mask=0. This should make netlink marshalling
consistent across deserialization/reserialization.
Reported-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
New, related connections are marked as such as part of ovs_ct_lookup(),
but they are not marked as "new" if the commit flag is used. Make this
consistent by setting the "new" flag whenever !nf_ct_is_confirmed(ct).
Reported-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Conflicts:
drivers/net/usb/asix_common.c
net/ipv4/inet_connection_sock.c
net/switchdev/switchdev.c
In the inet_connection_sock.c case the request socket hashing scheme
is completely different in net-next.
The other two conflicts were overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The function nf_ct_frag6_gather is called on both the input and the
output paths of the networking stack. In particular ipv6_defrag which
calls nf_ct_frag6_gather is called from both the the PRE_ROUTING chain
on input and the LOCAL_OUT chain on output.
The addition of a net parameter makes it explicit which network
namespace the packets are being reassembled in, and removes the need
for nf_ct_frag6_gather to guess.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The function ip_defrag is called on both the input and the output
paths of the networking stack. In particular conntrack when it is
tracking outbound packets from the local machine calls ip_defrag.
So add a struct net parameter and stop making ip_defrag guess which
network namespace it needs to defragment packets in.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Previously, the CT_ATTR_FLAGS attribute, when nested under the
OVS_ACTION_ATTR_CT, encoded a 32-bit bitmask of flags that modify the
semantics of the ct action. It's more extensible to just represent each
flag as a nested attribute, and this requires no additional error
checking to reject flags that aren't currently supported.
Suggested-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The ct_state field was initially added as an 8-bit field, however six of
the bits are already being used and use cases are already starting to
appear that may push the limits of this field. This patch extends the
field to 32 bits while retaining the internal representation of 8 bits.
This should cover forward compatibility of the ABI for the foreseeable
future.
This patch also reorders the OVS_CS_F_* bits to be sequential.
Suggested-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Conntrack LABELS (plural) are exposed by conntrack; rename the OVS name
for these to be consistent with conntrack.
Fixes: c2ac667 "openvswitch: Allow matching on conntrack label"
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Conflicts:
net/ipv4/arp.c
The net/ipv4/arp.c conflict was one commit adding a new
local variable while another commit was deleting one.
Signed-off-by: David S. Miller <davem@davemloft.net>
|