Age | Commit message (Collapse) | Author | Files | Lines |
|
commit 7880287981b60a6808f39f297bb66936e8bdf57a upstream.
KMSAN reports use of uninitialized memory in the case when |alen| is
smaller than sizeof(struct sockaddr_nl), and therefore |nladdr| isn't
fully copied from the userspace.
Signed-off-by: Alexander Potapenko <glider@google.com>
Fixes: 1da177e4c3f41524 ("Linux-2.6.12-rc2")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 6e5d58fdc9bedd0255a8781b258f10bbdc63e975 upstream.
When errors are enqueued to the error queue via sock_queue_err_skb()
function, it is possible that the waiting application is not notified.
Calling 'sk->sk_data_ready()' would not notify applications that
selected only POLLERR events in poll() (for example).
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Randy E. Witt <randy.e.witt@intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: sk_data_ready() operation takes a length parameter.
Delete the local variable we used for that.]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit c8d70a700a5b486bfa8e5a7d33d805389f6e59f9 upstream.
ebt_among is special, it has a dynamic match size and is exempt
from the central size checks.
commit c4585a2823edf ("bridge: ebt_among: add missing match size checks")
added validation for pool size, but missed fact that the macros
ebt_among_wh_src/dst can already return out-of-bound result because
they do not check value of wh_src/dst_ofs (an offset) vs. the size
of the match that userspace gave to us.
v2:
check that offset has correct alignment.
Paolo Abeni points out that we should also check that src/dst
wormhash arrays do not overlap, and src + length lines up with
start of dst (or vice versa).
v3: compact wormhash_sizes_valid() part
NB: Fixes tag is intentionally wrong, this bug exists from day
one when match was added for 2.6 kernel. Tag is there so stable
maintainers will notice this one too.
Tested with same rules from the earlier patch.
Fixes: c4585a2823edf ("bridge: ebt_among: add missing match size checks")
Reported-by: <syzbot+bdabab6f1983a03fc009@syzkaller.appspotmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit c4585a2823edf4d1326da44d1524ecbfda26bb37 upstream.
ebt_among is special, it has a dynamic match size and is exempt
from the central size checks.
Therefore it must check that the size of the match structure
provided from userspace is sane by making sure em->match_size
is at least the minimum size of the expected structure.
The module has such a check, but its only done after accessing
a structure that might be out of bounds.
tested with: ebtables -A INPUT ... \
--among-dst fe:fe:fe:fe:fe:fe
--among-dst fe:fe:fe:fe:fe:fe --among-src fe:fe:fe:fe:ff:f,fe:fe:fe:fe:fe:fb,fe:fe:fe:fe:fc:fd,fe:fe:fe:fe:fe:fd,fe:fe:fe:fe:fe:fe
--among-src fe:fe:fe:fe:ff:f,fe:fe:fe:fe:fe:fa,fe:fe:fe:fe:fe:fd,fe:fe:fe:fe:fe:fe,fe:fe:fe:fe:fe:fe
Reported-by: <syzbot+fe0b19af568972814355@syzkaller.appspotmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 17cfe79a65f98abe535261856c5aef14f306dff7 upstream.
syzkaller found an issue caused by lack of sufficient checks
in l2tp_tunnel_create()
RAW sockets can not be considered as UDP ones for instance.
In another patch, we shall replace all pr_err() by less intrusive
pr_debug() so that syzkaller can find other bugs faster.
Acked-by: Guillaume Nault <g.nault@alphalink.fr>
Acked-by: James Chapman <jchapman@katalix.com>
==================================================================
BUG: KASAN: slab-out-of-bounds in setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69
dst_release: dst:00000000d53d0d0f refcnt:-1
Write of size 1 at addr ffff8801d013b798 by task syz-executor3/6242
CPU: 1 PID: 6242 Comm: syz-executor3 Not tainted 4.16.0-rc2+ #253
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
print_address_description+0x73/0x250 mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report+0x23b/0x360 mm/kasan/report.c:412
__asan_report_store1_noabort+0x17/0x20 mm/kasan/report.c:435
setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69
l2tp_tunnel_create+0x1354/0x17f0 net/l2tp/l2tp_core.c:1596
pppol2tp_connect+0x14b1/0x1dd0 net/l2tp/l2tp_ppp.c:707
SYSC_connect+0x213/0x4a0 net/socket.c:1640
SyS_connect+0x24/0x30 net/socket.c:1621
do_syscall_64+0x280/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit d02ba2a6110c530a32926af8ad441111774d2893 upstream.
pppol2tp_release uses call_rcu to put the final ref on its socket. But
the session object doesn't hold a ref on the session socket so may be
freed while the pppol2tp_put_sk RCU callback is scheduled. Fix this by
having the session hold a ref on its socket until the session is
destroyed. It is this ref that is dropped via call_rcu.
Sessions are also deleted via l2tp_tunnel_closeall. This must now also put
the final ref via call_rcu. So move the call_rcu call site into
pppol2tp_session_close so that this happens in both destroy paths. A
common destroy path should really be implemented, perhaps with
l2tp_tunnel_closeall calling l2tp_session_delete like pppol2tp_release
does, but this will be looked at later.
ODEBUG: activate active (active state 1) object type: rcu_head hint: (null)
WARNING: CPU: 3 PID: 13407 at lib/debugobjects.c:291 debug_print_object+0x166/0x220
Modules linked in:
CPU: 3 PID: 13407 Comm: syzbot_19c09769 Not tainted 4.16.0-rc2+ #38
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
RIP: 0010:debug_print_object+0x166/0x220
RSP: 0018:ffff880013647a00 EFLAGS: 00010082
RAX: dffffc0000000008 RBX: 0000000000000003 RCX: ffffffff814d3333
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88001a59f6d0
RBP: ffff880013647a40 R08: 0000000000000000 R09: 0000000000000001
R10: ffff8800136479a8 R11: 0000000000000000 R12: 0000000000000001
R13: ffffffff86161420 R14: ffffffff85648b60 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88001a580000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020e77000 CR3: 0000000006022000 CR4: 00000000000006e0
Call Trace:
debug_object_activate+0x38b/0x530
? debug_object_assert_init+0x3b0/0x3b0
? __mutex_unlock_slowpath+0x85/0x8b0
? pppol2tp_session_destruct+0x110/0x110
__call_rcu.constprop.66+0x39/0x890
? __call_rcu.constprop.66+0x39/0x890
call_rcu_sched+0x17/0x20
pppol2tp_release+0x2c7/0x440
? fcntl_setlk+0xca0/0xca0
? sock_alloc_file+0x340/0x340
sock_release+0x92/0x1e0
sock_close+0x1b/0x20
__fput+0x296/0x6e0
____fput+0x1a/0x20
task_work_run+0x127/0x1a0
do_exit+0x7f9/0x2ce0
? SYSC_connect+0x212/0x310
? mm_update_next_owner+0x690/0x690
? up_read+0x1f/0x40
? __do_page_fault+0x3c8/0xca0
do_group_exit+0x10d/0x330
? do_group_exit+0x330/0x330
SyS_exit_group+0x22/0x30
do_syscall_64+0x1e0/0x730
? trace_hardirqs_off_thunk+0x1a/0x1c
entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x7f362e471259
RSP: 002b:00007ffe389abe08 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f362e471259
RDX: 00007f362e471259 RSI: 000000000000002e RDI: 0000000000000000
RBP: 00007ffe389abe30 R08: 0000000000000000 R09: 00007f362e944270
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000400b60
R13: 00007ffe389abf50 R14: 0000000000000000 R15: 0000000000000000
Code: 8d 3c dd a0 8f 64 85 48 89 fa 48 c1 ea 03 80 3c 02 00 75 7b 48 8b 14 dd a0 8f 64 85 4c 89 f6 48 c7 c7 20 85 64 85 e
8 2a 55 14 ff <0f> 0b 83 05 ad 2a 68 04 01 48 83 c4 18 5b 41 5c 41 5d 41 5e 41
Fixes: ee40fb2e1eb5b ("l2tp: protect sock pointer of struct pppol2tp_session with RCU")
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 225eb26489d05c679a4c4197ffcb81c81e9dcaf4 upstream.
Previously, if a ppp session was closed, we called inet_shutdown to mark
the socket as unconnected such that userspace would get errors and
then close the socket. This could race with userspace closing the
socket. Instead, leave userspace to close the socket in its own time
(our session will be detached anyway).
BUG: KASAN: use-after-free in inet_shutdown+0x5d/0x1c0
Read of size 4 at addr ffff880010ea3ac0 by task syzbot_347bd5ac/8296
CPU: 3 PID: 8296 Comm: syzbot_347bd5ac Not tainted 4.16.0-rc1+ #91
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Call Trace:
dump_stack+0x101/0x157
? inet_shutdown+0x5d/0x1c0
print_address_description+0x78/0x260
? inet_shutdown+0x5d/0x1c0
kasan_report+0x240/0x360
__asan_load4+0x78/0x80
inet_shutdown+0x5d/0x1c0
? pppol2tp_show+0x80/0x80
pppol2tp_session_close+0x68/0xb0
l2tp_tunnel_closeall+0x199/0x210
? udp_v6_flush_pending_frames+0x90/0x90
l2tp_udp_encap_destroy+0x6b/0xc0
? l2tp_tunnel_del_work+0x2e0/0x2e0
udpv6_destroy_sock+0x8c/0x90
sk_common_release+0x47/0x190
udp_lib_close+0x15/0x20
inet_release+0x85/0xd0
inet6_release+0x43/0x60
sock_release+0x53/0x100
? sock_alloc_file+0x260/0x260
sock_close+0x1b/0x20
__fput+0x19f/0x380
____fput+0x1a/0x20
task_work_run+0xd2/0x110
exit_to_usermode_loop+0x18d/0x190
do_syscall_64+0x389/0x3b0
entry_SYSCALL_64_after_hwframe+0x26/0x9b
RIP: 0033:0x7fe240a45259
RSP: 002b:00007fe241132df8 EFLAGS: 00000297 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fe240a45259
RDX: 00007fe240a45259 RSI: 0000000000000000 RDI: 00000000000000a5
RBP: 00007fe241132e20 R08: 00007fe241133700 R09: 0000000000000000
R10: 00007fe241133700 R11: 0000000000000297 R12: 0000000000000000
R13: 00007ffc49aff84f R14: 0000000000000000 R15: 00007fe241141040
Allocated by task 8331:
save_stack+0x43/0xd0
kasan_kmalloc+0xad/0xe0
kasan_slab_alloc+0x12/0x20
kmem_cache_alloc+0x144/0x3e0
sock_alloc_inode+0x22/0x130
alloc_inode+0x3d/0xf0
new_inode_pseudo+0x1c/0x90
sock_alloc+0x30/0x110
__sock_create+0xaa/0x4c0
SyS_socket+0xbe/0x130
do_syscall_64+0x128/0x3b0
entry_SYSCALL_64_after_hwframe+0x26/0x9b
Freed by task 8314:
save_stack+0x43/0xd0
__kasan_slab_free+0x11a/0x170
kasan_slab_free+0xe/0x10
kmem_cache_free+0x88/0x2b0
sock_destroy_inode+0x49/0x50
destroy_inode+0x77/0xb0
evict+0x285/0x340
iput+0x429/0x530
dentry_unlink_inode+0x28c/0x2c0
__dentry_kill+0x1e3/0x2f0
dput.part.21+0x500/0x560
dput+0x24/0x30
__fput+0x2aa/0x380
____fput+0x1a/0x20
task_work_run+0xd2/0x110
exit_to_usermode_loop+0x18d/0x190
do_syscall_64+0x389/0x3b0
entry_SYSCALL_64_after_hwframe+0x26/0x9b
Fixes: fd558d186df2c ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: deleted code is formatted a bit differently]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit abd6360591d3f8259f41c34e31ac4826dfe621b8 upstream.
eth_type_trans() internally calls skb_pull(), which does not adjust the
skb checksum; skb_postpull_rcsum() is necessary to avoid log spam of the
form "bat0: hw csum failure" when packets with CHECKSUM_COMPLETE are
received.
Note that in usual setups, packets don't reach batman-adv with
CHECKSUM_COMPLETE (I assume NICs bail out of checksumming when they see
batadv's ethtype?), which is why the log messages do not occur on every
system using batman-adv. I could reproduce this issue by stacking
batman-adv on top of a VXLAN interface.
Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol")
Tested-by: Maximilian Wilhelm <max@sdn.clinic>
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit cfc2c740533368b96e2be5e0a4e8c3cace7d9814 upstream.
We had one report from syzkaller [1]
First issue is that INIT_WORK() should be done before mod_timer()
or we risk timer being fired too soon, even with a 1 second timer.
Second issue is that we need to reject too big info->timeout
to avoid overflows in msecs_to_jiffies(info->timeout * 1000), or
risk looping, if result after overflow is 0.
[1]
WARNING: CPU: 1 PID: 5129 at kernel/workqueue.c:1444 __queue_work+0xdf4/0x1230 kernel/workqueue.c:1444
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 5129 Comm: syzkaller159866 Not tainted 4.16.0-rc1+ #230
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
panic+0x1e4/0x41c kernel/panic.c:183
__warn+0x1dc/0x200 kernel/panic.c:547
report_bug+0x211/0x2d0 lib/bug.c:184
fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
fixup_bug arch/x86/kernel/traps.c:247 [inline]
do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:988
RIP: 0010:__queue_work+0xdf4/0x1230 kernel/workqueue.c:1444
RSP: 0018:ffff8801db507538 EFLAGS: 00010006
RAX: ffff8801aeb46080 RBX: ffff8801db530200 RCX: ffffffff81481404
RDX: 0000000000000100 RSI: ffffffff86b42640 RDI: 0000000000000082
RBP: ffff8801db507758 R08: 1ffff1003b6a0de5 R09: 000000000000000c
R10: ffff8801db5073f0 R11: 0000000000000020 R12: 1ffff1003b6a0eb6
R13: ffff8801b1067ae0 R14: 00000000000001f8 R15: dffffc0000000000
queue_work_on+0x16a/0x1c0 kernel/workqueue.c:1488
queue_work include/linux/workqueue.h:488 [inline]
schedule_work include/linux/workqueue.h:546 [inline]
idletimer_tg_expired+0x44/0x60 net/netfilter/xt_IDLETIMER.c:116
call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
expire_timers kernel/time/timer.c:1363 [inline]
__run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
__do_softirq+0x2d7/0xb85 kernel/softirq.c:285
invoke_softirq kernel/softirq.c:365 [inline]
irq_exit+0x1cc/0x200 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:541 [inline]
smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:829
</IRQ>
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:777 [inline]
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x5e/0xba kernel/locking/spinlock.c:184
RSP: 0018:ffff8801c20173c8 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff12
RAX: dffffc0000000000 RBX: 0000000000000282 RCX: 0000000000000006
RDX: 1ffffffff0d592cd RSI: 1ffff10035d68d23 RDI: 0000000000000282
RBP: ffff8801c20173d8 R08: 1ffff10038402e47 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8820e5c8
R13: ffff8801b1067ad8 R14: ffff8801aea7c268 R15: ffff8801aea7c278
__debug_object_init+0x235/0x1040 lib/debugobjects.c:378
debug_object_init+0x17/0x20 lib/debugobjects.c:391
__init_work+0x2b/0x60 kernel/workqueue.c:506
idletimer_tg_create net/netfilter/xt_IDLETIMER.c:152 [inline]
idletimer_tg_checkentry+0x691/0xb00 net/netfilter/xt_IDLETIMER.c:213
xt_check_target+0x22c/0x7d0 net/netfilter/x_tables.c:850
check_target net/ipv6/netfilter/ip6_tables.c:533 [inline]
find_check_entry.isra.7+0x935/0xcf0 net/ipv6/netfilter/ip6_tables.c:575
translate_table+0xf52/0x1690 net/ipv6/netfilter/ip6_tables.c:744
do_replace net/ipv6/netfilter/ip6_tables.c:1160 [inline]
do_ip6t_set_ctl+0x370/0x5f0 net/ipv6/netfilter/ip6_tables.c:1686
nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115
ipv6_setsockopt+0x10b/0x130 net/ipv6/ipv6_sockglue.c:927
udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422
sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2976
SYSC_setsockopt net/socket.c:1850 [inline]
SyS_setsockopt+0x189/0x360 net/socket.c:1829
do_syscall_64+0x282/0x940 arch/x86/entry/common.c:287
Fixes: 0902b469bd25 ("netfilter: xtables: idletimer target implementation")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit dfec091439bb2acf763497cfc58f2bdfc67c56b7 upstream.
After commit 3f34cfae1238 ("netfilter: on sockopt() acquire sock lock
only in the required scope"), the caller of nf_{get/set}sockopt() must
not hold any lock, but, in such changeset, I forgot to cope with DECnet.
This commit addresses the issue moving the nf call outside the lock,
in the dn_{get,set}sockopt() with the same schema currently used by
ipv4 and ipv6. Also moves the unhandled sockopts of the end of the main
switch statements, to improve code readability.
Reported-by: Petr Vandrovec <petr@vandrovec.name>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=198791#c2
Fixes: 3f34cfae1238 ("netfilter: on sockopt() acquire sock lock only in the required scope")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit db57ccf0f2f4624b4c4758379f8165277504fbd7 upstream.
syzbot reported a division by 0 bug in the netfilter nat code:
divide error: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 4168 Comm: syzkaller034710 Not tainted 4.16.0-rc1+ #309
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
RIP: 0010:nf_nat_l4proto_unique_tuple+0x291/0x530
net/netfilter/nf_nat_proto_common.c:88
RSP: 0018:ffff8801b2466778 EFLAGS: 00010246
RAX: 000000000000f153 RBX: ffff8801b2466dd8 RCX: ffff8801b2466c7c
RDX: 0000000000000000 RSI: ffff8801b2466c58 RDI: ffff8801db5293ac
RBP: ffff8801b24667d8 R08: ffff8801b8ba6dc0 R09: ffffffff88af5900
R10: ffff8801b24666f0 R11: 0000000000000000 R12: 000000002990f153
R13: 0000000000000001 R14: 0000000000000000 R15: ffff8801b2466c7c
FS: 00000000017e3880(0000) GS:ffff8801db500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000208fdfe4 CR3: 00000001b5340002 CR4: 00000000001606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
dccp_unique_tuple+0x40/0x50 net/netfilter/nf_nat_proto_dccp.c:30
get_unique_tuple+0xc28/0x1c10 net/netfilter/nf_nat_core.c:362
nf_nat_setup_info+0x1c2/0xe00 net/netfilter/nf_nat_core.c:406
nf_nat_redirect_ipv6+0x306/0x730 net/netfilter/nf_nat_redirect.c:124
redirect_tg6+0x7f/0xb0 net/netfilter/xt_REDIRECT.c:34
ip6t_do_table+0xc2a/0x1a30 net/ipv6/netfilter/ip6_tables.c:365
ip6table_nat_do_chain+0x65/0x80 net/ipv6/netfilter/ip6table_nat.c:41
nf_nat_ipv6_fn+0x594/0xa80 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c:302
nf_nat_ipv6_local_fn+0x33/0x5d0
net/ipv6/netfilter/nf_nat_l3proto_ipv6.c:407
ip6table_nat_local_fn+0x2c/0x40 net/ipv6/netfilter/ip6table_nat.c:69
nf_hook_entry_hookfn include/linux/netfilter.h:120 [inline]
nf_hook_slow+0xba/0x1a0 net/netfilter/core.c:483
nf_hook include/linux/netfilter.h:243 [inline]
NF_HOOK include/linux/netfilter.h:286 [inline]
ip6_xmit+0x10ec/0x2260 net/ipv6/ip6_output.c:277
inet6_csk_xmit+0x2fc/0x580 net/ipv6/inet6_connection_sock.c:139
dccp_transmit_skb+0x9ac/0x10f0 net/dccp/output.c:142
dccp_connect+0x369/0x670 net/dccp/output.c:564
dccp_v6_connect+0xe17/0x1bf0 net/dccp/ipv6.c:946
__inet_stream_connect+0x2d4/0xf00 net/ipv4/af_inet.c:620
inet_stream_connect+0x58/0xa0 net/ipv4/af_inet.c:684
SYSC_connect+0x213/0x4a0 net/socket.c:1639
SyS_connect+0x24/0x30 net/socket.c:1620
do_syscall_64+0x282/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x26/0x9b
RIP: 0033:0x441c69
RSP: 002b:00007ffe50cc0be8 EFLAGS: 00000217 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: ffffffffffffffff RCX: 0000000000441c69
RDX: 000000000000001c RSI: 00000000208fdfe4 RDI: 0000000000000003
RBP: 00000000006cc018 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000538 R11: 0000000000000217 R12: 0000000000403590
R13: 0000000000403620 R14: 0000000000000000 R15: 0000000000000000
Code: 48 89 f0 83 e0 07 83 c0 01 38 d0 7c 08 84 d2 0f 85 46 02 00 00 48 8b
45 c8 44 0f b7 20 e8 88 97 04 fd 31 d2 41 0f b7 c4 4c 89 f9 <41> f7 f6 48
c1 e9 03 48 b8 00 00 00 00 00 fc ff df 0f b6 0c 01
RIP: nf_nat_l4proto_unique_tuple+0x291/0x530
net/netfilter/nf_nat_proto_common.c:88 RSP: ffff8801b2466778
The problem is that currently we don't have any check on the
configured port range. A port range == -1 triggers the bug, while
other negative values may require a very long time to complete the
following loop.
This commit addresses the issue swapping the two ends on negative
ranges. The check is performed in nf_nat_l4proto_unique_tuple() since
the nft nat loads the port values from nft registers at runtime.
v1 -> v2: use the correct 'Fixes' tag
v2 -> v3: update commit message, drop unneeded READ_ONCE()
Fixes: 5b1158e909ec ("[NETFILTER]: Add NAT support for nf_conntrack")
Reported-by: syzbot+8012e198bd037f4871e5@syzkaller.appspotmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
[bwh: Backported to 3.2: adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 10414014bc085aac9f787a5890b33b5605fbcfc4 upstream.
syzbot reported that xt_LED may try to use the ledinternal->timer
without previously initializing it:
------------[ cut here ]------------
kernel BUG at kernel/time/timer.c:958!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 1826 Comm: kworker/1:2 Not tainted 4.15.0+ #306
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: ipv6_addrconf addrconf_dad_work
RIP: 0010:__mod_timer kernel/time/timer.c:958 [inline]
RIP: 0010:mod_timer+0x7d6/0x13c0 kernel/time/timer.c:1102
RSP: 0018:ffff8801d24fe9f8 EFLAGS: 00010293
RAX: ffff8801d25246c0 RBX: ffff8801aec6cb50 RCX: ffffffff816052c6
RDX: 0000000000000000 RSI: 00000000fffbd14b RDI: ffff8801aec6cb68
RBP: ffff8801d24fec98 R08: 0000000000000000 R09: 1ffff1003a49fd6c
R10: ffff8801d24feb28 R11: 0000000000000005 R12: dffffc0000000000
R13: ffff8801d24fec70 R14: 00000000fffbd14b R15: ffff8801af608f90
FS: 0000000000000000(0000) GS:ffff8801db500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000206d6fd0 CR3: 0000000006a22001 CR4: 00000000001606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
led_tg+0x1db/0x2e0 net/netfilter/xt_LED.c:75
ip6t_do_table+0xc2a/0x1a30 net/ipv6/netfilter/ip6_tables.c:365
ip6table_raw_hook+0x65/0x80 net/ipv6/netfilter/ip6table_raw.c:42
nf_hook_entry_hookfn include/linux/netfilter.h:120 [inline]
nf_hook_slow+0xba/0x1a0 net/netfilter/core.c:483
nf_hook.constprop.27+0x3f6/0x830 include/linux/netfilter.h:243
NF_HOOK include/linux/netfilter.h:286 [inline]
ndisc_send_skb+0xa51/0x1370 net/ipv6/ndisc.c:491
ndisc_send_ns+0x38a/0x870 net/ipv6/ndisc.c:633
addrconf_dad_work+0xb9e/0x1320 net/ipv6/addrconf.c:4008
process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2113
worker_thread+0x223/0x1990 kernel/workqueue.c:2247
kthread+0x33c/0x400 kernel/kthread.c:238
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:429
Code: 85 2a 0b 00 00 4d 8b 3c 24 4d 85 ff 75 9f 4c 8b bd 60 fd ff ff e8 bb
57 10 00 65 ff 0d 94 9a a1 7e e9 d9 fc ff ff e8 aa 57 10 00 <0f> 0b e8 a3
57 10 00 e9 14 fb ff ff e8 99 57 10 00 4c 89 bd 70
RIP: __mod_timer kernel/time/timer.c:958 [inline] RSP: ffff8801d24fe9f8
RIP: mod_timer+0x7d6/0x13c0 kernel/time/timer.c:1102 RSP: ffff8801d24fe9f8
---[ end trace f661ab06f5dd8b3d ]---
The ledinternal struct can be shared between several different
xt_LED targets, but the related timer is currently initialized only
if the first target requires it. Fix it by unconditionally
initializing the timer struct.
v1 -> v2: call del_timer_sync() unconditionally, too.
Fixes: 268cb38e1802 ("netfilter: x_tables: add LED trigger target")
Reported-by: syzbot+10c98dc5725c6c8fc7fb@syzkaller.appspotmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
[bwh: Backported to 3.2: Keep using setup_timer()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit db93a3632b0f8773a3899e04a3a3e0aa7a26eb46 upstream.
In clusterip_config_find_get() we hold RCU read lock so it could
run concurrently with clusterip_config_entry_put(), as a result,
the refcnt could go back to 1 from 0, which leads to a double
list_del()... Just replace refcount_inc() with
refcount_inc_not_zero(), as for c->refcount.
Fixes: d73f33b16883 ("netfilter: CLUSTERIP: RCU conversion")
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
[bwh: Backported to 3.2: s/refcount/atomic/]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 01ea306f2ac2baff98d472da719193e738759d93 upstream.
The Syzbot reported a possible deadlock in the netfilter area caused by
rtnl lock, xt lock and socket lock being acquired with a different order
on different code paths, leading to the following backtrace:
Reviewed-by: Xin Long <lucien.xin@gmail.com>
======================================================
WARNING: possible circular locking dependency detected
4.15.0+ #301 Not tainted
------------------------------------------------------
syzkaller233489/4179 is trying to acquire lock:
(rtnl_mutex){+.+.}, at: [<0000000048e996fd>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74
but task is already holding lock:
(&xt[i].mutex){+.+.}, at: [<00000000328553a2>]
xt_find_table_lock+0x3e/0x3e0 net/netfilter/x_tables.c:1041
which lock already depends on the new lock.
===
Since commit 3f34cfae1230 ("netfilter: on sockopt() acquire sock lock
only in the required scope"), we already acquire the socket lock in
the innermost scope, where needed. In such commit I forgot to remove
the outer-most socket lock from the getsockopt() path, this commit
addresses the issues dropping it now.
v1 -> v2: fix bad subj, added relavant 'fixes' tag
Fixes: 22265a5c3c10 ("netfilter: xt_TEE: resolve oif using netdevice notifiers")
Fixes: 202f59afd441 ("netfilter: ipt_CLUSTERIP: do not hold dev")
Fixes: 3f34cfae1230 ("netfilter: on sockopt() acquire sock lock only in the required scope")
Reported-by: syzbot+ddde1c7b7ff7442d7f2d@syzkaller.appspotmail.com
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit ac5b70198adc25c73fba28de4f78adcee8f6be0b upstream.
netif_set_real_num_tx_queues() can be called when netdev is up.
That usually happens when user requests change of number of
channels/rings with ethtool -L. The procedure for changing
the number of queues involves resetting the qdiscs and setting
dev->num_tx_queues to the new value. When the new value is
lower than the old one, extra care has to be taken to ensure
ordering of accesses to the number of queues vs qdisc reset.
Currently the queues are reset before new dev->num_tx_queues
is assigned, leaving a window of time where packets can be
enqueued onto the queues going down, leading to a likely
crash in the drivers, since most drivers don't check if TX
skbs are assigned to an active queue.
Fixes: e6484930d7c7 ("net: allocate tx queues in register_netdevice")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit d97ca5d714a5334aecadadf696875da40f1fbf3e upstream.
The sanity test added in ecd7918745234 can be bypassed, validation
only occurs if XFRM_STATE_ESN flag is set, but rest of code doesn't care
and just checks if the attribute itself is present.
So always validate. Alternative is to reject if we have the attribute
without the flag but that would change abi.
Reported-by: syzbot+0ab777c27d2bb7588f73@syzkaller.appspotmail.com
Cc: Mathias Krause <minipli@googlemail.com>
Fixes: ecd7918745234 ("xfrm_user: ensure user supplied esn replay window is valid")
Fixes: d8647b79c3b7e ("xfrm: Add user interface for esn and big anti-replay windows")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 1b12580af1d0677c3c3a19e35bfe5d59b03f737f upstream.
Now br_sysfs_if file flush doesn't have attr show. To read it will
cause kernel panic after users chmod u+r this file.
Xiong found this issue when running the commands:
ip link add br0 type bridge
ip link add type veth
ip link set veth0 master br0
chmod u+r /sys/devices/virtual/net/veth0/brport/flush
timeout 3 cat /sys/devices/virtual/net/veth0/brport/flush
kernel crashed with NULL a pointer dereference call trace.
This patch is to fix it by return -EINVAL when brport_attr->show
is null, just the same as the check for brport_attr->store in
brport_store().
Fixes: 9cf637473c85 ("bridge: add sysfs hook to flush forwarding table")
Reported-by: Xiong Zhou <xzhou@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 26d99834f89e76514076d9cd06f61e56e6a509b8 upstream.
When a 9p request is successfully flushed, the server is expected to just
mark it as used without sending a 9p reply (ie, without writing data into
the buffer). In this case, virtqueue_get_buf() will return len == 0 and
we must not report a REQ_STATUS_RCVD status to the client, otherwise the
client will erroneously assume the request has not been flushed.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 02a2385f37a7c6594c9d89b64c4a1451276f08eb upstream.
nlmsg_multicast() consumes always the skb, thus the original skb must be
freed only when this function is called with a clone.
Fixes: cb9f7a9a5c96 ("netlink: ensure to loop over all netns in genlmsg_multicast_allns()")
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit cb9f7a9a5c96a773bbc9c70660dc600cfff82f82 upstream.
Nowadays, nlmsg_multicast() returns only 0 or -ESRCH but this was not the
case when commit 134e63756d5f was pushed.
However, there was no reason to stop the loop if a netns does not have
listeners.
Returns -ESRCH only if there was no listeners in all netns.
To avoid having the same problem in the future, I didn't take the
assumption that nlmsg_multicast() returns only 0 or -ESRCH.
Fixes: 134e63756d5f ("genetlink: make netns aware")
CC: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: s/portid/pid/]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 7dc68e98757a8eccf8ca7a53a29b896f1eef1f76 upstream.
rateest_hash is supposed to be protected by xt_rateest_mutex,
and, as suggested by Eric, lookup and insert should be atomic,
so we should acquire the xt_rateest_mutex once for both.
So introduce a non-locking helper for internal use and keep the
locking one for external.
Reported-by: <syzbot+5cb189720978275e4c75@syzkaller.appspotmail.com>
Fixes: 5859034d7eb8 ("[NETFILTER]: x_tables: add RATEEST target")
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit e7aadb27a5415e8125834b84a74477bfbee4eff5 upstream.
Newly added igmpv3_get_srcaddr() needs to be called under rcu lock.
Timer callbacks do not ensure this locking.
=============================
WARNING: suspicious RCU usage
4.15.0+ #200 Not tainted
-----------------------------
./include/linux/inetdevice.h:216 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
3 locks held by syzkaller616973/4074:
#0: (&mm->mmap_sem){++++}, at: [<00000000bfce669e>] __do_page_fault+0x32d/0xc90 arch/x86/mm/fault.c:1355
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1316
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] igmpv3_send_report+0x98/0x5b0 net/ipv4/igmp.c:600
stack backtrace:
CPU: 0 PID: 4074 Comm: syzkaller616973 Not tainted 4.15.0+ #200
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
__in_dev_get_rcu include/linux/inetdevice.h:216 [inline]
igmpv3_get_srcaddr net/ipv4/igmp.c:329 [inline]
igmpv3_newpack+0xeef/0x12e0 net/ipv4/igmp.c:389
add_grhead.isra.27+0x235/0x300 net/ipv4/igmp.c:432
add_grec+0xbd3/0x1170 net/ipv4/igmp.c:565
igmpv3_send_report+0xd5/0x5b0 net/ipv4/igmp.c:605
igmp_send_report+0xc43/0x1050 net/ipv4/igmp.c:722
igmp_timer_expire+0x322/0x5c0 net/ipv4/igmp.c:831
call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
expire_timers kernel/time/timer.c:1363 [inline]
__run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
__do_softirq+0x2d7/0xb85 kernel/softirq.c:285
invoke_softirq kernel/softirq.c:365 [inline]
irq_exit+0x1cc/0x200 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:541 [inline]
smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:938
Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 3f34cfae1238848fd53f25e5c8fd59da57901f4b upstream.
Syzbot reported several deadlocks in the netfilter area caused by
rtnl lock and socket lock being acquired with a different order on
different code paths, leading to backtraces like the following one:
======================================================
WARNING: possible circular locking dependency detected
4.15.0-rc9+ #212 Not tainted
------------------------------------------------------
syzkaller041579/3682 is trying to acquire lock:
(sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>] lock_sock
include/net/sock.h:1463 [inline]
(sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>]
do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167
but task is already holding lock:
(rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (rtnl_mutex){+.+.}:
__mutex_lock_common kernel/locking/mutex.c:756 [inline]
__mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
register_netdevice_notifier+0xad/0x860 net/core/dev.c:1607
tee_tg_check+0x1a0/0x280 net/netfilter/xt_TEE.c:106
xt_check_target+0x22c/0x7d0 net/netfilter/x_tables.c:845
check_target net/ipv6/netfilter/ip6_tables.c:538 [inline]
find_check_entry.isra.7+0x935/0xcf0
net/ipv6/netfilter/ip6_tables.c:580
translate_table+0xf52/0x1690 net/ipv6/netfilter/ip6_tables.c:749
do_replace net/ipv6/netfilter/ip6_tables.c:1165 [inline]
do_ip6t_set_ctl+0x370/0x5f0 net/ipv6/netfilter/ip6_tables.c:1691
nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115
ipv6_setsockopt+0x115/0x150 net/ipv6/ipv6_sockglue.c:928
udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422
sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978
SYSC_setsockopt net/socket.c:1849 [inline]
SyS_setsockopt+0x189/0x360 net/socket.c:1828
entry_SYSCALL_64_fastpath+0x29/0xa0
-> #0 (sk_lock-AF_INET6){+.+.}:
lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3914
lock_sock_nested+0xc2/0x110 net/core/sock.c:2780
lock_sock include/net/sock.h:1463 [inline]
do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167
ipv6_setsockopt+0xd7/0x150 net/ipv6/ipv6_sockglue.c:922
udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422
sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978
SYSC_setsockopt net/socket.c:1849 [inline]
SyS_setsockopt+0x189/0x360 net/socket.c:1828
entry_SYSCALL_64_fastpath+0x29/0xa0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(rtnl_mutex);
lock(sk_lock-AF_INET6);
lock(rtnl_mutex);
lock(sk_lock-AF_INET6);
*** DEADLOCK ***
1 lock held by syzkaller041579/3682:
#0: (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74
The problem, as Florian noted, is that nf_setsockopt() is always
called with the socket held, even if the lock itself is required only
for very tight scopes and only for some operation.
This patch addresses the issues moving the lock_sock() call only
where really needed, namely in ipv*_getorigdst(), so that nf_setsockopt()
does not need anymore to acquire both locks.
Fixes: 22265a5c3c10 ("netfilter: xt_TEE: resolve oif using netdevice notifiers")
Reported-by: syzbot+a4c2dc980ac1af699b36@syzkaller.appspotmail.com
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
[bwh: Backported to 3.2: Drop changes to ipv6_getorigdst()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 1a38956cce5eabd7b74f94bab70265e4df83165e upstream.
Commit 136e92bbec0a switched local_nodes from an array to a bitmask
but did not add proper bounds checks. As the result
clusterip_config_init_nodelist() can both over-read
ipt_clusterip_tgt_info.local_nodes and over-write
clusterip_config.local_nodes.
Add bounds checks for both.
Fixes: 136e92bbec0a ("[NETFILTER] CLUSTERIP: use a bitmap to store node responsibility data")
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 820da5357572715c6235ba3b3daa2d5b43a1198f upstream.
Report offset parameter in L2TP_CMD_SESSION_GET command if
it has been configured by userspace
Fixes: 309795f4bec ("l2tp: Add netlink control API for L2TP")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2:
- Use NLA_PUT_U16, consistent with the rest of the function
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 07f2c7ab6f8d0a7e7c5764c4e6cc9c52951b9d9c upstream.
When SCTP makes INIT or INIT_ACK packet the total chunk length
can exceed SCTP_MAX_CHUNK_LEN which leads to kernel panic when
transmitting these packets, e.g. the crash on sending INIT_ACK:
[ 597.804948] skbuff: skb_over_panic: text:00000000ffae06e4 len:120168
put:120156 head:000000007aa47635 data:00000000d991c2de
tail:0x1d640 end:0xfec0 dev:<NULL>
...
[ 597.976970] ------------[ cut here ]------------
[ 598.033408] kernel BUG at net/core/skbuff.c:104!
[ 600.314841] Call Trace:
[ 600.345829] <IRQ>
[ 600.371639] ? sctp_packet_transmit+0x2095/0x26d0 [sctp]
[ 600.436934] skb_put+0x16c/0x200
[ 600.477295] sctp_packet_transmit+0x2095/0x26d0 [sctp]
[ 600.540630] ? sctp_packet_config+0x890/0x890 [sctp]
[ 600.601781] ? __sctp_packet_append_chunk+0x3b4/0xd00 [sctp]
[ 600.671356] ? sctp_cmp_addr_exact+0x3f/0x90 [sctp]
[ 600.731482] sctp_outq_flush+0x663/0x30d0 [sctp]
[ 600.788565] ? sctp_make_init+0xbf0/0xbf0 [sctp]
[ 600.845555] ? sctp_check_transmitted+0x18f0/0x18f0 [sctp]
[ 600.912945] ? sctp_outq_tail+0x631/0x9d0 [sctp]
[ 600.969936] sctp_cmd_interpreter.isra.22+0x3be1/0x5cb0 [sctp]
[ 601.041593] ? sctp_sf_do_5_1B_init+0x85f/0xc30 [sctp]
[ 601.104837] ? sctp_generate_t1_cookie_event+0x20/0x20 [sctp]
[ 601.175436] ? sctp_eat_data+0x1710/0x1710 [sctp]
[ 601.233575] sctp_do_sm+0x182/0x560 [sctp]
[ 601.284328] ? sctp_has_association+0x70/0x70 [sctp]
[ 601.345586] ? sctp_rcv+0xef4/0x32f0 [sctp]
[ 601.397478] ? sctp6_rcv+0xa/0x20 [sctp]
...
Here the chunk size for INIT_ACK packet becomes too big, mostly
because of the state cookie (INIT packet has large size with
many address parameters), plus additional server parameters.
Later this chunk causes the panic in skb_put_data():
skb_packet_transmit()
sctp_packet_pack()
skb_put_data(nskb, chunk->skb->data, chunk->skb->len);
'nskb' (head skb) was previously allocated with packet->size
from u16 'chunk->chunk_hdr->length'.
As suggested by Marcelo we should check the chunk's length in
_sctp_make_chunk() before trying to allocate skb for it and
discard a chunk if its size bigger than SCTP_MAX_CHUNK_LEN.
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leinter@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2:
- Keep using WORD_ROUND() instead of SCTP_PAD4()
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 67f93df79aeefc3add4e4b31a752600f834236e2 upstream.
dccp_disconnect() sets 'dp->dccps_hc_tx_ccid' tx handler to NULL,
therefore if DCCP socket is disconnected and dccp_sendmsg() is
called after it, it will cause a NULL pointer dereference in
dccp_write_xmit().
This crash and the reproducer was reported by syzbot. Looks like
it is reproduced if commit 69c64866ce07 ("dccp: CVE-2017-8824:
use-after-free in DCCP code") is applied.
Reported-by: syzbot+f99ab3887ab65d70f816@syzkaller.appspotmail.com
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 932909d9b28d27e807ff8eecb68c7748f6701628 upstream.
The last rule in the blob has next_entry offset that is same as total size.
This made "ebtables32 -A OUTPUT -d de:ad:be:ef:01:02" fail on 64 bit kernel.
Fixes: b71812168571fa ("netfilter: ebtables: CONFIG_COMPAT: don't trust userland offsets")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit b71812168571fa55e44cdd0254471331b9c4c4c6 upstream.
We need to make sure the offsets are not out of range of the
total size.
Also check that they are in ascending order.
The WARN_ON triggered by syzkaller (it sets panic_on_warn) is
changed to also bail out, no point in continuing parsing.
Briefly tested with simple ruleset of
-A INPUT --limit 1/s' --log
plus jump to custom chains using 32bit ebtables binary.
Reported-by: <syzbot+845a53d13171abf8bf29@syzkaller.appspotmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit e10b9969f217c948c5523045f44eba4d3a758ff0 upstream.
In this API, we were using sizeof operator for an array
given as function argument, which is invalid.
However this API is not used anywhere.
Signed-off-by: Syam Sidhardhan <s.syam@samsung.com>
Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit f3069c6d33f6ae63a1668737bc78aaaa51bff7ca upstream.
This is a fix for syzkaller719569, where memory registration was
attempted without any underlying transport being loaded.
Analysis of the case reveals that it is the setsockopt() RDS_GET_MR
(2) and RDS_GET_MR_FOR_DEST (7) that are vulnerable.
Here is an example stack trace when the bug is hit:
BUG: unable to handle kernel NULL pointer dereference at 00000000000000c0
IP: __rds_rdma_map+0x36/0x440 [rds]
PGD 2f93d03067 P4D 2f93d03067 PUD 2f93d02067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: bridge stp llc tun rpcsec_gss_krb5 nfsv4
dns_resolver nfs fscache rds binfmt_misc sb_edac intel_powerclamp
coretemp kvm_intel kvm irqbypass crct10dif_pclmul c rc32_pclmul
ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd
iTCO_wdt mei_me sg iTCO_vendor_support ipmi_si mei ipmi_devintf nfsd
shpchp pcspkr i2c_i801 ioatd ma ipmi_msghandler wmi lpc_ich mfd_core
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2
mgag200 i2c_algo_bit drm_kms_helper ixgbe syscopyarea ahci sysfillrect
sysimgblt libahci mdio fb_sys_fops ttm ptp libata sd_mod mlx4_core drm
crc32c_intel pps_core megaraid_sas i2c_core dca dm_mirror
dm_region_hash dm_log dm_mod
CPU: 48 PID: 45787 Comm: repro_set2 Not tainted 4.14.2-3.el7uek.x86_64 #2
Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017
task: ffff882f9190db00 task.stack: ffffc9002b994000
RIP: 0010:__rds_rdma_map+0x36/0x440 [rds]
RSP: 0018:ffffc9002b997df0 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff882fa2182580 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffc9002b997e40 RDI: ffff882fa2182580
RBP: ffffc9002b997e30 R08: 0000000000000000 R09: 0000000000000002
R10: ffff885fb29e3838 R11: 0000000000000000 R12: ffff882fa2182580
R13: ffff882fa2182580 R14: 0000000000000002 R15: 0000000020000ffc
FS: 00007fbffa20b700(0000) GS:ffff882fbfb80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000c0 CR3: 0000002f98a66006 CR4: 00000000001606e0
Call Trace:
rds_get_mr+0x56/0x80 [rds]
rds_setsockopt+0x172/0x340 [rds]
? __fget_light+0x25/0x60
? __fdget+0x13/0x20
SyS_setsockopt+0x80/0xe0
do_syscall_64+0x67/0x1b0
entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7fbff9b117f9
RSP: 002b:00007fbffa20aed8 EFLAGS: 00000293 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 00000000000c84a4 RCX: 00007fbff9b117f9
RDX: 0000000000000002 RSI: 0000400000000114 RDI: 000000000000109b
RBP: 00007fbffa20af10 R08: 0000000000000020 R09: 00007fbff9dd7860
R10: 0000000020000ffc R11: 0000000000000293 R12: 0000000000000000
R13: 00007fbffa20b9c0 R14: 00007fbffa20b700 R15: 0000000000000021
Code: 41 56 41 55 49 89 fd 41 54 53 48 83 ec 18 8b 87 f0 02 00 00 48
89 55 d0 48 89 4d c8 85 c0 0f 84 2d 03 00 00 48 8b 87 00 03 00 00 <48>
83 b8 c0 00 00 00 00 0f 84 25 03 00 0 0 48 8b 06 48 8b 56 08
The fix is to check the existence of an underlying transport in
__rds_rdma_map().
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit dd5684ecae3bd8e44b644f50e2c12c7e57fdfef5 upstream.
ccid2_hc_tx_rto_expire() timer callback always restarts the timer
again and can run indefinitely (unless it is stopped outside), and after
commit 120e9dabaf55 ("dccp: defer ccid_hc_tx_delete() at dismantle time"),
which moved ccid_hc_tx_delete() (also includes sk_stop_timer()) from
dccp_destroy_sock() to sk_destruct(), this started to happen quite often.
The timer prevents releasing the socket, as a result, sk_destruct() won't
be called.
Found with LTP/dccp_ipsec tests running on the bonding device,
which later couldn't be unloaded after the tests were completed:
unregister_netdevice: waiting for bond0 to become free. Usage count = 148
Fixes: 2a91aa396739 ("[DCCP] CCID2: Initial CCID2 (TCP-Like) implementation")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit ad23b750933ea7bf962678972a286c78a8fa36aa upstream.
Commit "net: igmp: Use correct source address on IGMPv3 reports"
introduced a check to validate the source address of locally generated
IGMPv3 packets.
Instead of checking the local interface address directly, it uses
inet_ifa_match(fl4->saddr, ifa), which checks if the address is on the
local subnet (or equal to the point-to-point address if used).
This breaks for point-to-point interfaces, so check against
ifa->ifa_local directly.
Cc: Kevin Cernekee <cernekee@chromium.org>
Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
Reported-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit a46182b00290839fa3fa159d54fd3237bd8669f0 upstream.
Closing a multicast socket after the final IPv4 address is deleted
from an interface can generate a membership report that uses the
source IP from a different interface. The following test script, run
from an isolated netns, reproduces the issue:
#!/bin/bash
ip link add dummy0 type dummy
ip link add dummy1 type dummy
ip link set dummy0 up
ip link set dummy1 up
ip addr add 10.1.1.1/24 dev dummy0
ip addr add 192.168.99.99/24 dev dummy1
tcpdump -U -i dummy0 &
socat EXEC:"sleep 2" \
UDP4-DATAGRAM:239.101.1.68:8889,ip-add-membership=239.0.1.68:10.1.1.1 &
sleep 1
ip addr del 10.1.1.1/24 dev dummy0
sleep 5
kill %tcpdump
RFC 3376 specifies that the report must be sent with a valid IP source
address from the destination subnet, or from address 0.0.0.0. Add an
extra check to make sure this is the case.
Signed-off-by: Kevin Cernekee <cernekee@chromium.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 5762d7d3eda25c03cc2d9d45227be3f5ab6bec9e upstream.
Fix two places where the structure isn't initialized to zero,
and thus can't be filled properly by the driver.
Fixes: 4a4b8169501b ("cfg80211: Accept multiple RSSI thresholds for CQM")
Fixes: 9930380f0bd8 ("cfg80211: implement IWRATE")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2:
- Drop change in cfg80211_cqm_rssi_update()
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 8cb68751c115d176ec851ca56ecfbb411568c9e8 upstream.
If an invalid CAN frame is received, from a driver or from a tun
interface, a Kernel warning is generated.
This patch replaces the WARN_ONCE by a simple pr_warn_once, so that a
kernel, bootet with panic_on_warn, does not panic. A printk seems to be
more appropriate here.
Reported-by: syzbot+4386709c0c1284dca827@syzkaller.appspotmail.com
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
[bwh: Backported to 3.2:
- Keep using the 'drop' label, as it has another user
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit c5006b8aa74599ce19104b31d322d2ea9ff887cc upstream.
The check in sctp_sockaddr_af is not robust enough to forbid binding a
v4mapped v6 addr on a v4 socket.
The worse thing is that v4 socket's bind_verify would not convert this
v4mapped v6 addr to a v4 addr. syzbot even reported a crash as the v4
socket bound a v6 addr.
This patch is to fix it by doing the common sa.sa_family check first,
then AF_INET check for v4mapped v6 addrs.
Fixes: 7dab83de50c7 ("sctp: Support ipv6only AF_INET6 sockets.")
Reported-by: syzbot+7b7b518b1228d2743963@syzkaller.appspotmail.com
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit a0ff660058b88d12625a783ce9e5c1371c87951f upstream.
After commit cea0cc80a677 ("sctp: use the right sk after waking up from
wait_buf sleep"), it may change to lock another sk if the asoc has been
peeled off in sctp_wait_for_sndbuf.
However, the asoc's new sk could be already closed elsewhere, as it's in
the sendmsg context of the old sk that can't avoid the new sk's closing.
If the sk's last one refcnt is held by this asoc, later on after putting
this asoc, the new sk will be freed, while under it's own lock.
This patch is to revert that commit, but fix the old issue by returning
error under the old sk's lock.
Fixes: cea0cc80a677 ("sctp: use the right sk after waking up from wait_buf sleep")
Reported-by: syzbot+ac6ea7baa4432811eb50@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit cea0cc80a6777beb6eb643d4ad53690e1ad1d4ff upstream.
Commit dfcb9f4f99f1 ("sctp: deny peeloff operation on asocs with threads
sleeping on it") fixed the race between peeloff and wait sndbuf by
checking waitqueue_active(&asoc->wait) in sctp_do_peeloff().
But it actually doesn't work, as even if waitqueue_active returns false
the waiting sndbuf thread may still not yet hold sk lock. After asoc is
peeled off, sk is not asoc->base.sk any more, then to hold the old sk
lock couldn't make assoc safe to access.
This patch is to fix this by changing to hold the new sk lock if sk is
not asoc->base.sk, meanwhile, also set the sk in sctp_sendmsg with the
new sk.
With this fix, there is no more race between peeloff and waitbuf, the
check 'waitqueue_active' in sctp_do_peeloff can be removed.
Thanks Marcelo and Neil for making this clear.
v1->v2:
fix it by changing to lock the new sock instead of adding a flag in asoc.
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 59b179b48ce2a6076448a44531242ac2b3f6cef2 upstream.
syzbot reported a warning from rfkill_alloc(), and after a while
I think that the reason is that it was doing fault injection and
the dev_set_name() failed, leaving the name NULL, and we didn't
check the return value and got to rfkill_alloc() with a NULL name.
Since we really don't want a NULL name, we ought to check the
return value.
Fixes: fb28ad35906a ("net: struct device - replace bus_id with dev_name(), dev_set_name()")
Reported-by: syzbot+1ddfb3357e1d7bb5b5d3@syzkaller.appspotmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 78bbb15f2239bc8e663aa20bbe1987c91a0b75f6 upstream.
A vlan device with vid 0 is allow to creat by not able to be fully
cleaned up by unregister_vlan_dev() which checks for vlan_id!=0.
Also, VLAN 0 is probably not a valid number and it is kinda
"reserved" for HW accelerating devices, but it is probably too
late to reject it from creation even if makes sense. Instead,
just remove the check in unregister_vlan_dev().
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: ad1afb003939 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: The vlan driver didn't leak memory itself, but might
cause underlying drivers to leak resources for VID 0. Keep the check for
hardware acceleration.]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 20b50d79974ea3192e8c3ab7faf4e536e5f14d8f upstream.
Commit 8f659a03a0ba ("net: ipv4: fix for a race condition in
raw_sendmsg") fixed the issue of possibly inconsistent ->hdrincl handling
due to concurrent updates by reading this bit-field member into a local
variable and using the thus stabilized value in subsequent tests.
However, aforementioned commit also adds the (correct) comment that
/* hdrincl should be READ_ONCE(inet->hdrincl)
* but READ_ONCE() doesn't work with bit fields
*/
because as it stands, the compiler is free to shortcut or even eliminate
the local variable at its will.
Note that I have not seen anything like this happening in reality and thus,
the concern is a theoretical one.
However, in order to be on the safe side, emulate a READ_ONCE() on the
bit-field by doing it on the local 'hdrincl' variable itself:
int hdrincl = inet->hdrincl;
hdrincl = READ_ONCE(hdrincl);
This breaks the chain in the sense that the compiler is not allowed
to replace subsequent reads from hdrincl with reloads from inet->hdrincl.
Fixes: 8f659a03a0ba ("net: ipv4: fix for a race condition in raw_sendmsg")
Signed-off-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: use ACCESS_ONCE()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit bcfd09f7837f5240c30fd2f52ee7293516641faa upstream.
Currently esp will happily create an xfrm state with an unknown
encap type for IPv4, without setting the necessary state parameters.
This patch fixes it by returning -EINVAL.
There is a similar problem in IPv6 where if the mode is unknown
we will skip initialisation while returning zero. However, this
is harmless as the mode has already been checked further up the
stack. This patch removes this anomaly by aligning the IPv6
behaviour with IPv4 and treating unknown modes (which cannot
actually happen) as transport mode.
Fixes: 38320c70d282 ("[IPSEC]: Use crypto_aead and authenc in ESP")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit d16b46e4fd8bc6063624605f25b8c0835bb1fbe3 upstream.
We do not need locking in xfrm_trans_queue because it is designed
to use per-CPU buffers. However, the original code incorrectly
used skb_queue_tail which takes the lock. This patch switches
it to __skb_queue_tail instead.
Reported-and-tested-by: Artem Savkov <asavkov@redhat.com>
Fixes: acf568ee859f ("xfrm: Reinject transport-mode packets...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 4e765b4972af7b07adcb1feb16e7a525ce1f6b28 upstream.
If a message sent to a PF_KEY socket ended with an incomplete extension
header (fewer than 4 bytes remaining), then parse_exthdrs() read past
the end of the message, into uninitialized memory. Fix it by returning
-EINVAL in this case.
Reproducer:
#include <linux/pfkeyv2.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
char buf[17] = { 0 };
struct sadb_msg *msg = (void *)buf;
msg->sadb_msg_version = PF_KEY_V2;
msg->sadb_msg_type = SADB_DELETE;
msg->sadb_msg_len = 2;
write(sock, buf, 17);
}
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 06b335cb51af018d5feeff5dd4fd53847ddb675a upstream.
If a message sent to a PF_KEY socket ended with one of the extensions
that takes a 'struct sadb_address' but there were not enough bytes
remaining in the message for the ->sa_family member of the 'struct
sockaddr' which is supposed to follow, then verify_address_len() read
past the end of the message, into uninitialized memory. Fix it by
returning -EINVAL in this case.
This bug was found using syzkaller with KMSAN.
Reproducer:
#include <linux/pfkeyv2.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
char buf[24] = { 0 };
struct sadb_msg *msg = (void *)buf;
struct sadb_address *addr = (void *)(msg + 1);
msg->sadb_msg_version = PF_KEY_V2;
msg->sadb_msg_type = SADB_DELETE;
msg->sadb_msg_len = 3;
addr->sadb_address_len = 1;
addr->sadb_address_exttype = SADB_EXT_ADDRESS_SRC;
write(sock, buf, 24);
}
Reported-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit acf568ee859f098279eadf551612f103afdacb4e upstream.
This is an old bugbear of mine:
https://www.mail-archive.com/netdev@vger.kernel.org/msg03894.html
By crafting special packets, it is possible to cause recursion
in our kernel when processing transport-mode packets at levels
that are only limited by packet size.
The easiest one is with DNAT, but an even worse one is where
UDP encapsulation is used in which case you just have to insert
an UDP encapsulation header in between each level of recursion.
This patch avoids this problem by reinjecting tranport-mode packets
through a tasklet.
Fixes: b05e106698d9 ("[IPV4/6]: Netfilter IPsec input hooks")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
[bwh: Backported to 3.2:
- netfilter finish callbacks only receive an sk_buff pointer
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 84aeb437ab98a2bce3d4b2111c79723aedfceb33 upstream.
The early call to br_stp_change_bridge_id in bridge's newlink can cause
a memory leak if an error occurs during the newlink because the fdb
entries are not cleaned up if a different lladdr was specified, also
another minor issue is that it generates fdb notifications with
ifindex = 0. Another unrelated memory leak is the bridge sysfs entries
which get added on NETDEV_REGISTER event, but are not cleaned up in the
newlink error path. To remove this special case the call to
br_stp_change_bridge_id is done after netdev register and we cleanup the
bridge on changelink error via br_dev_delete to plug all leaks.
This patch makes netlink bridge destruction on newlink error the same as
dellink and ioctl del which is necessary since at that point we have a
fully initialized bridge device.
To reproduce the issue:
$ ip l add br0 address 00:11:22:33:44:55 type bridge group_fwd_mask 1
RTNETLINK answers: Invalid argument
$ rmmod bridge
[ 1822.142525] =============================================================================
[ 1822.143640] BUG bridge_fdb_cache (Tainted: G O ): Objects remaining in bridge_fdb_cache on __kmem_cache_shutdown()
[ 1822.144821] -----------------------------------------------------------------------------
[ 1822.145990] Disabling lock debugging due to kernel taint
[ 1822.146732] INFO: Slab 0x0000000092a844b2 objects=32 used=2 fp=0x00000000fef011b0 flags=0x1ffff8000000100
[ 1822.147700] CPU: 2 PID: 13584 Comm: rmmod Tainted: G B O 4.15.0-rc2+ #87
[ 1822.148578] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 1822.150008] Call Trace:
[ 1822.150510] dump_stack+0x78/0xa9
[ 1822.151156] slab_err+0xb1/0xd3
[ 1822.151834] ? __kmalloc+0x1bb/0x1ce
[ 1822.152546] __kmem_cache_shutdown+0x151/0x28b
[ 1822.153395] shutdown_cache+0x13/0x144
[ 1822.154126] kmem_cache_destroy+0x1c0/0x1fb
[ 1822.154669] SyS_delete_module+0x194/0x244
[ 1822.155199] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 1822.155773] entry_SYSCALL_64_fastpath+0x23/0x9a
[ 1822.156343] RIP: 0033:0x7f929bd38b17
[ 1822.156859] RSP: 002b:00007ffd160e9a98 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0
[ 1822.157728] RAX: ffffffffffffffda RBX: 00005578316ba090 RCX: 00007f929bd38b17
[ 1822.158422] RDX: 00007f929bd9ec60 RSI: 0000000000000800 RDI: 00005578316ba0f0
[ 1822.159114] RBP: 0000000000000003 R08: 00007f929bff5f20 R09: 00007ffd160e8a11
[ 1822.159808] R10: 00007ffd160e9860 R11: 0000000000000202 R12: 00007ffd160e8a80
[ 1822.160513] R13: 0000000000000000 R14: 0000000000000000 R15: 00005578316ba090
[ 1822.161278] INFO: Object 0x000000007645de29 @offset=0
[ 1822.161666] INFO: Object 0x00000000d5df2ab5 @offset=128
Fixes: 30313a3d5794 ("bridge: Handle IFLA_ADDRESS correctly when creating bridge device")
Fixes: 5b8d5429daa0 ("bridge: netlink: register netdevice before executing changelink")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: register_netdevice() was the last thing done in
br_dev_newlink(), so no extra cleanup is needed]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit bdcf0a423ea1c40bbb40e7ee483b50fc8aa3d758 upstream.
In testing, we found that nfsd threads may call set_groups in parallel
for the same entry cached in auth.unix.gid, racing in the call of
groups_sort, corrupting the groups for that entry and leading to
permission denials for the client.
This patch:
- Make groups_sort globally visible.
- Move the call to groups_sort to the modifiers of group_info
- Remove the call to groups_sort from set_groups
Link: http://lkml.kernel.org/r/20171211151420.18655-1-thiago.becker@gmail.com
Signed-off-by: Thiago Rafael Becker <thiago.becker@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: NeilBrown <neilb@suse.com>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2:
- Drop change in gss_rpc_xdr.c
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 30791ac41927ebd3e75486f9504b6d2280463bf0 upstream.
The MD5-key that belongs to a connection is identified by the peer's
IP-address. When we are in tcp_v4(6)_reqsk_send_ack(), we are replying
to an incoming segment from tcp_check_req() that failed the seq-number
checks.
Thus, to find the correct key, we need to use the skb's saddr and not
the daddr.
This bug seems to have been there since quite a while, but probably got
unnoticed because the consequences are not catastrophic. We will call
tcp_v4_reqsk_send_ack only to send a challenge-ACK back to the peer,
thus the connection doesn't really fail.
Fixes: 9501f9722922 ("tcp md5sig: Let the caller pass appropriate key for tcp_v{4,6}_do_calc_md5_hash().")
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|