Age | Commit message (Collapse) | Author | Files | Lines |
|
A kernel crash occurrs when defragmented packet is fragmented
in ip_do_fragment().
In defragment routine, skb_orphan() is called and
skb->ip_defrag_offset is set. but skb->sk and
skb->ip_defrag_offset are same union member. so that
frag->sk is not NULL.
Hence crash occurrs in skb->sk check routine in ip_do_fragment() when
defragmented packet is fragmented.
test commands:
%iptables -t nat -I POSTROUTING -j MASQUERADE
%hping3 192.168.4.2 -s 1000 -p 2000 -d 60000
splat looks like:
[ 261.069429] kernel BUG at net/ipv4/ip_output.c:636!
[ 261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3
[ 261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600
[ 261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c
[ 261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202
[ 261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004
[ 261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8
[ 261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395
[ 261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4
[ 261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000
[ 261.174169] FS: 00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000
[ 261.183012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0
[ 261.198158] Call Trace:
[ 261.199018] ? dst_output+0x180/0x180
[ 261.205011] ? save_trace+0x300/0x300
[ 261.209018] ? ip_copy_metadata+0xb00/0xb00
[ 261.213034] ? sched_clock_local+0xd4/0x140
[ 261.218158] ? kill_l4proto+0x120/0x120 [nf_conntrack]
[ 261.223014] ? rt_cpu_seq_stop+0x10/0x10
[ 261.227014] ? find_held_lock+0x39/0x1c0
[ 261.233008] ip_finish_output+0x51d/0xb50
[ 261.237006] ? ip_fragment.constprop.56+0x220/0x220
[ 261.243011] ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack]
[ 261.250152] ? rcu_is_watching+0x77/0x120
[ 261.255010] ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4]
[ 261.261033] ? nf_hook_slow+0xb1/0x160
[ 261.265007] ip_output+0x1c7/0x710
[ 261.269005] ? ip_mc_output+0x13f0/0x13f0
[ 261.273002] ? __local_bh_enable_ip+0xe9/0x1b0
[ 261.278152] ? ip_fragment.constprop.56+0x220/0x220
[ 261.282996] ? nf_hook_slow+0xb1/0x160
[ 261.287007] raw_sendmsg+0x21f9/0x4420
[ 261.291008] ? dst_output+0x180/0x180
[ 261.297003] ? sched_clock_cpu+0x126/0x170
[ 261.301003] ? find_held_lock+0x39/0x1c0
[ 261.306155] ? stop_critical_timings+0x420/0x420
[ 261.311004] ? check_flags.part.36+0x450/0x450
[ 261.315005] ? _raw_spin_unlock_irq+0x29/0x40
[ 261.320995] ? _raw_spin_unlock_irq+0x29/0x40
[ 261.326142] ? cyc2ns_read_end+0x10/0x10
[ 261.330139] ? raw_bind+0x280/0x280
[ 261.334138] ? sched_clock_cpu+0x126/0x170
[ 261.338995] ? check_flags.part.36+0x450/0x450
[ 261.342991] ? __lock_acquire+0x4500/0x4500
[ 261.348994] ? inet_sendmsg+0x11c/0x500
[ 261.352989] ? dst_output+0x180/0x180
[ 261.357012] inet_sendmsg+0x11c/0x500
[ ... ]
v2:
- clear skb->sk at reassembly routine.(Eric Dumarzet)
Fixes: fa0f527358bd ("ip: use rb trees for IP frag queue.")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tls_sw_sendmsg() allocates plaintext and encrypted SG entries using
function sk_alloc_sg(). In case the number of SG entries hit
MAX_SKB_FRAGS, sk_alloc_sg() returns -ENOSPC and sets the variable for
current SG index to '0'. This leads to calling of function
tls_push_record() with 'sg_encrypted_num_elem = 0' and later causes
kernel crash. To fix this, set the number of SG elements to the number
of elements in plaintext/encrypted SG arrays in case sk_alloc_sg()
returns -ENOSPC.
Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
According to the documentation in msg_zerocopy.rst, the SO_ZEROCOPY
flag was introduced because send(2) ignores unknown message flags and
any legacy application which was accidentally passing the equivalent of
MSG_ZEROCOPY earlier should not see any new behaviour.
Before commit f214f915e7db ("tcp: enable MSG_ZEROCOPY"), a send(2) call
which passed the equivalent of MSG_ZEROCOPY without setting SO_ZEROCOPY
would succeed. However, after that commit, it fails with -ENOBUFS. So
it appears that the SO_ZEROCOPY flag fails to fulfill its intended
purpose. Fix it.
Fixes: f214f915e7db ("tcp: enable MSG_ZEROCOPY")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When nla_put*() fails after nla_nest_start(), we need
to call nla_nest_cancel() to cancel the message, otherwise
we end up calling nla_nest_end() like a success.
Fixes: 0ed5269f9e41 ("net/sched: add tunnel option support to act_tunnel_key")
Cc: Davide Caratti <dcaratti@redhat.com>
Cc: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
__tipc_nl_compat_dumpit() uses a netlink_callback on stack,
so the only way to align it with other ->dumpit() call path
is calling tipc_dump_start() and tipc_dump_done() directly
inside it. Otherwise ->dumpit() would always get NULL from
cb->args[].
But tipc_dump_start() uses sock_net(cb->skb->sk) to retrieve
net pointer, the cb->skb here doesn't set skb->sk, the net pointer
is saved in msg->net instead, so introduce a helper function
__tipc_dump_start() to pass in msg->net.
Ying pointed out cb->args[0...3] are already used by other
callbacks on this call path, so we can't use cb->args[0] any
more, use cb->args[4] instead.
Fixes: 9a07efa9aea2 ("tipc: switch to rhashtable iterator")
Reported-and-tested-by: syzbot+e93a2c41f91b8e2c7d9b@syzkaller.appspotmail.com
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fixes a compile warning.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When sending an skb, afiucv_hs_send() bails out on various error
conditions. But currently the caller has no way of telling whether the
skb was freed or not - resulting in potentially either
a) leaked skbs from iucv_send_ctrl(), or
b) double-free's from iucv_sock_sendmsg().
As dev_queue_xmit() will always consume the skb (even on error), be
consistent and also free the skb from all other error paths. This way
callers no longer need to care about managing the skb.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Inbound packets may have any combination of flag bits set in their iucv
header. If we don't know how to handle a specific combination, drop the
skb instead of leaking it.
To clarify what error is returned in this case, replace the hard-coded
0 with the corresponding macro.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If users try to install act_tunnel_key 'set' rules with duplicate values
of 'index', the tunnel metadata are allocated, but never released. Then,
kmemleak complains as follows:
# tc a a a tunnel_key set src_ip 1.1.1.1 dst_ip 2.2.2.2 id 42 index 111
# echo clear > /sys/kernel/debug/kmemleak
# tc a a a tunnel_key set src_ip 1.1.1.1 dst_ip 2.2.2.2 id 42 index 111
Error: TC IDR already exists.
We have an error talking to the kernel
# echo scan > /sys/kernel/debug/kmemleak
# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff8800574e6c80 (size 256):
comm "tc", pid 5617, jiffies 4298118009 (age 57.990s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 1c e8 b0 ff ff ff ff ................
81 24 c2 ad ff ff ff ff 00 00 00 00 00 00 00 00 .$..............
backtrace:
[<00000000b7afbf4e>] tunnel_key_init+0x8a5/0x1800 [act_tunnel_key]
[<000000007d98fccd>] tcf_action_init_1+0x698/0xac0
[<0000000099b8f7cc>] tcf_action_init+0x15c/0x590
[<00000000dc60eebe>] tc_ctl_action+0x336/0x5c2
[<000000002f5a2f7d>] rtnetlink_rcv_msg+0x357/0x8e0
[<000000000bfe7575>] netlink_rcv_skb+0x124/0x350
[<00000000edab656f>] netlink_unicast+0x40f/0x5d0
[<00000000b322cdcb>] netlink_sendmsg+0x6e8/0xba0
[<0000000063d9d490>] sock_sendmsg+0xb3/0xf0
[<00000000f0d3315a>] ___sys_sendmsg+0x654/0x960
[<00000000c06cbd42>] __sys_sendmsg+0xd3/0x170
[<00000000ce72e4b0>] do_syscall_64+0xa5/0x470
[<000000005caa2d97>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[<00000000fac1b476>] 0xffffffffffffffff
This problem theoretically happens also in case users attempt to setup a
geneve rule having wrong configuration data, or when the kernel fails to
allocate 'params_new'. Ensure that tunnel_key_init() releases the tunnel
metadata also in the above conditions.
Addresses-Coverity-ID: 1373974 ("Resource leak")
Fixes: d0f6dd8a914f4 ("net/sched: Introduce act_tunnel_key")
Fixes: 0ed5269f9e41f ("net/sched: add tunnel option support to act_tunnel_key")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Before we unlock the sock in tipc_release(), we have to
detach sk->sk_socket from sk, otherwise a parallel
tipc_sk_fill_sock_diag() could stil read it after we
free this socket.
Fixes: c30b70deb5f4 ("tipc: implement socket diagnostics for AF_TIPC")
Reported-and-tested-by: syzbot+48804b87c16588ad491d@syzkaller.appspotmail.com
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull networking fixes from David Miller:
1) Must perform TXQ teardown before unregistering interfaces in
mac80211, from Toke Høiland-Jørgensen.
2) Don't allow creating mac80211_hwsim with less than one channel, from
Johannes Berg.
3) Division by zero in cfg80211, fix from Johannes Berg.
4) Fix endian issue in tipc, from Haiqing Bai.
5) BPF sockmap use-after-free fixes from Daniel Borkmann.
6) Spectre-v1 in mac80211_hwsim, from Jinbum Park.
7) Missing rhashtable_walk_exit() in tipc, from Cong Wang.
8) Revert kvzalloc() conversion of AF_PACKET, it breaks mmap() when
kvzalloc() tries to use kmalloc() pages. From Eric Dumazet.
9) Fix deadlock in hv_netvsc, from Dexuan Cui.
10) Do not restart timewait timer on RST, from Florian Westphal.
11) Fix double lwstate refcount grab in ipv6, from Alexey Kodanev.
12) Unsolicit report count handling is off-by-one, fix from Hangbin Liu.
13) Sleep-in-atomic in cadence driver, from Jia-Ju Bai.
14) Respect ttl-inherit in ip6 tunnel driver, from Hangbin Liu.
15) Use-after-free in act_ife, fix from Cong Wang.
16) Missing hold to meta module in act_ife, from Vlad Buslov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (91 commits)
net: phy: sfp: Handle unimplemented hwmon limits and alarms
net: sched: action_ife: take reference to meta module
act_ife: fix a potential use-after-free
net/mlx5: Fix SQ offset in QPs with small RQ
tipc: correct spelling errors for tipc_topsrv_queue_evt() comments
tipc: correct spelling errors for struct tipc_bc_base's comment
bnxt_en: Do not adjust max_cp_rings by the ones used by RDMA.
bnxt_en: Clean up unused functions.
bnxt_en: Fix firmware signaled resource change logic in open.
sctp: not traverse asoc trans list if non-ipv6 trans exists for ipv6_flowlabel
sctp: fix invalid reference to the index variable of the iterator
net/ibm/emac: wrong emac_calc_base call was used by typo
net: sched: null actions array pointer before releasing action
vhost: fix VHOST_GET_BACKEND_FEATURES ioctl request definition
r8169: add support for NCube 8168 network card
ip6_tunnel: respect ttl inherit for ip6tnl
mac80211: shorten the IBSS debug messages
mac80211: don't Tx a deauth frame if the AP forbade Tx
mac80211: Fix station bandwidth setting after channel switch
mac80211: fix a race between restart and CSA flows
...
|
|
Recent refactoring of add_metainfo() caused use_all_metadata() to add
metainfo to ife action metalist without taking reference to module. This
causes warning in module_put called from ife action cleanup function.
Implement add_metainfo_and_get_ops() function that returns with reference
to module taken if metainfo was added successfully, and call it from
use_all_metadata(), instead of calling __add_metainfo() directly.
Example warning:
[ 646.344393] WARNING: CPU: 1 PID: 2278 at kernel/module.c:1139 module_put+0x1cb/0x230
[ 646.352437] Modules linked in: act_meta_skbtcindex act_meta_mark act_meta_skbprio act_ife ife veth nfsv3 nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c tun ebtable_filter ebtables ip6table_filter ip6_tables bridge stp llc mlx5_ib ib_uverbs ib_core intel_rapl sb_edac x86_pkg_temp_thermal mlx5_core coretemp kvm_intel kvm nfsd igb irqbypass crct10dif_pclmul devlink crc32_pclmul mei_me joydev ses crc32c_intel enclosure auth_rpcgss i2c_algo_bit ioatdma ptp mei pps_core ghash_clmulni_intel iTCO_wdt iTCO_vendor_support pcspkr dca ipmi_ssif lpc_ich target_core_mod i2c_i801 ipmi_si ipmi_devintf pcc_cpufreq wmi ipmi_msghandler nfs_acl lockd acpi_pad acpi_power_meter grace sunrpc mpt3sas raid_class scsi_transport_sas
[ 646.425631] CPU: 1 PID: 2278 Comm: tc Not tainted 4.19.0-rc1+ #799
[ 646.432187] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 646.440595] RIP: 0010:module_put+0x1cb/0x230
[ 646.445238] Code: f3 66 94 02 e8 26 ff fa ff 85 c0 74 11 0f b6 1d 51 30 94 02 80 fb 01 77 60 83 e3 01 74 13 65 ff 0d 3a 83 db 73 e9 2b ff ff ff <0f> 0b e9 00 ff ff ff e8 59 01 fb ff 85 c0 75 e4 48 c7 c2 20 62 6b
[ 646.464997] RSP: 0018:ffff880354d37068 EFLAGS: 00010286
[ 646.470599] RAX: 0000000000000000 RBX: ffffffffc0a52518 RCX: ffffffff8c2668db
[ 646.478118] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffffffffc0a52518
[ 646.485641] RBP: ffffffffc0a52180 R08: fffffbfff814a4a4 R09: fffffbfff814a4a3
[ 646.493164] R10: ffffffffc0a5251b R11: fffffbfff814a4a4 R12: 1ffff1006a9a6e0d
[ 646.500687] R13: 00000000ffffffff R14: ffff880362bab890 R15: dead000000000100
[ 646.508213] FS: 00007f4164c99800(0000) GS:ffff88036fe40000(0000) knlGS:0000000000000000
[ 646.516961] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 646.523080] CR2: 00007f41638b8420 CR3: 0000000351df0004 CR4: 00000000001606e0
[ 646.530595] Call Trace:
[ 646.533408] ? find_symbol_in_section+0x260/0x260
[ 646.538509] tcf_ife_cleanup+0x11b/0x200 [act_ife]
[ 646.543695] tcf_action_cleanup+0x29/0xa0
[ 646.548078] __tcf_action_put+0x5a/0xb0
[ 646.552289] ? nla_put+0x65/0xe0
[ 646.555889] __tcf_idr_release+0x48/0x60
[ 646.560187] tcf_generic_walker+0x448/0x6b0
[ 646.564764] ? tcf_action_dump_1+0x450/0x450
[ 646.569411] ? __lock_is_held+0x84/0x110
[ 646.573720] ? tcf_ife_walker+0x10c/0x20f [act_ife]
[ 646.578982] tca_action_gd+0x972/0xc40
[ 646.583129] ? tca_get_fill.constprop.17+0x250/0x250
[ 646.588471] ? mark_lock+0xcf/0x980
[ 646.592324] ? check_chain_key+0x140/0x1f0
[ 646.596832] ? debug_show_all_locks+0x240/0x240
[ 646.601839] ? memset+0x1f/0x40
[ 646.605350] ? nla_parse+0xca/0x1a0
[ 646.609217] tc_ctl_action+0x215/0x230
[ 646.613339] ? tcf_action_add+0x220/0x220
[ 646.617748] rtnetlink_rcv_msg+0x56a/0x6d0
[ 646.622227] ? rtnl_fdb_del+0x3f0/0x3f0
[ 646.626466] netlink_rcv_skb+0x18d/0x200
[ 646.630752] ? rtnl_fdb_del+0x3f0/0x3f0
[ 646.634959] ? netlink_ack+0x500/0x500
[ 646.639106] netlink_unicast+0x2d0/0x370
[ 646.643409] ? netlink_attachskb+0x340/0x340
[ 646.648050] ? _copy_from_iter_full+0xe9/0x3e0
[ 646.652870] ? import_iovec+0x11e/0x1c0
[ 646.657083] netlink_sendmsg+0x3b9/0x6a0
[ 646.661388] ? netlink_unicast+0x370/0x370
[ 646.665877] ? netlink_unicast+0x370/0x370
[ 646.670351] sock_sendmsg+0x6b/0x80
[ 646.674212] ___sys_sendmsg+0x4a1/0x520
[ 646.678443] ? copy_msghdr_from_user+0x210/0x210
[ 646.683463] ? lock_downgrade+0x320/0x320
[ 646.687849] ? debug_show_all_locks+0x240/0x240
[ 646.692760] ? do_raw_spin_unlock+0xa2/0x130
[ 646.697418] ? _raw_spin_unlock+0x24/0x30
[ 646.701798] ? __handle_mm_fault+0x1819/0x1c10
[ 646.706619] ? __pmd_alloc+0x320/0x320
[ 646.710738] ? debug_show_all_locks+0x240/0x240
[ 646.715649] ? restore_nameidata+0x7b/0xa0
[ 646.720117] ? check_chain_key+0x140/0x1f0
[ 646.724590] ? check_chain_key+0x140/0x1f0
[ 646.729070] ? __fget_light+0xbc/0xd0
[ 646.733121] ? __sys_sendmsg+0xd7/0x150
[ 646.737329] __sys_sendmsg+0xd7/0x150
[ 646.741359] ? __ia32_sys_shutdown+0x30/0x30
[ 646.746003] ? up_read+0x53/0x90
[ 646.749601] ? __do_page_fault+0x484/0x780
[ 646.754105] ? do_syscall_64+0x1e/0x2c0
[ 646.758320] do_syscall_64+0x72/0x2c0
[ 646.762353] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 646.767776] RIP: 0033:0x7f4163872150
[ 646.771713] Code: 8b 15 3c 7d 2b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 83 3d b9 d5 2b 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be cd 00 00 48 89 04 24
[ 646.791474] RSP: 002b:00007ffdef7d6b58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 646.799721] RAX: ffffffffffffffda RBX: 0000000000000024 RCX: 00007f4163872150
[ 646.807240] RDX: 0000000000000000 RSI: 00007ffdef7d6bd0 RDI: 0000000000000003
[ 646.814760] RBP: 000000005b8b9482 R08: 0000000000000001 R09: 0000000000000000
[ 646.822286] R10: 00000000000005e7 R11: 0000000000000246 R12: 00007ffdef7dad20
[ 646.829807] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000679bc0
[ 646.837360] irq event stamp: 6083
[ 646.841043] hardirqs last enabled at (6081): [<ffffffff8c220a7d>] __call_rcu+0x17d/0x500
[ 646.849882] hardirqs last disabled at (6083): [<ffffffff8c004f06>] trace_hardirqs_off_thunk+0x1a/0x1c
[ 646.859775] softirqs last enabled at (5968): [<ffffffff8d4004a1>] __do_softirq+0x4a1/0x6ee
[ 646.868784] softirqs last disabled at (6082): [<ffffffffc0a78759>] tcf_ife_cleanup+0x39/0x200 [act_ife]
[ 646.878845] ---[ end trace b1b8c12ffe51e657 ]---
Fixes: 5ffe57da29b3 ("act_ife: fix a potential deadlock")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Immediately after module_put(), user could delete this
module, so e->ops could be already freed before we call
e->ops->release().
Fix this by moving module_put() after ops->release().
Fixes: ef6980b6becb ("introduce IFE action")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Here are quite a large number of fixes, notably:
* various A-MSDU building fixes (currently only affects mt76)
* syzkaller & spectre fixes in hwsim
* TXQ vs. teardown fix that was causing crashes
* embed WMM info in reg rule, bad code here had been causing crashes
* one compilation issue with fix from Arnd (rfkill-gpio includes)
* fixes for a race and bad data during/after channel switch
* nl80211: a validation fix, attribute type & unit fixes
along with other small fixes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tipc_conn_queue_evt -> tipc_topsrv_queue_evt
tipc_send_work -> tipc_conn_send_work
tipc_send_to_sock -> tipc_conn_send_to_sock
Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Trivial fix for two spelling mistakes.
Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When users set params.spp_address and get a trans, ipv6_flowlabel flag
should be applied into this trans. But even if this one is not an ipv6
trans, it should not go to apply it into all other transes of the asoc
but simply ignore it.
Fixes: 0b0dce7a36fb ("sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparams")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now in sctp_apply_peer_addr_params(), if SPP_IPV6_FLOWLABEL flag is set
and trans is NULL, it would use trans as the index variable to traverse
transport_addr_list, then trans is set as the last transport of it.
Later, if SPP_DSCP flag is set, it would enter into the wrong branch as
trans is actually an invalid reference.
So fix it by using a new index variable to traverse transport_addr_list
for both SPP_DSCP and SPP_IPV6_FLOWLABEL flags process.
Fixes: 0b0dce7a36fb ("sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparams")
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, tcf_action_delete() nulls actions array pointer after putting
and deleting it. However, if tcf_idr_delete_index() returns an error,
pointer to action is not set to null. That results it being released second
time in error handling code of tca_action_gd().
Kasan error:
[ 807.367755] ==================================================================
[ 807.375844] BUG: KASAN: use-after-free in tc_setup_cb_call+0x14e/0x250
[ 807.382763] Read of size 8 at addr ffff88033e636000 by task tc/2732
[ 807.391289] CPU: 0 PID: 2732 Comm: tc Tainted: G W 4.19.0-rc1+ #799
[ 807.399542] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 807.407948] Call Trace:
[ 807.410763] dump_stack+0x92/0xeb
[ 807.414456] print_address_description+0x70/0x360
[ 807.419549] kasan_report+0x14d/0x300
[ 807.423582] ? tc_setup_cb_call+0x14e/0x250
[ 807.428150] tc_setup_cb_call+0x14e/0x250
[ 807.432539] ? nla_put+0x65/0xe0
[ 807.436146] fl_dump+0x394/0x3f0 [cls_flower]
[ 807.440890] ? fl_tmplt_dump+0x140/0x140 [cls_flower]
[ 807.446327] ? lock_downgrade+0x320/0x320
[ 807.450702] ? lock_acquire+0xe2/0x220
[ 807.454819] ? is_bpf_text_address+0x5/0x140
[ 807.459475] ? memcpy+0x34/0x50
[ 807.462980] ? nla_put+0x65/0xe0
[ 807.466582] tcf_fill_node+0x341/0x430
[ 807.470717] ? tcf_block_put+0xe0/0xe0
[ 807.474859] tcf_node_dump+0xdb/0xf0
[ 807.478821] fl_walk+0x8e/0x170 [cls_flower]
[ 807.483474] tcf_chain_dump+0x35a/0x4d0
[ 807.487703] ? tfilter_notify+0x170/0x170
[ 807.492091] ? tcf_fill_node+0x430/0x430
[ 807.496411] tc_dump_tfilter+0x362/0x3f0
[ 807.500712] ? tc_del_tfilter+0x850/0x850
[ 807.505104] ? kasan_unpoison_shadow+0x30/0x40
[ 807.509940] ? __mutex_unlock_slowpath+0xcf/0x410
[ 807.515031] netlink_dump+0x263/0x4f0
[ 807.519077] __netlink_dump_start+0x2a0/0x300
[ 807.523817] ? tc_del_tfilter+0x850/0x850
[ 807.528198] rtnetlink_rcv_msg+0x46a/0x6d0
[ 807.532671] ? rtnl_fdb_del+0x3f0/0x3f0
[ 807.536878] ? tc_del_tfilter+0x850/0x850
[ 807.541280] netlink_rcv_skb+0x18d/0x200
[ 807.545570] ? rtnl_fdb_del+0x3f0/0x3f0
[ 807.549773] ? netlink_ack+0x500/0x500
[ 807.553913] netlink_unicast+0x2d0/0x370
[ 807.558212] ? netlink_attachskb+0x340/0x340
[ 807.562855] ? _copy_from_iter_full+0xe9/0x3e0
[ 807.567677] ? import_iovec+0x11e/0x1c0
[ 807.571890] netlink_sendmsg+0x3b9/0x6a0
[ 807.576192] ? netlink_unicast+0x370/0x370
[ 807.580684] ? netlink_unicast+0x370/0x370
[ 807.585154] sock_sendmsg+0x6b/0x80
[ 807.589015] ___sys_sendmsg+0x4a1/0x520
[ 807.593230] ? copy_msghdr_from_user+0x210/0x210
[ 807.598232] ? do_wp_page+0x174/0x880
[ 807.602276] ? __handle_mm_fault+0x749/0x1c10
[ 807.607021] ? __handle_mm_fault+0x1046/0x1c10
[ 807.611849] ? __pmd_alloc+0x320/0x320
[ 807.615973] ? check_chain_key+0x140/0x1f0
[ 807.620450] ? check_chain_key+0x140/0x1f0
[ 807.624929] ? __fget_light+0xbc/0xd0
[ 807.628970] ? __sys_sendmsg+0xd7/0x150
[ 807.633172] __sys_sendmsg+0xd7/0x150
[ 807.637201] ? __ia32_sys_shutdown+0x30/0x30
[ 807.641846] ? up_read+0x53/0x90
[ 807.645442] ? __do_page_fault+0x484/0x780
[ 807.649949] ? do_syscall_64+0x1e/0x2c0
[ 807.654164] do_syscall_64+0x72/0x2c0
[ 807.658198] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 807.663625] RIP: 0033:0x7f42e9870150
[ 807.667568] Code: 8b 15 3c 7d 2b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 83 3d b9 d5 2b 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be cd 00 00 48 89 04 24
[ 807.687328] RSP: 002b:00007ffdbf595b58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 807.695564] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f42e9870150
[ 807.703083] RDX: 0000000000000000 RSI: 00007ffdbf595b80 RDI: 0000000000000003
[ 807.710605] RBP: 00007ffdbf599d90 R08: 0000000000679bc0 R09: 000000000000000f
[ 807.718127] R10: 00000000000005e7 R11: 0000000000000246 R12: 00007ffdbf599d88
[ 807.725651] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 807.735048] Allocated by task 2687:
[ 807.738902] kasan_kmalloc+0xa0/0xd0
[ 807.742852] __kmalloc+0x118/0x2d0
[ 807.746615] tcf_idr_create+0x44/0x320
[ 807.750738] tcf_nat_init+0x41e/0x530 [act_nat]
[ 807.755638] tcf_action_init_1+0x4e0/0x650
[ 807.760104] tcf_action_init+0x1ce/0x2d0
[ 807.764395] tcf_exts_validate+0x1d8/0x200
[ 807.768861] fl_change+0x55a/0x26b4 [cls_flower]
[ 807.773845] tc_new_tfilter+0x748/0xa20
[ 807.778051] rtnetlink_rcv_msg+0x56a/0x6d0
[ 807.782517] netlink_rcv_skb+0x18d/0x200
[ 807.786804] netlink_unicast+0x2d0/0x370
[ 807.791095] netlink_sendmsg+0x3b9/0x6a0
[ 807.795387] sock_sendmsg+0x6b/0x80
[ 807.799240] ___sys_sendmsg+0x4a1/0x520
[ 807.803445] __sys_sendmsg+0xd7/0x150
[ 807.807473] do_syscall_64+0x72/0x2c0
[ 807.811506] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 807.818776] Freed by task 2728:
[ 807.822283] __kasan_slab_free+0x122/0x180
[ 807.826752] kfree+0xf4/0x2f0
[ 807.830080] __tcf_action_put+0x5a/0xb0
[ 807.834281] tcf_action_put_many+0x46/0x70
[ 807.838747] tca_action_gd+0x232/0xc40
[ 807.842862] tc_ctl_action+0x215/0x230
[ 807.846977] rtnetlink_rcv_msg+0x56a/0x6d0
[ 807.851444] netlink_rcv_skb+0x18d/0x200
[ 807.855731] netlink_unicast+0x2d0/0x370
[ 807.860021] netlink_sendmsg+0x3b9/0x6a0
[ 807.864312] sock_sendmsg+0x6b/0x80
[ 807.868166] ___sys_sendmsg+0x4a1/0x520
[ 807.872372] __sys_sendmsg+0xd7/0x150
[ 807.876401] do_syscall_64+0x72/0x2c0
[ 807.880431] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 807.887704] The buggy address belongs to the object at ffff88033e636000
which belongs to the cache kmalloc-256 of size 256
[ 807.900909] The buggy address is located 0 bytes inside of
256-byte region [ffff88033e636000, ffff88033e636100)
[ 807.913155] The buggy address belongs to the page:
[ 807.918322] page:ffffea000cf98d80 count:1 mapcount:0 mapping:ffff88036f80ee00 index:0x0 compound_mapcount: 0
[ 807.928831] flags: 0x5fff8000008100(slab|head)
[ 807.933647] raw: 005fff8000008100 ffffea000db44f00 0000000400000004 ffff88036f80ee00
[ 807.942050] raw: 0000000000000000 0000000080190019 00000001ffffffff 0000000000000000
[ 807.950456] page dumped because: kasan: bad access detected
[ 807.958240] Memory state around the buggy address:
[ 807.963405] ffff88033e635f00: fc fc fc fc fb fb fb fb fb fb fb fc fc fc fc fb
[ 807.971288] ffff88033e635f80: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
[ 807.979166] >ffff88033e636000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 807.994882] ^
[ 807.998477] ffff88033e636080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 808.006352] ffff88033e636100: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
[ 808.014230] ==================================================================
[ 808.022108] Disabling lock debugging due to kernel taint
Fixes: edfaf94fa705 ("net_sched: improve and refactor tcf_action_put_many()")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
man ip-tunnel ttl section says:
0 is a special value meaning that packets inherit the TTL value.
IPv4 tunnel respect this in ip_tunnel_xmit(), but IPv6 tunnel has not
implement it yet. To make IPv6 behave consistently with IP tunnel,
add ipv6 tunnel inherit support.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When tracing is enabled, all the debug messages are recorded and must
not exceed MAX_MSG_LEN (100) columns. Longer debug messages grant the
user with:
WARNING: CPU: 3 PID: 32642 at /tmp/wifi-core-20180806094828/src/iwlwifi-stack-dev/net/mac80211/./trace_msg.h:32 trace_event_raw_event_mac80211_msg_event+0xab/0xc0 [mac80211]
Workqueue: phy1 ieee80211_iface_work [mac80211]
RIP: 0010:trace_event_raw_event_mac80211_msg_event+0xab/0xc0 [mac80211]
Call Trace:
__sdata_dbg+0xbd/0x120 [mac80211]
ieee80211_ibss_rx_queued_mgmt+0x15f/0x510 [mac80211]
ieee80211_iface_work+0x21d/0x320 [mac80211]
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
If the driver fails to properly prepare for the channel
switch, mac80211 will disconnect. If the CSA IE had mode
set to 1, it means that the clients are not allowed to send
any Tx on the current channel, and that includes the
deauthentication frame.
Make sure that we don't send the deauthentication frame in
this case.
In iwlwifi, this caused a failure to flush queues since the
firmware already closed the queues after having parsed the
CSA IE. Then mac80211 would wait until the deauthentication
frame would go out (drv_flush(drop=false)) and that would
never happen.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
When performing a channel switch flow for a managed interface, the
flow did not update the bandwidth of the AP station and the rate
scale algorithm. In case of a channel width downgrade, this would
result with the rate scale algorithm using a bandwidth that does not
match the interface channel configuration.
Fix this by updating the AP station bandwidth and rate scaling algorithm
before the actual channel change in case of a bandwidth downgrade, or
after the actual channel change in case of a bandwidth upgrade.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
We hit a problem with iwlwifi that was caused by a bug in
mac80211. A bug in iwlwifi caused the firwmare to crash in
certain cases in channel switch. Because of that bug,
drv_pre_channel_switch would fail and trigger the restart
flow.
Now we had the hw restart worker which runs on the system's
workqueue and the csa_connection_drop_work worker that runs
on mac80211's workqueue that can run together. This is
obviously problematic since the restart work wants to
reconfigure the connection, while the csa_connection_drop_work
worker does the exact opposite: it tries to disconnect.
Fix this by cancelling the csa_connection_drop_work worker
in the restart worker.
Note that this can sound racy: we could have:
driver iface_work CSA_work restart_work
+++++++++++++++++++++++++++++++++++++++++++++
|
<--drv_cs ---|
<FW CRASH!>
-CS FAILED-->
| |
| cancel_work(CSA)
schedule |
CSA work |
| |
Race between those 2
But this is not possible because we flush the workqueue
in the restart worker before we cancel the CSA worker.
That would be bullet proof if we could guarantee that
we schedule the CSA worker only from the iface_work
which runs on the workqueue (and not on the system's
workqueue), but unfortunately we do have an instance
in which we schedule the CSA work outside the context
of the workqueue (ieee80211_chswitch_done).
Note also that we should probably cancel other workers
like beacon_connection_loss_work and possibly others
for different types of interfaces, at the very least,
IBSS should suffer from the exact same problem, but for
now, do the minimum to fix the actual bug that was actually
experienced and reproduced.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
In commit 9236c4523e5b ("mac80211: limit wmm params to comply
with ETSI requirements"), we have limited the WMM parameters to
comply with 802.11 and ETSI standard. Mistakenly the TXOP value
was caluclated wrong. Fix it by taking the minimum between
802.11 to ETSI to make sure we are not violating both.
Fixes: e552af058148 ("mac80211: limit wmm params to comply with ETSI requirements")
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The "chandef->center_freq1" variable is a u32 but "freq" is a u16 so we
are truncating away the high bits. I noticed this bug because in commit
9cf0a0b4b64a ("cfg80211: Add support for 60GHz band channels 5 and 6")
we made "freq <= 56160 + 2160 * 6" a valid requency when before it was
only "freq <= 56160 + 2160 * 4" that was valid. It introduces a static
checker warning:
net/wireless/util.c:1571 ieee80211_chandef_to_operating_class()
warn: always true condition '(freq <= 56160 + 2160 * 6) => (0-u16max <= 69120)'
But really we probably shouldn't have been truncating the high bits
away to begin with.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Initialize 'n' to 2 in order to take into account also the first
packet in the estimation of max_subframe limit for a given A-MSDU
since frag_tail pointer is NULL when ieee80211_amsdu_aggregate
routine analyzes the second frame.
Fixes: 6e0456b54545 ("mac80211: add A-MSDU tx support")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2018-09-02
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix one remaining buggy offset override in sockmap's bpf_msg_pull_data()
when linearizing multiple scatterlist elements, from Tushar.
2) Fix BPF sockmap's misuse of ULP when a collision with another ULP is
found on map update where it would release existing ULP. syzbot found and
triggered this couple of times now, fix from John.
3) Add missing xskmap type to bpftool so it will properly show the type
on map dump, from Prashant.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jan reported a regression after an update to 4.18.5. In this case ipv6
default route is setup by systemd-networkd based on data from an RA. The
RA contains an MTU of 1492 which is used when the route is first inserted
but then systemd-networkd pushes down updates to the default route
without the mtu set.
Prior to the change to fib6_info, metrics such as MTU were held in the
dst_entry and rt6i_pmtu in rt6_info contained an update to the mtu if
any. ip6_mtu would look at rt6i_pmtu first and use it if set. If not,
the value from the metrics is used if it is set and finally falling
back to the idev value.
After the fib6_info change metrics are contained in the fib6_info struct
and there is no equivalent to rt6i_pmtu. To maintain consistency with
the old behavior the new code should only reset the MTU in the metrics
if the route update has it set.
Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info")
Reported-by: Jan Janssen <medhefgo@web.de>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After link down and up, i.e. when call ip_mc_up(), we doesn't init
im->unsolicit_count. So after igmp_timer_expire(), we will not start
timer again and only send one unsolicit report at last.
Fix it by initializing im->unsolicit_count in igmp_group_added(), so
we can respect igmp robustness value.
Fixes: 24803f38a5c0b ("igmp: do not remove igmp souce list info when set link down")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We should not start timer if im->unsolicit_count equal to 0 after decrease.
Or we will send one more unsolicit report message. i.e. 3 instead of 2 by
default.
Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Helper bpf_msg_pull_data() mistakenly reuses variable 'offset' while
linearizing multiple scatterlist elements. Variable 'offset' is used
to find first starting scatterlist element
i.e. msg->data = sg_virt(&sg[first_sg]) + start - offset"
Use different variable name while linearizing multiple scatterlist
elements so that value contained in variable 'offset' won't get
overwritten.
Fixes: 015632bb30da ("bpf: sk_msg program helper bpf_sk_msg_pull_data")
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fixes from Thomas Gleixner:
"A small set of updates for core code:
- Prevent tracing in functions which are called from trace patching
via stop_machine() to prevent executing half patched function trace
entries.
- Remove old GCC workarounds
- Remove pointless includes of notifier.h"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Remove workaround for unreachable warnings from old GCC
notifier: Remove notifier header file wherever not used
watchdog: Mark watchdog touch functions as notrace
|
|
Commit 80f1a0f4e0cd ("net/ipv6: Put lwtstate when destroying fib6_info")
partially fixed the kmemleak [1], lwtstate can be copied from fib6_info,
with ip6_rt_copy_init(), and it should be done only once there.
rt->dst.lwtstate is set by ip6_rt_init_dst(), at the start of the function
ip6_rt_copy_init(), so there is no need to get it again at the end.
With this patch, lwtstate also isn't copied from RTF_REJECT routes.
[1]:
unreferenced object 0xffff880b6aaa14e0 (size 64):
comm "ip", pid 10577, jiffies 4295149341 (age 1273.903s)
hex dump (first 32 bytes):
01 00 04 00 04 00 00 00 10 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<0000000018664623>] lwtunnel_build_state+0x1bc/0x420
[<00000000b73aa29a>] ip6_route_info_create+0x9f7/0x1fd0
[<00000000ee2c5d1f>] ip6_route_add+0x14/0x70
[<000000008537b55c>] inet6_rtm_newroute+0xd9/0xe0
[<000000002acc50f5>] rtnetlink_rcv_msg+0x66f/0x8e0
[<000000008d9cd381>] netlink_rcv_skb+0x268/0x3b0
[<000000004c893c76>] netlink_unicast+0x417/0x5a0
[<00000000f2ab1afb>] netlink_sendmsg+0x70b/0xc30
[<00000000890ff0aa>] sock_sendmsg+0xb1/0xf0
[<00000000a2e7b66f>] ___sys_sendmsg+0x659/0x950
[<000000001e7426c8>] __sys_sendmsg+0xde/0x170
[<00000000fe411443>] do_syscall_64+0x9f/0x4a0
[<000000001be7b28b>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[<000000006d21f353>] 0xffffffffffffffff
Fixes: 6edb3c96a5f0 ("net/ipv6: Defer initialization of dst to data path")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RFC 1337 says:
''Ignore RST segments in TIME-WAIT state.
If the 2 minute MSL is enforced, this fix avoids all three hazards.''
So with net.ipv4.tcp_rfc1337=1, expected behaviour is to have TIME-WAIT sk
expire rather than removing it instantly when a reset is received.
However, Linux will also re-start the TIME-WAIT timer.
This causes connect to fail when tying to re-use ports or very long
delays (until syn retry interval exceeds MSL).
packetdrill test case:
// Demonstrate bogus rearming of TIME-WAIT timer in rfc1337 mode.
`sysctl net.ipv4.tcp_rfc1337=1`
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0
0.100 < S 0:0(0) win 29200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.200 < . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
// Receive first segment
0.310 < P. 1:1001(1000) ack 1 win 46
// Send one ACK
0.310 > . 1:1(0) ack 1001
// read 1000 byte
0.310 read(4, ..., 1000) = 1000
// Application writes 100 bytes
0.350 write(4, ..., 100) = 100
0.350 > P. 1:101(100) ack 1001
// ACK
0.500 < . 1001:1001(0) ack 101 win 257
// close the connection
0.600 close(4) = 0
0.600 > F. 101:101(0) ack 1001 win 244
// Our side is in FIN_WAIT_1 & waits for ack to fin
0.7 < . 1001:1001(0) ack 102 win 244
// Our side is in FIN_WAIT_2 with no outstanding data.
0.8 < F. 1001:1001(0) ack 102 win 244
0.8 > . 102:102(0) ack 1002 win 244
// Our side is now in TIME_WAIT state, send ack for fin.
0.9 < F. 1002:1002(0) ack 102 win 244
0.9 > . 102:102(0) ack 1002 win 244
// Peer reopens with in-window SYN:
1.000 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
// Therefore, reply with ACK.
1.000 > . 102:102(0) ack 1002 win 244
// Peer sends RST for this ACK. Normally this RST results
// in tw socket removal, but rfc1337=1 setting prevents this.
1.100 < R 1002:1002(0) win 244
// second syn. Due to rfc1337=1 expect another pure ACK.
31.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
31.0 > . 102:102(0) ack 1002 win 244
// .. and another RST from peer.
31.1 < R 1002:1002(0) win 244
31.2 `echo no timer restart;ss -m -e -a -i -n -t -o state TIME-WAIT`
// third syn after one minute. Time-Wait socket should have expired by now.
63.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
// so we expect a syn-ack & 3whs to proceed from here on.
63.0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
Without this patch, 'ss' shows restarts of tw timer and last packet is
thus just another pure ack, more than one minute later.
This restores the original code from commit 283fd6cf0be690a83
("Merge in ANK networking jumbo patch") in netdev-vger-cvs.git .
For some reason the else branch was removed/lost in 1f28b683339f7
("Merge in TCP/UDP optimizations and [..]") and timer restart became
unconditional.
Reported-by: Michal Tesar <mtesar@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Getting prompt "The RDS Protocol" (RDS) is not too helpful, and it is
easily confused with Radio Data System (which we may want to support
in kernel, too).
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 71e41286203c017d24f041a7cd71abea7ca7b1e0.
mmap()/munmap() can not be backed by kmalloced pages :
We fault in :
VM_BUG_ON_PAGE(PageSlab(page), page);
unmap_single_vma+0x8a/0x110
unmap_vmas+0x4b/0x90
unmap_region+0xc9/0x140
do_munmap+0x274/0x360
vm_munmap+0x81/0xc0
SyS_munmap+0x2b/0x40
do_syscall_64+0x13e/0x1c0
entry_SYSCALL_64_after_hwframe+0x42/0xb7
Fixes: 71e41286203c ("packet: switch kvzalloc to allocate memory")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: John Sperbeck <jsperbeck@google.com>
Bisected-by: John Sperbeck <jsperbeck@google.com>
Cc: Zhang Yu <zhangyu31@baidu.com>
Cc: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The conversion of the hotplug notifiers to a state machine left the
notifier.h includes around in some places. Remove them.
Signed-off-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1535114033-4605-1-git-send-email-mojha@codeaurora.org
|
|
In the error path of changing the SKB headroom of the second
A-MSDU subframe, we would not account for the already-changed
length of the first frame that just got converted to be in
A-MSDU format and thus is a bit longer now.
Fix this by doing the necessary accounting.
It would be possible to reorder the operations, but that would
make the code more complex (to calculate the necessary pad),
and the headroom expansion should not fail frequently enough
to make that worthwhile.
Fixes: 6e0456b54545 ("mac80211: add A-MSDU tx support")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Do not start to aggregate packets in a A-MSDU frame (converting the
first subframe to A-MSDU, adding the header) if max_tx_fragments or
max_amsdu_subframes limits are already exceeded by it. In particular,
this happens when drivers set the limit to 1 to avoid A-MSDUs at all.
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
[reword commit message to be more precise]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
nl80211_update_ft_ies() tried to validate NL80211_ATTR_IE with
is_valid_ie_attr() before dereferencing it, but that helper function
returns true in case of NULL pointer (i.e., attribute not included).
This can result to dereferencing a NULL pointer. Fix that by explicitly
checking that NL80211_ATTR_IE is included.
Fixes: 355199e02b83 ("cfg80211: Extend support for IEEE 802.11r Fast BSS Transition")
Signed-off-by: Arunk Khandavalli <akhandav@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
After the commit 802bfb19152c ("net/sched: user-space can't set
unknown tcfa_action values"), unknown tcfa_action values are
converted to TC_ACT_UNSPEC, but the common agreement is instead
rejecting such configurations.
This change also introduces a helper to simplify the destruction
of a single action, avoiding code duplication.
v1 -> v2:
- helper is now static and renamed according to act_* convention
- updated extack message, according to the new behavior
Fixes: 802bfb19152c ("net/sched: user-space can't set unknown tcfa_action values")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
rtnl_unregister_all(PF_INET6) gets called from inet6_init in cases when
no handler has been registered for PF_INET6 yet, for example if
ip6_mr_init() fails. Abort and avoid a NULL pointer deref in that case.
Example of panic (triggered by faking a failure of
register_pernet_subsys):
general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI
[...]
RIP: 0010:rtnl_unregister_all+0x17e/0x2a0
[...]
Call Trace:
? rtnetlink_net_init+0x250/0x250
? sock_unregister+0x103/0x160
? kernel_getsockopt+0x200/0x200
inet6_init+0x197/0x20d
Fixes: e2fddf5e96df ("[IPV6]: Make af_inet6 to check ip6_route_init return value.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 6d0bfe226116 ("net: ipv6: Add IPv6 support to the ping socket.")
contains an error in the cleanup path of inet6_init(): when
proto_register(&pingv6_prot, 1) fails, we try to unregister
&pingv6_prot. When rawv6_init() fails, we skip unregistering
&pingv6_prot.
Example of panic (triggered by faking a failure of
proto_register(&pingv6_prot, 1)):
general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI
[...]
RIP: 0010:__list_del_entry_valid+0x79/0x160
[...]
Call Trace:
proto_unregister+0xbb/0x550
? trace_preempt_on+0x6f0/0x6f0
? sock_no_shutdown+0x10/0x10
inet6_init+0x153/0x1b8
Fixes: 6d0bfe226116 ("net: ipv6: Add IPv6 support to the ping socket.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 15e668070a64 ("ipv6: reorder icmpv6_init() and ip6_mr_init()")
moved the cleanup label for ipmr_fail, but should have changed the
contents of the cleanup labels as well. Now we can end up cleaning up
icmpv6 even though it hasn't been initialized (jump to icmp_fail or
ipmr_fail).
Simply undo things in the reverse order of their initialization.
Example of panic (triggered by faking a failure of icmpv6_init):
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI
[...]
RIP: 0010:__list_del_entry_valid+0x79/0x160
[...]
Call Trace:
? lock_release+0x8a0/0x8a0
unregister_pernet_operations+0xd4/0x560
? ops_free_list+0x480/0x480
? down_write+0x91/0x130
? unregister_pernet_subsys+0x15/0x30
? down_read+0x1b0/0x1b0
? up_read+0x110/0x110
? kmem_cache_create_usercopy+0x1b4/0x240
unregister_pernet_subsys+0x1d/0x30
icmpv6_cleanup+0x1d/0x30
inet6_init+0x1b5/0x23f
Fixes: 15e668070a64 ("ipv6: reorder icmpv6_init() and ip6_mr_init()")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
in the (rare) case of failure in nla_nest_start(), missing NULL checks in
tcf_pedit_key_ex_dump() can make the following command
# tc action add action pedit ex munge ip ttl set 64
dereference a NULL pointer:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 800000007d1cd067 P4D 800000007d1cd067 PUD 7acd3067 PMD 0
Oops: 0002 [#1] SMP PTI
CPU: 0 PID: 3336 Comm: tc Tainted: G E 4.18.0.pedit+ #425
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:tcf_pedit_dump+0x19d/0x358 [act_pedit]
Code: be 02 00 00 00 48 89 df 66 89 44 24 20 e8 9b b1 fd e0 85 c0 75 46 8b 83 c8 00 00 00 49 83 c5 08 48 03 83 d0 00 00 00 4d 39 f5 <66> 89 04 25 00 00 00 00 0f 84 81 01 00 00 41 8b 45 00 48 8d 4c 24
RSP: 0018:ffffb5d4004478a8 EFLAGS: 00010246
RAX: ffff8880fcda2070 RBX: ffff8880fadd2900 RCX: 0000000000000000
RDX: 0000000000000002 RSI: ffffb5d4004478ca RDI: ffff8880fcda206e
RBP: ffff8880fb9cb900 R08: 0000000000000008 R09: ffff8880fcda206e
R10: ffff8880fadd2900 R11: 0000000000000000 R12: ffff8880fd26cf40
R13: ffff8880fc957430 R14: ffff8880fc957430 R15: ffff8880fb9cb988
FS: 00007f75a537a740(0000) GS:ffff8880fda00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000007a2fa005 CR4: 00000000001606f0
Call Trace:
? __nla_reserve+0x38/0x50
tcf_action_dump_1+0xd2/0x130
tcf_action_dump+0x6a/0xf0
tca_get_fill.constprop.31+0xa3/0x120
tcf_action_add+0xd1/0x170
tc_ctl_action+0x137/0x150
rtnetlink_rcv_msg+0x263/0x2d0
? _cond_resched+0x15/0x40
? rtnl_calcit.isra.30+0x110/0x110
netlink_rcv_skb+0x4d/0x130
netlink_unicast+0x1a3/0x250
netlink_sendmsg+0x2ae/0x3a0
sock_sendmsg+0x36/0x40
___sys_sendmsg+0x26f/0x2d0
? do_wp_page+0x8e/0x5f0
? handle_pte_fault+0x6c3/0xf50
? __handle_mm_fault+0x38e/0x520
? __sys_sendmsg+0x5e/0xa0
__sys_sendmsg+0x5e/0xa0
do_syscall_64+0x5b/0x180
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f75a4583ba0
Code: c3 48 8b 05 f2 62 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d fd c3 2c 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae cc 00 00 48 89 04 24
RSP: 002b:00007fff60ee7418 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fff60ee7540 RCX: 00007f75a4583ba0
RDX: 0000000000000000 RSI: 00007fff60ee7490 RDI: 0000000000000003
RBP: 000000005b842d3e R08: 0000000000000002 R09: 0000000000000000
R10: 00007fff60ee6ea0 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff60ee7554 R14: 0000000000000001 R15: 000000000066c100
Modules linked in: act_pedit(E) ip6table_filter ip6_tables iptable_filter binfmt_misc crct10dif_pclmul ext4 crc32_pclmul mbcache ghash_clmulni_intel jbd2 pcbc snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer cryptd glue_helper snd joydev pcspkr soundcore virtio_balloon i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net net_failover virtio_blk virtio_console failover qxl crc32c_intel drm_kms_helper syscopyarea serio_raw sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix virtio_pci libata virtio_ring i2c_core virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_pedit]
CR2: 0000000000000000
Like it's done for other TC actions, give up dumping pedit rules and return
an error if nla_nest_start() returns NULL.
Fixes: 71d0ed7079df ("net/act_pedit: Support using offset relative to the conventional network headers")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
syzbot reported a use-after-free in tipc_group_fill_sock_diag(),
where tipc_group_fill_sock_diag() still reads tsk->group meanwhile
tipc_group_delete() just deletes it in tipc_release().
tipc_nl_sk_walk() aims to lock this sock when walking each sock
in the hash table to close race conditions with sock changes like
this one, by acquiring tsk->sk.sk_lock.slock spinlock, unfortunately
this doesn't work at all. All non-BH call path should take
lock_sock() instead to make it work.
tipc_nl_sk_walk() brutally iterates with raw rht_for_each_entry_rcu()
where RCU read lock is required, this is the reason why lock_sock()
can't be taken on this path. This could be resolved by switching to
rhashtable iterator API's, where taking a sleepable lock is possible.
Also, the iterator API's are friendly for restartable calls like
diag dump, the last position is remembered behind the scence,
all we need to do here is saving the iterator into cb->args[].
I tested this with parallel tipc diag dump and thousands of tipc
socket creation and release, no crash or memory leak.
Reported-by: syzbot+b9c8f3ab2994b7cd1625@syzkaller.appspotmail.com
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
rhashtable_walk_exit() must be paired with rhashtable_walk_enter().
Fixes: 40f9f4397060 ("tipc: Fix tipc_sk_reinit race conditions")
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Before the commit d6990976af7c ("vti6: fix PMTU caching and reporting
on xmit") '!skb->ignore_df' check was always true because the function
skb_scrub_packet() was called before it, resetting ignore_df to zero.
In the commit, skb_scrub_packet() was moved below, and now this check
can be false for the packet, e.g. when sending it in the two fragments,
this prevents successful PMTU updates in such case. The next attempts
to send the packet lead to the same tx error. Moreover, vti6 initial
MTU value relies on PMTU adjustments.
This issue can be reproduced with the following LTP test script:
udp_ipsec_vti.sh -6 -p ah -m tunnel -s 2000
Fixes: ccd740cbc6e0 ("vti6: Add pmtu handling to vti6_xmit.")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When we perform the sg shift repair for the scatterlist ring, we
currently start out at i = first_sg + 1. However, this is not
correct since the first_sg could point to the sge sitting at slot
MAX_SKB_FRAGS - 1, and a subsequent i = MAX_SKB_FRAGS will access
the scatterlist ring (sg) out of bounds. Add the sk_msg_iter_var()
helper for iterating through the ring, and apply the same rule
for advancing to the next ring element as we do elsewhere. Later
work will use this helper also in other places.
Fixes: 015632bb30da ("bpf: sk_msg program helper bpf_sk_msg_pull_data")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|