Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit 9410d386d0a829ace9558336263086c2fbbe8aed ]
__qdisc_drop_all() accesses skb->prev to get to the tail of the
segment-list.
With commit 68d2f84a1368 ("net: gro: properly remove skb from list")
the skb-list handling has been changed to set skb->next to NULL and set
the list-poison on skb->prev.
With that change, __qdisc_drop_all() will panic when it tries to
dereference skb->prev.
Since commit 992cba7e276d ("net: Add and use skb_list_del_init().")
__list_del_entry is used, leaving skb->prev unchanged (thus,
pointing to the list-head if it's the first skb of the list).
This will make __qdisc_drop_all modify the next-pointer of the list-head
and result in a panic later on:
[ 34.501053] general protection fault: 0000 [#1] SMP KASAN PTI
[ 34.501968] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.20.0-rc2.mptcp #108
[ 34.502887] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
[ 34.504074] RIP: 0010:dev_gro_receive+0x343/0x1f90
[ 34.504751] Code: e0 48 c1 e8 03 42 80 3c 30 00 0f 85 4a 1c 00 00 4d 8b 24 24 4c 39 65 d0 0f 84 0a 04 00 00 49 8d 7c 24 38 48 89 f8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 04
[ 34.507060] RSP: 0018:ffff8883af507930 EFLAGS: 00010202
[ 34.507761] RAX: 0000000000000007 RBX: ffff8883970b2c80 RCX: 1ffff11072e165a6
[ 34.508640] RDX: 1ffff11075867008 RSI: ffff8883ac338040 RDI: 0000000000000038
[ 34.509493] RBP: ffff8883af5079d0 R08: ffff8883970b2d40 R09: 0000000000000062
[ 34.510346] R10: 0000000000000034 R11: 0000000000000000 R12: 0000000000000000
[ 34.511215] R13: 0000000000000000 R14: dffffc0000000000 R15: ffff8883ac338008
[ 34.512082] FS: 0000000000000000(0000) GS:ffff8883af500000(0000) knlGS:0000000000000000
[ 34.513036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 34.513741] CR2: 000055ccc3e9d020 CR3: 00000003abf32000 CR4: 00000000000006e0
[ 34.514593] Call Trace:
[ 34.514893] <IRQ>
[ 34.515157] napi_gro_receive+0x93/0x150
[ 34.515632] receive_buf+0x893/0x3700
[ 34.516094] ? __netif_receive_skb+0x1f/0x1a0
[ 34.516629] ? virtnet_probe+0x1b40/0x1b40
[ 34.517153] ? __stable_node_chain+0x4d0/0x850
[ 34.517684] ? kfree+0x9a/0x180
[ 34.518067] ? __kasan_slab_free+0x171/0x190
[ 34.518582] ? detach_buf+0x1df/0x650
[ 34.519061] ? lapic_next_event+0x5a/0x90
[ 34.519539] ? virtqueue_get_buf_ctx+0x280/0x7f0
[ 34.520093] virtnet_poll+0x2df/0xd60
[ 34.520533] ? receive_buf+0x3700/0x3700
[ 34.521027] ? qdisc_watchdog_schedule_ns+0xd5/0x140
[ 34.521631] ? htb_dequeue+0x1817/0x25f0
[ 34.522107] ? sch_direct_xmit+0x142/0xf30
[ 34.522595] ? virtqueue_napi_schedule+0x26/0x30
[ 34.523155] net_rx_action+0x2f6/0xc50
[ 34.523601] ? napi_complete_done+0x2f0/0x2f0
[ 34.524126] ? kasan_check_read+0x11/0x20
[ 34.524608] ? _raw_spin_lock+0x7d/0xd0
[ 34.525070] ? _raw_spin_lock_bh+0xd0/0xd0
[ 34.525563] ? kvm_guest_apic_eoi_write+0x6b/0x80
[ 34.526130] ? apic_ack_irq+0x9e/0xe0
[ 34.526567] __do_softirq+0x188/0x4b5
[ 34.527015] irq_exit+0x151/0x180
[ 34.527417] do_IRQ+0xdb/0x150
[ 34.527783] common_interrupt+0xf/0xf
[ 34.528223] </IRQ>
This patch makes sure that skb->prev is set to NULL when entering
netem_enqueue.
Cc: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: 68d2f84a1368 ("net: gro: properly remove skb from list")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e72bde6b66299602087c8c2350d36a525e75d06e upstream.
Marco reported an error with hfsc:
root@Calimero:~# tc qdisc add dev eth0 root handle 1:0 hfsc default 1
Error: Attribute failed policy validation.
Apparently a few implementations pass TCA_OPTIONS as a binary instead
of nested attribute, so drop TCA_OPTIONS from the policy.
Fixes: 8b4c3cdd9dd8 ("net: sched: Add policy validation for tc attributes")
Reported-by: Marco Berizzi <pupilla@libero.it>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit e331473fee3d500bb0d2582a1fe598df3326d8cd ]
Similarly to what has been done in 8b4c3cdd9dd8 ("net: sched: Add policy
validation for tc attributes"), fix classifier code to add validation of
TCA_CHAIN and TCA_KIND netlink attributes.
tested with:
# ./tdc.py -c filter
v2: Let sch_api and cls_api share nla_policy they have in common, thanks
to David Ahern.
v3: Avoid EXPORT_SYMBOL(), as validation of those attributes is not done
by TC modules, thanks to Cong Wang.
While at it, restore the 'Delete / get qdisc' comment to its orginal
position, just above tc_get_qdisc() function prototype.
Fixes: 5bc1701881e39 ("net: sched: introduce multichain support for filters")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 3c53ed8fef6881a864f0ee8240ed2793ef73ad0d ]
When dumping classes by parent, kernel would return classes twice:
| # tc qdisc add dev lo root prio
| # tc class show dev lo
| class prio 8001:1 parent 8001:
| class prio 8001:2 parent 8001:
| class prio 8001:3 parent 8001:
| # tc class show dev lo parent 8001:
| class prio 8001:1 parent 8001:
| class prio 8001:2 parent 8001:
| class prio 8001:3 parent 8001:
| class prio 8001:1 parent 8001:
| class prio 8001:2 parent 8001:
| class prio 8001:3 parent 8001:
This comes from qdisc_match_from_root() potentially returning the root
qdisc itself if its handle matched. Though in that case, root's classes
were already dumped a few lines above.
Fixes: cb395b2010879 ("net: sched: optimize class dumps")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 38b4f18d56372e1e21771ab7b0357b853330186c ]
gred_change_table_def() takes a pointer to TCA_GRED_DPS attribute,
and expects it will be able to interpret its contents as
struct tc_gred_sopt. Pass the correct gred attribute, instead of
TCA_OPTIONS.
This bug meant the table definition could never be changed after
Qdisc was initialized (unless whatever TCA_OPTIONS contained both
passed netlink validation and was a valid struct tc_gred_sopt...).
Old behaviour:
$ ip link add type dummy
$ tc qdisc replace dev dummy0 parent root handle 7: \
gred setup vqs 4 default 0
$ tc qdisc replace dev dummy0 parent root handle 7: \
gred setup vqs 4 default 0
RTNETLINK answers: Invalid argument
Now:
$ ip link add type dummy
$ tc qdisc replace dev dummy0 parent root handle 7: \
gred setup vqs 4 default 0
$ tc qdisc replace dev dummy0 parent root handle 7: \
gred setup vqs 4 default 0
$ tc qdisc replace dev dummy0 parent root handle 7: \
gred setup vqs 4 default 0
Fixes: f62d6b936df5 ("[PKT_SCHED]: GRED: Use central VQ change procedure")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Upstream commit bffa72cf7f9d ("net: sk_buff rbnode reorg") got
backported as commit 6b921536f170 ("net: sk_buff rbnode reorg") into the
v4.14.x-tree.
However, the backport does not include the changes in sch_netem.c
We need these, as otherwise the skb->dev pointer is not set when
dequeueing from the netem rbtree, resulting in a panic:
[ 15.427748] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0
[ 15.428863] IP: netif_skb_features+0x24/0x230
[ 15.429402] PGD 0 P4D 0
[ 15.429733] Oops: 0000 [#1] SMP PTI
[ 15.430169] Modules linked in:
[ 15.430614] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.14.77.mptcp #77
[ 15.431497] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
[ 15.432568] task: ffff88042db19680 task.stack: ffffc90000070000
[ 15.433356] RIP: 0010:netif_skb_features+0x24/0x230
[ 15.433977] RSP: 0018:ffff88043fd83e70 EFLAGS: 00010286
[ 15.434665] RAX: ffff880429ad80c0 RBX: ffff88042bd0e400 RCX: ffff880429ad8000
[ 15.435585] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88042bd0e400
[ 15.436551] RBP: ffff88042bd0e400 R08: ffff88042a4b6c9c R09: 0000000000000001
[ 15.437485] R10: 0000000000000004 R11: 0000000000000000 R12: ffff88042c700000
[ 15.438393] R13: ffff88042c700000 R14: ffff88042a4b6c00 R15: ffff88042c6bb000
[ 15.439315] FS: 0000000000000000(0000) GS:ffff88043fd80000(0000) knlGS:0000000000000000
[ 15.440314] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.441084] CR2: 00000000000000d0 CR3: 000000042c374000 CR4: 00000000000006e0
[ 15.442016] Call Trace:
[ 15.442333] <IRQ>
[ 15.442596] validate_xmit_skb+0x17/0x270
[ 15.443134] validate_xmit_skb_list+0x38/0x60
[ 15.443698] sch_direct_xmit+0x102/0x190
[ 15.444198] __qdisc_run+0xe3/0x240
[ 15.444671] net_tx_action+0x121/0x140
[ 15.445177] __do_softirq+0xe2/0x224
[ 15.445654] irq_exit+0xbf/0xd0
[ 15.446072] smp_apic_timer_interrupt+0x5d/0x90
[ 15.446654] apic_timer_interrupt+0x7d/0x90
[ 15.447185] </IRQ>
[ 15.447460] RIP: 0010:native_safe_halt+0x2/0x10
[ 15.447992] RSP: 0018:ffffc90000073f10 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff10
[ 15.449008] RAX: ffffffff816667d0 RBX: ffffffff820946b0 RCX: 0000000000000000
[ 15.449895] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 15.450768] RBP: ffffffff82026940 R08: 00000004e858e5e1 R09: ffff88042a4b6d58
[ 15.451643] R10: 0000000000000000 R11: 000000d0d56879bb R12: 0000000000000000
[ 15.452478] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 15.453340] ? __sched_text_end+0x2/0x2
[ 15.453835] default_idle+0xf/0x20
[ 15.454259] do_idle+0x170/0x200
[ 15.454653] cpu_startup_entry+0x14/0x20
[ 15.455142] secondary_startup_64+0xa5/0xb0
[ 15.455715] Code: 1f 84 00 00 00 00 00 55 53 48 89 fd 48 83 ec 08 8b 87 bc 00 00 00 48 8b 8f c0 00 00 00 0f b6 97 81 00 00 00 48 8b 77 10 48 01 c8 <48> 8b 9
[ 15.458138] RIP: netif_skb_features+0x24/0x230 RSP: ffff88043fd83e70
[ 15.458933] CR2: 00000000000000d0
[ 15.459352] ---[ end trace 083925903ae60570 ]---
Fixes: 6b921536f170 ("net: sk_buff rbnode reorg")
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Wei Wang <weiwan@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8b4c3cdd9dd8290343ce959a132d3b334062c5b9 ]
A number of TC attributes are processed without proper validation
(e.g., length checks). Add a tca policy for all input attributes and use
when invoking nlmsg_parse.
The 2 Fixes tags below cover the latest additions. The other attributes
are a string (KIND), nested attribute (OPTIONS which does seem to have
validation in most cases), for dumps only or a flag.
Fixes: 5bc1701881e39 ("net: sched: introduce multichain support for filters")
Fixes: d47a6b0e7c492 ("net: sched: introduce ingress/egress block index attributes for qdisc")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 34043d250f51368f214aed7f54c2dc29c819a8c7 ]
Matteo reported the following splat, testing the datapath of TC 'sample':
BUG: KASAN: null-ptr-deref in tcf_sample_act+0xc4/0x310
Read of size 8 at addr 0000000000000000 by task nc/433
CPU: 0 PID: 433 Comm: nc Not tainted 4.19.0-rc3-kvm #17
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
Call Trace:
kasan_report.cold.6+0x6c/0x2fa
tcf_sample_act+0xc4/0x310
? dev_hard_start_xmit+0x117/0x180
tcf_action_exec+0xa3/0x160
tcf_classify+0xdd/0x1d0
htb_enqueue+0x18e/0x6b0
? deref_stack_reg+0x7a/0xb0
? htb_delete+0x4b0/0x4b0
? unwind_next_frame+0x819/0x8f0
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
__dev_queue_xmit+0x722/0xca0
? unwind_get_return_address_ptr+0x50/0x50
? netdev_pick_tx+0xe0/0xe0
? save_stack+0x8c/0xb0
? kasan_kmalloc+0xbe/0xd0
? __kmalloc_track_caller+0xe4/0x1c0
? __kmalloc_reserve.isra.45+0x24/0x70
? __alloc_skb+0xdd/0x2e0
? sk_stream_alloc_skb+0x91/0x3b0
? tcp_sendmsg_locked+0x71b/0x15a0
? tcp_sendmsg+0x22/0x40
? __sys_sendto+0x1b0/0x250
? __x64_sys_sendto+0x6f/0x80
? do_syscall_64+0x5d/0x150
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
? __sys_sendto+0x1b0/0x250
? __x64_sys_sendto+0x6f/0x80
? do_syscall_64+0x5d/0x150
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
ip_finish_output2+0x495/0x590
? ip_copy_metadata+0x2e0/0x2e0
? skb_gso_validate_network_len+0x6f/0x110
? ip_finish_output+0x174/0x280
__tcp_transmit_skb+0xb17/0x12b0
? __tcp_select_window+0x380/0x380
tcp_write_xmit+0x913/0x1de0
? __sk_mem_schedule+0x50/0x80
tcp_sendmsg_locked+0x49d/0x15a0
? tcp_rcv_established+0x8da/0xa30
? tcp_set_state+0x220/0x220
? clear_user+0x1f/0x50
? iov_iter_zero+0x1ae/0x590
? __fget_light+0xa0/0xe0
tcp_sendmsg+0x22/0x40
__sys_sendto+0x1b0/0x250
? __ia32_sys_getpeername+0x40/0x40
? _copy_to_user+0x58/0x70
? poll_select_copy_remaining+0x176/0x200
? __pollwait+0x1c0/0x1c0
? ktime_get_ts64+0x11f/0x140
? kern_select+0x108/0x150
? core_sys_select+0x360/0x360
? vfs_read+0x127/0x150
? kernel_write+0x90/0x90
__x64_sys_sendto+0x6f/0x80
do_syscall_64+0x5d/0x150
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fefef2b129d
Code: ff ff ff ff eb b6 0f 1f 80 00 00 00 00 48 8d 05 51 37 0c 00 41 89 ca 8b 00 85 c0 75 20 45 31 c9 45 31 c0 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 6b f3 c3 66 0f 1f 84 00 00 00 00 00 41 56 41
RSP: 002b:00007fff2f5350c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 000056118d60c120 RCX: 00007fefef2b129d
RDX: 0000000000002000 RSI: 000056118d629320 RDI: 0000000000000003
RBP: 000056118d530370 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000002000
R13: 000056118d5c2a10 R14: 000056118d5c2a10 R15: 000056118d5303b8
tcf_sample_act() tried to update its per-cpu stats, but tcf_sample_init()
forgot to allocate them, because tcf_idr_create() was called with a wrong
value of 'cpustats'. Setting it to true proved to fix the reported crash.
Reported-by: Matteo Croce <mcroce@redhat.com>
Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
Fixes: 5c5670fae430 ("net/sched: Introduce sample tc action")
Tested-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Geeralize private netem_rb_to_skb()
TCP rtx queue will soon be converted to rb-tree,
so we will need skb_rbtree_walk() helpers.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 18a4c0eab2623cc95be98a1e6af1ad18e7695977)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 84cb8eb26cb9ce3c79928094962a475a9d850a53 ]
Recent refactoring of add_metainfo() caused use_all_metadata() to add
metainfo to ife action metalist without taking reference to module. This
causes warning in module_put called from ife action cleanup function.
Implement add_metainfo_and_get_ops() function that returns with reference
to module taken if metainfo was added successfully, and call it from
use_all_metadata(), instead of calling __add_metainfo() directly.
Example warning:
[ 646.344393] WARNING: CPU: 1 PID: 2278 at kernel/module.c:1139 module_put+0x1cb/0x230
[ 646.352437] Modules linked in: act_meta_skbtcindex act_meta_mark act_meta_skbprio act_ife ife veth nfsv3 nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c tun ebtable_filter ebtables ip6table_filter ip6_tables bridge stp llc mlx5_ib ib_uverbs ib_core intel_rapl sb_edac x86_pkg_temp_thermal mlx5_core coretemp kvm_intel kvm nfsd igb irqbypass crct10dif_pclmul devlink crc32_pclmul mei_me joydev ses crc32c_intel enclosure auth_rpcgss i2c_algo_bit ioatdma ptp mei pps_core ghash_clmulni_intel iTCO_wdt iTCO_vendor_support pcspkr dca ipmi_ssif lpc_ich target_core_mod i2c_i801 ipmi_si ipmi_devintf pcc_cpufreq wmi ipmi_msghandler nfs_acl lockd acpi_pad acpi_power_meter grace sunrpc mpt3sas raid_class scsi_transport_sas
[ 646.425631] CPU: 1 PID: 2278 Comm: tc Not tainted 4.19.0-rc1+ #799
[ 646.432187] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 646.440595] RIP: 0010:module_put+0x1cb/0x230
[ 646.445238] Code: f3 66 94 02 e8 26 ff fa ff 85 c0 74 11 0f b6 1d 51 30 94 02 80 fb 01 77 60 83 e3 01 74 13 65 ff 0d 3a 83 db 73 e9 2b ff ff ff <0f> 0b e9 00 ff ff ff e8 59 01 fb ff 85 c0 75 e4 48 c7 c2 20 62 6b
[ 646.464997] RSP: 0018:ffff880354d37068 EFLAGS: 00010286
[ 646.470599] RAX: 0000000000000000 RBX: ffffffffc0a52518 RCX: ffffffff8c2668db
[ 646.478118] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffffffffc0a52518
[ 646.485641] RBP: ffffffffc0a52180 R08: fffffbfff814a4a4 R09: fffffbfff814a4a3
[ 646.493164] R10: ffffffffc0a5251b R11: fffffbfff814a4a4 R12: 1ffff1006a9a6e0d
[ 646.500687] R13: 00000000ffffffff R14: ffff880362bab890 R15: dead000000000100
[ 646.508213] FS: 00007f4164c99800(0000) GS:ffff88036fe40000(0000) knlGS:0000000000000000
[ 646.516961] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 646.523080] CR2: 00007f41638b8420 CR3: 0000000351df0004 CR4: 00000000001606e0
[ 646.530595] Call Trace:
[ 646.533408] ? find_symbol_in_section+0x260/0x260
[ 646.538509] tcf_ife_cleanup+0x11b/0x200 [act_ife]
[ 646.543695] tcf_action_cleanup+0x29/0xa0
[ 646.548078] __tcf_action_put+0x5a/0xb0
[ 646.552289] ? nla_put+0x65/0xe0
[ 646.555889] __tcf_idr_release+0x48/0x60
[ 646.560187] tcf_generic_walker+0x448/0x6b0
[ 646.564764] ? tcf_action_dump_1+0x450/0x450
[ 646.569411] ? __lock_is_held+0x84/0x110
[ 646.573720] ? tcf_ife_walker+0x10c/0x20f [act_ife]
[ 646.578982] tca_action_gd+0x972/0xc40
[ 646.583129] ? tca_get_fill.constprop.17+0x250/0x250
[ 646.588471] ? mark_lock+0xcf/0x980
[ 646.592324] ? check_chain_key+0x140/0x1f0
[ 646.596832] ? debug_show_all_locks+0x240/0x240
[ 646.601839] ? memset+0x1f/0x40
[ 646.605350] ? nla_parse+0xca/0x1a0
[ 646.609217] tc_ctl_action+0x215/0x230
[ 646.613339] ? tcf_action_add+0x220/0x220
[ 646.617748] rtnetlink_rcv_msg+0x56a/0x6d0
[ 646.622227] ? rtnl_fdb_del+0x3f0/0x3f0
[ 646.626466] netlink_rcv_skb+0x18d/0x200
[ 646.630752] ? rtnl_fdb_del+0x3f0/0x3f0
[ 646.634959] ? netlink_ack+0x500/0x500
[ 646.639106] netlink_unicast+0x2d0/0x370
[ 646.643409] ? netlink_attachskb+0x340/0x340
[ 646.648050] ? _copy_from_iter_full+0xe9/0x3e0
[ 646.652870] ? import_iovec+0x11e/0x1c0
[ 646.657083] netlink_sendmsg+0x3b9/0x6a0
[ 646.661388] ? netlink_unicast+0x370/0x370
[ 646.665877] ? netlink_unicast+0x370/0x370
[ 646.670351] sock_sendmsg+0x6b/0x80
[ 646.674212] ___sys_sendmsg+0x4a1/0x520
[ 646.678443] ? copy_msghdr_from_user+0x210/0x210
[ 646.683463] ? lock_downgrade+0x320/0x320
[ 646.687849] ? debug_show_all_locks+0x240/0x240
[ 646.692760] ? do_raw_spin_unlock+0xa2/0x130
[ 646.697418] ? _raw_spin_unlock+0x24/0x30
[ 646.701798] ? __handle_mm_fault+0x1819/0x1c10
[ 646.706619] ? __pmd_alloc+0x320/0x320
[ 646.710738] ? debug_show_all_locks+0x240/0x240
[ 646.715649] ? restore_nameidata+0x7b/0xa0
[ 646.720117] ? check_chain_key+0x140/0x1f0
[ 646.724590] ? check_chain_key+0x140/0x1f0
[ 646.729070] ? __fget_light+0xbc/0xd0
[ 646.733121] ? __sys_sendmsg+0xd7/0x150
[ 646.737329] __sys_sendmsg+0xd7/0x150
[ 646.741359] ? __ia32_sys_shutdown+0x30/0x30
[ 646.746003] ? up_read+0x53/0x90
[ 646.749601] ? __do_page_fault+0x484/0x780
[ 646.754105] ? do_syscall_64+0x1e/0x2c0
[ 646.758320] do_syscall_64+0x72/0x2c0
[ 646.762353] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 646.767776] RIP: 0033:0x7f4163872150
[ 646.771713] Code: 8b 15 3c 7d 2b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 83 3d b9 d5 2b 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be cd 00 00 48 89 04 24
[ 646.791474] RSP: 002b:00007ffdef7d6b58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 646.799721] RAX: ffffffffffffffda RBX: 0000000000000024 RCX: 00007f4163872150
[ 646.807240] RDX: 0000000000000000 RSI: 00007ffdef7d6bd0 RDI: 0000000000000003
[ 646.814760] RBP: 000000005b8b9482 R08: 0000000000000001 R09: 0000000000000000
[ 646.822286] R10: 00000000000005e7 R11: 0000000000000246 R12: 00007ffdef7dad20
[ 646.829807] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000679bc0
[ 646.837360] irq event stamp: 6083
[ 646.841043] hardirqs last enabled at (6081): [<ffffffff8c220a7d>] __call_rcu+0x17d/0x500
[ 646.849882] hardirqs last disabled at (6083): [<ffffffff8c004f06>] trace_hardirqs_off_thunk+0x1a/0x1c
[ 646.859775] softirqs last enabled at (5968): [<ffffffff8d4004a1>] __do_softirq+0x4a1/0x6ee
[ 646.868784] softirqs last disabled at (6082): [<ffffffffc0a78759>] tcf_ife_cleanup+0x39/0x200 [act_ife]
[ 646.878845] ---[ end trace b1b8c12ffe51e657 ]---
Fixes: 5ffe57da29b3 ("act_ife: fix a potential deadlock")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5ffe57da29b3802baeddaa40909682bbb4cb4d48 ]
use_all_metadata() acquires read_lock(&ife_mod_lock), then calls
add_metainfo() which calls find_ife_oplist() which acquires the same
lock again. Deadlock!
Introduce __add_metainfo() which accepts struct tcf_meta_ops *ops
as an additional parameter and let its callers to decide how
to find it. For use_all_metadata(), it already has ops, no
need to find it again, just call __add_metainfo() directly.
And, as ife_mod_lock is only needed for find_ife_oplist(),
this means we can make non-atomic allocation for populate_metalist()
now.
Fixes: 817e9f2c5c26 ("act_ife: acquire ife_mod_lock before reading ifeoplist")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 4e407ff5cd67ec76eeeea1deec227b7982dc7f66 ]
The only time we need to take tcfa_lock is when adding
a new metainfo to an existing ife->metalist. We don't need
to take tcfa_lock so early and so broadly in tcf_ife_init().
This means we can always take ife_mod_lock first, avoid the
reverse locking ordering warning as reported by Vlad.
Reported-by: Vlad Buslov <vladbu@mellanox.com>
Tested-by: Vlad Buslov <vladbu@mellanox.com>
Cc: Vlad Buslov <vladbu@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 85eb9af182243ce9a8b72410d5321c440ac5f8d7 ]
in the (rare) case of failure in nla_nest_start(), missing NULL checks in
tcf_pedit_key_ex_dump() can make the following command
# tc action add action pedit ex munge ip ttl set 64
dereference a NULL pointer:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 800000007d1cd067 P4D 800000007d1cd067 PUD 7acd3067 PMD 0
Oops: 0002 [#1] SMP PTI
CPU: 0 PID: 3336 Comm: tc Tainted: G E 4.18.0.pedit+ #425
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:tcf_pedit_dump+0x19d/0x358 [act_pedit]
Code: be 02 00 00 00 48 89 df 66 89 44 24 20 e8 9b b1 fd e0 85 c0 75 46 8b 83 c8 00 00 00 49 83 c5 08 48 03 83 d0 00 00 00 4d 39 f5 <66> 89 04 25 00 00 00 00 0f 84 81 01 00 00 41 8b 45 00 48 8d 4c 24
RSP: 0018:ffffb5d4004478a8 EFLAGS: 00010246
RAX: ffff8880fcda2070 RBX: ffff8880fadd2900 RCX: 0000000000000000
RDX: 0000000000000002 RSI: ffffb5d4004478ca RDI: ffff8880fcda206e
RBP: ffff8880fb9cb900 R08: 0000000000000008 R09: ffff8880fcda206e
R10: ffff8880fadd2900 R11: 0000000000000000 R12: ffff8880fd26cf40
R13: ffff8880fc957430 R14: ffff8880fc957430 R15: ffff8880fb9cb988
FS: 00007f75a537a740(0000) GS:ffff8880fda00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000007a2fa005 CR4: 00000000001606f0
Call Trace:
? __nla_reserve+0x38/0x50
tcf_action_dump_1+0xd2/0x130
tcf_action_dump+0x6a/0xf0
tca_get_fill.constprop.31+0xa3/0x120
tcf_action_add+0xd1/0x170
tc_ctl_action+0x137/0x150
rtnetlink_rcv_msg+0x263/0x2d0
? _cond_resched+0x15/0x40
? rtnl_calcit.isra.30+0x110/0x110
netlink_rcv_skb+0x4d/0x130
netlink_unicast+0x1a3/0x250
netlink_sendmsg+0x2ae/0x3a0
sock_sendmsg+0x36/0x40
___sys_sendmsg+0x26f/0x2d0
? do_wp_page+0x8e/0x5f0
? handle_pte_fault+0x6c3/0xf50
? __handle_mm_fault+0x38e/0x520
? __sys_sendmsg+0x5e/0xa0
__sys_sendmsg+0x5e/0xa0
do_syscall_64+0x5b/0x180
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f75a4583ba0
Code: c3 48 8b 05 f2 62 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d fd c3 2c 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae cc 00 00 48 89 04 24
RSP: 002b:00007fff60ee7418 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fff60ee7540 RCX: 00007f75a4583ba0
RDX: 0000000000000000 RSI: 00007fff60ee7490 RDI: 0000000000000003
RBP: 000000005b842d3e R08: 0000000000000002 R09: 0000000000000000
R10: 00007fff60ee6ea0 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff60ee7554 R14: 0000000000000001 R15: 000000000066c100
Modules linked in: act_pedit(E) ip6table_filter ip6_tables iptable_filter binfmt_misc crct10dif_pclmul ext4 crc32_pclmul mbcache ghash_clmulni_intel jbd2 pcbc snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer cryptd glue_helper snd joydev pcspkr soundcore virtio_balloon i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net net_failover virtio_blk virtio_console failover qxl crc32c_intel drm_kms_helper syscopyarea serio_raw sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix virtio_pci libata virtio_ring i2c_core virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_pedit]
CR2: 0000000000000000
Like it's done for other TC actions, give up dumping pedit rules and return
an error if nla_nest_start() returns NULL.
Fixes: 71d0ed7079df ("net/act_pedit: Support using offset relative to the conventional network headers")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 98c8f125fd8a6240ea343c1aa50a1be9047791b8 ]
Via u32_change(), TCA_U32_SEL has an unspecified type in the netlink
policy, so max length isn't enforced, only minimum. This means nkeys
(from userspace) was being trusted without checking the actual size of
nla_len(), which could lead to a memory over-read, and ultimately an
exposure via a call to u32_dump(). Reachability is CAP_NET_ADMIN within
a namespace.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 6d784f1625ea68783cc1fb17de8f6cd3e1660c3f ]
Immediately after module_put(), user could delete this
module, so e->ops could be already freed before we call
e->ops->release().
Fix this by moving module_put() after ops->release().
Fixes: ef6980b6becb ("introduce IFE action")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 38230a3e0e0933bbcf5df6fa469ba0667f667568 ]
the control action in the common member of struct tcf_tunnel_key must be a
valid value, as it can contain the chain index when 'goto chain' is used.
Ensure that the control action can be read as x->tcfa_action, when x is a
pointer to struct tc_action and x->ops->type is TCA_ACT_TUNNEL_KEY, to
prevent the following command:
# tc filter add dev $h2 ingress protocol ip pref 1 handle 101 flower \
> $tcflags dst_mac $h2mac action tunnel_key unset goto chain 1
from causing a NULL dereference when a matching packet is received:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 80000001097ac067 P4D 80000001097ac067 PUD 103b0a067 PMD 0
Oops: 0000 [#1] SMP PTI
CPU: 0 PID: 3491 Comm: mausezahn Tainted: G E 4.18.0-rc2.auguri+ #421
Hardware name: Hewlett-Packard HP Z220 CMT Workstation/1790, BIOS K51 v01.58 02/07/2013
RIP: 0010:tcf_action_exec+0xb8/0x100
Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3
RSP: 0018:ffff95145ea03c40 EFLAGS: 00010246
RAX: 0000000020000001 RBX: ffff9514499e5800 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
RBP: ffff95145ea03e60 R08: 0000000000000000 R09: ffff95145ea03c9c
R10: ffff95145ea03c78 R11: 0000000000000008 R12: ffff951456a69800
R13: ffff951456a69808 R14: 0000000000000001 R15: ffff95144965ee40
FS: 00007fd67ee11740(0000) GS:ffff95145ea00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000001038a2006 CR4: 00000000001606f0
Call Trace:
<IRQ>
fl_classify+0x1ad/0x1c0 [cls_flower]
? __update_load_avg_se.isra.47+0x1ca/0x1d0
? __update_load_avg_se.isra.47+0x1ca/0x1d0
? update_load_avg+0x665/0x690
? update_load_avg+0x665/0x690
? kmem_cache_alloc+0x38/0x1c0
tcf_classify+0x89/0x140
__netif_receive_skb_core+0x5ea/0xb70
? enqueue_entity+0xd0/0x270
? process_backlog+0x97/0x150
process_backlog+0x97/0x150
net_rx_action+0x14b/0x3e0
__do_softirq+0xde/0x2b4
do_softirq_own_stack+0x2a/0x40
</IRQ>
do_softirq.part.18+0x49/0x50
__local_bh_enable_ip+0x49/0x50
__dev_queue_xmit+0x4ab/0x8a0
? wait_woken+0x80/0x80
? packet_sendmsg+0x38f/0x810
? __dev_queue_xmit+0x8a0/0x8a0
packet_sendmsg+0x38f/0x810
sock_sendmsg+0x36/0x40
__sys_sendto+0x10e/0x140
? do_vfs_ioctl+0xa4/0x630
? syscall_trace_enter+0x1df/0x2e0
? __audit_syscall_exit+0x22a/0x290
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x5b/0x180
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fd67e18dc93
Code: 48 8b 0d 18 83 20 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 59 c7 20 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 2b f7 ff ff 48 89 04 24
RSP: 002b:00007ffe0189b748 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00000000020ca010 RCX: 00007fd67e18dc93
RDX: 0000000000000062 RSI: 00000000020ca322 RDI: 0000000000000003
RBP: 00007ffe0189b780 R08: 00007ffe0189b760 R09: 0000000000000014
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000062
R13: 00000000020ca322 R14: 00007ffe0189b760 R15: 0000000000000003
Modules linked in: act_tunnel_key act_gact cls_flower sch_ingress vrf veth act_csum(E) xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter intel_rapl snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_realtek coretemp snd_hda_codec_generic kvm_intel kvm irqbypass snd_hda_intel crct10dif_pclmul crc32_pclmul hp_wmi ghash_clmulni_intel pcbc snd_hda_codec aesni_intel sparse_keymap rfkill snd_hda_core snd_hwdep snd_seq crypto_simd iTCO_wdt gpio_ich iTCO_vendor_support wmi_bmof cryptd mei_wdt glue_helper snd_seq_device snd_pcm pcspkr snd_timer snd i2c_i801 lpc_ich sg soundcore wmi mei_me
mei ie31200_edac nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod sr_mod cdrom i915 video i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci crc32c_intel libahci serio_raw sfc libata mtd drm ixgbe mdio i2c_core e1000e dca
CR2: 0000000000000000
---[ end trace 1ab8b5b5d4639dfc ]---
RIP: 0010:tcf_action_exec+0xb8/0x100
Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3
RSP: 0018:ffff95145ea03c40 EFLAGS: 00010246
RAX: 0000000020000001 RBX: ffff9514499e5800 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
RBP: ffff95145ea03e60 R08: 0000000000000000 R09: ffff95145ea03c9c
R10: ffff95145ea03c78 R11: 0000000000000008 R12: ffff951456a69800
R13: ffff951456a69808 R14: 0000000000000001 R15: ffff95144965ee40
FS: 00007fd67ee11740(0000) GS:ffff95145ea00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000001038a2006 CR4: 00000000001606f0
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0x11400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
Fixes: d0f6dd8a914f ("net/sched: Introduce act_tunnel_key")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit a51c76b4dfb30496dc65396a957ef0f06af7fb22 ]
Fix tcf_unbind_filter missing in cls_matchall as this will trigger
WARN_ON() in cbq_destroy_class().
Fixes: fd62d9f5c575f ("net/sched: matchall: Fix configuration race")
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 008369dcc5f7bfba526c98054f8525322acf0ea3 ]
Li Shuang reported the following warn:
[ 733.484610] WARNING: CPU: 6 PID: 21123 at net/sched/sch_cbq.c:1418 cbq_destroy_class+0x5d/0x70 [sch_cbq]
[ 733.495190] Modules linked in: sch_cbq cls_tcindex sch_dsmark rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat l
[ 733.574155] syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm igb ixgbe ahci libahci i2c_algo_bit libata i40e i2c_core dca mdio megaraid_sas dm_mirror dm_region_hash dm_log dm_mod
[ 733.592500] CPU: 6 PID: 21123 Comm: tc Not tainted 4.18.0-rc8.latest+ #131
[ 733.600169] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.1.5 04/11/2016
[ 733.608518] RIP: 0010:cbq_destroy_class+0x5d/0x70 [sch_cbq]
[ 733.614734] Code: e7 d9 d2 48 8b 7b 48 e8 61 05 da d2 48 8d bb f8 00 00 00 e8 75 ae d5 d2 48 39 eb 74 0a 48 89 df 5b 5d e9 16 6c 94 d2 5b 5d c3 <0f> 0b eb b6 0f 1f 44 00 00 66 2e 0f 1f 84
[ 733.635798] RSP: 0018:ffffbfbb066bb9d8 EFLAGS: 00010202
[ 733.641627] RAX: 0000000000000001 RBX: ffff9cdd17392800 RCX: 000000008010000f
[ 733.649588] RDX: ffff9cdd1df547e0 RSI: ffff9cdd17392800 RDI: ffff9cdd0f84c800
[ 733.657547] RBP: ffff9cdd0f84c800 R08: 0000000000000001 R09: 0000000000000000
[ 733.665508] R10: ffff9cdd0f84d000 R11: 0000000000000001 R12: 0000000000000001
[ 733.673469] R13: 0000000000000000 R14: 0000000000000001 R15: ffff9cdd17392200
[ 733.681430] FS: 00007f911890a740(0000) GS:ffff9cdd1f8c0000(0000) knlGS:0000000000000000
[ 733.690456] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 733.696864] CR2: 0000000000b5544c CR3: 0000000859374002 CR4: 00000000001606e0
[ 733.704826] Call Trace:
[ 733.707554] cbq_destroy+0xa1/0xd0 [sch_cbq]
[ 733.712318] qdisc_destroy+0x62/0x130
[ 733.716401] dsmark_destroy+0x2a/0x70 [sch_dsmark]
[ 733.721745] qdisc_destroy+0x62/0x130
[ 733.725829] qdisc_graft+0x3ba/0x470
[ 733.729817] tc_get_qdisc+0x2a6/0x2c0
[ 733.733901] ? cred_has_capability+0x7d/0x130
[ 733.738761] rtnetlink_rcv_msg+0x263/0x2d0
[ 733.743330] ? rtnl_calcit.isra.30+0x110/0x110
[ 733.748287] netlink_rcv_skb+0x4d/0x130
[ 733.752576] netlink_unicast+0x1a3/0x250
[ 733.756949] netlink_sendmsg+0x2ae/0x3a0
[ 733.761324] sock_sendmsg+0x36/0x40
[ 733.765213] ___sys_sendmsg+0x26f/0x2d0
[ 733.769493] ? handle_pte_fault+0x586/0xdf0
[ 733.774158] ? __handle_mm_fault+0x389/0x500
[ 733.778919] ? __sys_sendmsg+0x5e/0xa0
[ 733.783099] __sys_sendmsg+0x5e/0xa0
[ 733.787087] do_syscall_64+0x5b/0x180
[ 733.791171] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 733.796805] RIP: 0033:0x7f9117f23f10
[ 733.800791] Code: c3 48 8b 05 82 6f 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d 8d d0 2c 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8
[ 733.821873] RSP: 002b:00007ffe96818398 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 733.830319] RAX: ffffffffffffffda RBX: 000000005b71244c RCX: 00007f9117f23f10
[ 733.838280] RDX: 0000000000000000 RSI: 00007ffe968183e0 RDI: 0000000000000003
[ 733.846241] RBP: 00007ffe968183e0 R08: 000000000000ffff R09: 0000000000000003
[ 733.854202] R10: 00007ffe96817e20 R11: 0000000000000246 R12: 0000000000000000
[ 733.862161] R13: 0000000000662ee0 R14: 0000000000000000 R15: 0000000000000000
[ 733.870121] ---[ end trace 28edd4aad712ddca ]---
This is because we didn't update f->result.res when create new filter. Then in
tcindex_delete() -> tcf_unbind_filter(), we will failed to find out the res
and unbind filter, which will trigger the WARN_ON() in cbq_destroy_class().
Fix it by updating f->result.res when create new filter.
Fixes: 6e0565697a106 ("net_sched: fix another crash in cls_tcindex")
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 2df8bee5654bb2b7312662ca6810d4dc16b0b67f ]
Li Shuang reported the following crash:
[ 71.267724] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[ 71.276456] PGD 800000085d9bd067 P4D 800000085d9bd067 PUD 859a0b067 PMD 0
[ 71.284127] Oops: 0000 [#1] SMP PTI
[ 71.288015] CPU: 12 PID: 2386 Comm: tc Not tainted 4.18.0-rc8.latest+ #131
[ 71.295686] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.1.5 04/11/2016
[ 71.304037] RIP: 0010:tcindex_delete+0x72/0x280 [cls_tcindex]
[ 71.310446] Code: 00 31 f6 48 87 75 20 48 85 f6 74 11 48 8b 47 18 48 8b 40 08 48 8b 40 50 e8 fb a6 f8 fc 48 85 db 0f 84 dc 00 00 00 48 8b 73 18 <8b> 56 04 48 8d 7e 04 85 d2 0f 84 7b 01 00
[ 71.331517] RSP: 0018:ffffb45207b3f898 EFLAGS: 00010282
[ 71.337345] RAX: ffff8ad3d72d6360 RBX: ffff8acc84393680 RCX: 000000000000002e
[ 71.345306] RDX: ffff8ad3d72c8570 RSI: 0000000000000000 RDI: ffff8ad847a45800
[ 71.353277] RBP: ffff8acc84393688 R08: ffff8ad3d72c8400 R09: 0000000000000000
[ 71.361238] R10: ffff8ad3de786e00 R11: 0000000000000000 R12: ffffb45207b3f8c7
[ 71.369199] R13: ffff8ad3d93bd2a0 R14: 000000000000002e R15: ffff8ad3d72c9600
[ 71.377161] FS: 00007f9d3ec3e740(0000) GS:ffff8ad3df980000(0000) knlGS:0000000000000000
[ 71.386188] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 71.392597] CR2: 0000000000000004 CR3: 0000000852f06003 CR4: 00000000001606e0
[ 71.400558] Call Trace:
[ 71.403299] tcindex_destroy_element+0x25/0x40 [cls_tcindex]
[ 71.409611] tcindex_walk+0xbb/0x110 [cls_tcindex]
[ 71.414953] tcindex_destroy+0x44/0x90 [cls_tcindex]
[ 71.420492] ? tcindex_delete+0x280/0x280 [cls_tcindex]
[ 71.426323] tcf_proto_destroy+0x16/0x40
[ 71.430696] tcf_chain_flush+0x51/0x70
[ 71.434876] tcf_block_put_ext.part.30+0x8f/0x1b0
[ 71.440122] tcf_block_put+0x4d/0x70
[ 71.444108] cbq_destroy+0x4d/0xd0 [sch_cbq]
[ 71.448869] qdisc_destroy+0x62/0x130
[ 71.452951] dsmark_destroy+0x2a/0x70 [sch_dsmark]
[ 71.458300] qdisc_destroy+0x62/0x130
[ 71.462373] qdisc_graft+0x3ba/0x470
[ 71.466359] tc_get_qdisc+0x2a6/0x2c0
[ 71.470443] ? cred_has_capability+0x7d/0x130
[ 71.475307] rtnetlink_rcv_msg+0x263/0x2d0
[ 71.479875] ? rtnl_calcit.isra.30+0x110/0x110
[ 71.484832] netlink_rcv_skb+0x4d/0x130
[ 71.489109] netlink_unicast+0x1a3/0x250
[ 71.493482] netlink_sendmsg+0x2ae/0x3a0
[ 71.497859] sock_sendmsg+0x36/0x40
[ 71.501748] ___sys_sendmsg+0x26f/0x2d0
[ 71.506029] ? handle_pte_fault+0x586/0xdf0
[ 71.510694] ? __handle_mm_fault+0x389/0x500
[ 71.515457] ? __sys_sendmsg+0x5e/0xa0
[ 71.519636] __sys_sendmsg+0x5e/0xa0
[ 71.523626] do_syscall_64+0x5b/0x180
[ 71.527711] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 71.533345] RIP: 0033:0x7f9d3e257f10
[ 71.537331] Code: c3 48 8b 05 82 6f 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d 8d d0 2c 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8
[ 71.558401] RSP: 002b:00007fff6f893398 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 71.566848] RAX: ffffffffffffffda RBX: 000000005b71274d RCX: 00007f9d3e257f10
[ 71.574810] RDX: 0000000000000000 RSI: 00007fff6f8933e0 RDI: 0000000000000003
[ 71.582770] RBP: 00007fff6f8933e0 R08: 000000000000ffff R09: 0000000000000003
[ 71.590729] R10: 00007fff6f892e20 R11: 0000000000000246 R12: 0000000000000000
[ 71.598689] R13: 0000000000662ee0 R14: 0000000000000000 R15: 0000000000000000
[ 71.606651] Modules linked in: sch_cbq cls_tcindex sch_dsmark xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_coni
[ 71.685425] libahci i2c_algo_bit i2c_core i40e libata dca mdio megaraid_sas dm_mirror dm_region_hash dm_log dm_mod
[ 71.697075] CR2: 0000000000000004
[ 71.700792] ---[ end trace f604eb1acacd978b ]---
Reproducer:
tc qdisc add dev lo handle 1:0 root dsmark indices 64 set_tc_index
tc filter add dev lo parent 1:0 protocol ip prio 1 tcindex mask 0xfc shift 2
tc qdisc add dev lo parent 1:0 handle 2:0 cbq bandwidth 10Mbit cell 8 avpkt 1000 mpu 64
tc class add dev lo parent 2:0 classid 2:1 cbq bandwidth 10Mbit rate 1500Kbit avpkt 1000 prio 1 bounded isolated allot 1514 weight 1 maxburst 10
tc filter add dev lo parent 2:0 protocol ip prio 1 handle 0x2e tcindex classid 2:1 pass_on
tc qdisc add dev lo parent 2:1 pfifo limit 5
tc qdisc del dev lo root
This is because in tcindex_set_parms, when there is no old_r, we set new
exts to cr.exts. And we didn't set it to filter when r == &new_filter_result.
Then in tcindex_delete() -> tcf_exts_get_net(), we will get NULL pointer
dereference as we didn't init exts.
Fix it by moving tcf_exts_change() after "if (old_r && old_r != r)" check.
Then we don't need "cr" as there is no errout after that.
Fixes: bf63ac73b3e13 ("net_sched: fix an oops in tcindex filter")
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 7e85dc8cb35abf16455f1511f0670b57c1a84608 ]
When blackhole is used on top of classful qdisc like hfsc it breaks
qlen and backlog counters because packets are disappear without notice.
In HFSC non-zero qlen while all classes are inactive triggers warning:
WARNING: ... at net/sched/sch_hfsc.c:1393 hfsc_dequeue+0xba4/0xe90 [sch_hfsc]
and schedules watchdog work endlessly.
This patch return __NET_XMIT_BYPASS in addition to NET_XMIT_SUCCESS,
this flag tells upper layer: this packet is gone and isn't queued.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 8d499533e0bc02d44283dbdab03142b599b8ba16 ]
use nla_strlcpy() to avoid copying data beyond the length of TCA_DEF_DATA
netlink attribute, in case it is less than SIMP_MAX_DATA and it does not
end with '\0' character.
v2: fix errors in the commit message, thanks Hangbin Liu
Fixes: fa1b1cff3d06 ("net_cls_act: Make act_simple use of netlink policy.")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit af5d01842fe1fbfb9f5e1c1d957ba02ab6f4569a ]
When application fails to pass flags in netlink TLV for a new skbedit action,
the kernel results in the following oops:
[ 8.307732] BUG: unable to handle kernel paging request at 0000000000021130
[ 8.309167] PGD 80000000193d1067 P4D 80000000193d1067 PUD 180e0067 PMD 0
[ 8.310595] Oops: 0000 [#1] SMP PTI
[ 8.311334] Modules linked in: kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper serio_raw
[ 8.314190] CPU: 1 PID: 397 Comm: tc Not tainted 4.17.0-rc3+ #357
[ 8.315252] RIP: 0010:__tcf_idr_release+0x33/0x140
[ 8.316203] RSP: 0018:ffffa0718038f840 EFLAGS: 00010246
[ 8.317123] RAX: 0000000000000001 RBX: 0000000000021100 RCX: 0000000000000000
[ 8.319831] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000021100
[ 8.321181] RBP: 0000000000000000 R08: 000000000004adf8 R09: 0000000000000122
[ 8.322645] R10: 0000000000000000 R11: ffffffff9e5b01ed R12: 0000000000000000
[ 8.324157] R13: ffffffff9e0d3cc0 R14: 0000000000000000 R15: 0000000000000000
[ 8.325590] FS: 00007f591292e700(0000) GS:ffff8fcf5bc40000(0000) knlGS:0000000000000000
[ 8.327001] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8.327987] CR2: 0000000000021130 CR3: 00000000180e6004 CR4: 00000000001606a0
[ 8.329289] Call Trace:
[ 8.329735] tcf_skbedit_init+0xa7/0xb0
[ 8.330423] tcf_action_init_1+0x362/0x410
[ 8.331139] ? try_to_wake_up+0x44/0x430
[ 8.331817] tcf_action_init+0x103/0x190
[ 8.332511] tc_ctl_action+0x11a/0x220
[ 8.333174] rtnetlink_rcv_msg+0x23d/0x2e0
[ 8.333902] ? _cond_resched+0x16/0x40
[ 8.334569] ? __kmalloc_node_track_caller+0x5b/0x2c0
[ 8.335440] ? rtnl_calcit.isra.31+0xf0/0xf0
[ 8.336178] netlink_rcv_skb+0xdb/0x110
[ 8.336855] netlink_unicast+0x167/0x220
[ 8.337550] netlink_sendmsg+0x2a7/0x390
[ 8.338258] sock_sendmsg+0x30/0x40
[ 8.338865] ___sys_sendmsg+0x2c5/0x2e0
[ 8.339531] ? pagecache_get_page+0x27/0x210
[ 8.340271] ? filemap_fault+0xa2/0x630
[ 8.340943] ? page_add_file_rmap+0x108/0x200
[ 8.341732] ? alloc_set_pte+0x2aa/0x530
[ 8.342573] ? finish_fault+0x4e/0x70
[ 8.343332] ? __handle_mm_fault+0xbc1/0x10d0
[ 8.344337] ? __sys_sendmsg+0x53/0x80
[ 8.345040] __sys_sendmsg+0x53/0x80
[ 8.345678] do_syscall_64+0x4f/0x100
[ 8.346339] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 8.347206] RIP: 0033:0x7f591191da67
[ 8.347831] RSP: 002b:00007fff745abd48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 8.349179] RAX: ffffffffffffffda RBX: 00007fff745abe70 RCX: 00007f591191da67
[ 8.350431] RDX: 0000000000000000 RSI: 00007fff745abdc0 RDI: 0000000000000003
[ 8.351659] RBP: 000000005af35251 R08: 0000000000000001 R09: 0000000000000000
[ 8.352922] R10: 00000000000005f1 R11: 0000000000000246 R12: 0000000000000000
[ 8.354183] R13: 00007fff745afed0 R14: 0000000000000001 R15: 00000000006767c0
[ 8.355400] Code: 41 89 d4 53 89 f5 48 89 fb e8 aa 20 fd ff 85 c0 0f 84 ed 00
00 00 48 85 db 0f 84 cf 00 00 00 40 84 ed 0f 85 cd 00 00 00 45 84 e4 <8b> 53 30
74 0d 85 d2 b8 ff ff ff ff 0f 8f b3 00 00 00 8b 43 2c
[ 8.358699] RIP: __tcf_idr_release+0x33/0x140 RSP: ffffa0718038f840
[ 8.359770] CR2: 0000000000021130
[ 8.360438] ---[ end trace 60c66be45dfc14f0 ]---
The caller calls action's ->init() and passes pointer to "struct tc_action *a",
which later may be initialized to point at the existing action, otherwise
"struct tc_action *a" is still invalid, and therefore dereferencing it is an
error as happens in tcf_idr_release, where refcnt is decremented.
So in case of missing flags tcf_idr_release must be called only for
existing actions.
v2:
- prepare patch for net tree
Fixes: 5e1567aeb7fe ("net sched: skbedit action fix late binding")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 8258d2da9f9f521dce7019e018360c28d116354e ]
When we fail to modify a rule, we incorrectly release the idr handle
of the unmodified old rule.
Fix that by checking if we need to release it.
Fixes: fe2502e49b58 ("net_sched: remove cls_flower idr on failure")
Reported-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f29cdfbe33d6915ba8056179b0041279a67e3647 ]
tcf_skbmod_init() can fail after the idr has been successfully reserved.
When this happens, every subsequent attempt to configure skbmod rules
using the same idr value will systematically fail with -ENOSPC, unless
the first attempt was done using the 'replace' keyword:
# tc action add action skbmod swap mac index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action add action skbmod swap mac index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc action add action skbmod swap mac index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in tcf_skbmod_init(), ensuring that tcf_idr_release() is called
on the error path when the idr has been reserved, but not yet inserted.
Also, don't test 'ovr' in the error path, to avoid a 'replace' failure
implicitly become a 'delete' that leaks refcount in act_skbmod module:
# rmmod act_skbmod; modprobe act_skbmod
# tc action add action skbmod swap mac index 100
# tc action add action skbmod swap mac continue index 100
RTNETLINK answers: File exists
We have an error talking to the kernel
# tc action replace action skbmod swap mac continue index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action list action skbmod
#
# rmmod act_skbmod
rmmod: ERROR: Module act_skbmod is in use
Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 1e46ef1762bb2e52f0f996131a4d16ed4e9fd065 ]
__tcf_ipt_init() can fail after the idr has been successfully reserved.
When this happens, subsequent attempts to configure xt/ipt rules using
the same idr value systematically fail with -ENOSPC:
# tc action add action xt -j LOG --log-prefix test1 index 100
tablename: mangle hook: NF_IP_POST_ROUTING
target: LOG level warning prefix "test1" index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
Command "(null)" is unknown, try "tc actions help".
# tc action add action xt -j LOG --log-prefix test1 index 100
tablename: mangle hook: NF_IP_POST_ROUTING
target: LOG level warning prefix "test1" index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
Command "(null)" is unknown, try "tc actions help".
# tc action add action xt -j LOG --log-prefix test1 index 100
tablename: mangle hook: NF_IP_POST_ROUTING
target: LOG level warning prefix "test1" index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in the error path of __tcf_ipt_init(), calling tcf_idr_release()
in place of tcf_idr_cleanup(). Since tcf_ipt_release() can now be called
when tcfi_t is NULL, we also need to protect calls to ipt_destroy_target()
to avoid NULL pointer dereference.
Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 94fa3f929ec0c048b1f3658cc335b940df4f6d22 ]
tcf_pedit_init() can fail to allocate 'keys' after the idr has been
successfully reserved. When this happens, subsequent attempts to configure
a pedit rule using the same idr value systematically fail with -ENOSPC:
# tc action add action pedit munge ip ttl set 63 index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action add action pedit munge ip ttl set 63 index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc action add action pedit munge ip ttl set 63 index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in the error path of tcf_act_pedit_init(), calling
tcf_idr_release() in place of tcf_idr_cleanup().
Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5bf7f8185f7c7112decdfe3d3e5c5d5e67f099a1 ]
tcf_act_police_init() can fail after the idr has been successfully
reserved (e.g., qdisc_get_rtab() may return NULL). When this happens,
subsequent attempts to configure a police rule using the same idr value
systematiclly fail with -ENOSPC:
# tc action add action police rate 1000 burst 1000 drop index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc action add action police rate 1000 burst 1000 drop index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc action add action police rate 1000 burst 1000 drop index 100
RTNETLINK answers: No space left on device
...
Fix this in the error path of tcf_act_police_init(), calling
tcf_idr_release() in place of tcf_idr_cleanup().
Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 60e10b3adc3bac0f6a894c28e0eb1f2d13607362 ]
if the kernel fails to duplicate 'sdata', creation of a new action fails
with -ENOMEM. However, subsequent attempts to install the same action
using the same value of 'index' systematically fail with -ENOSPC, and
that value of 'index' will no more be usable by act_simple, until rmmod /
insmod of act_simple.ko is done:
# tc actions add action simple sdata hello index 100
# tc actions list action simple
action order 0: Simple <hello>
index 100 ref 1 bind 0
# tc actions flush action simple
# tc actions add action simple sdata hello index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
# tc actions flush action simple
# tc actions add action simple sdata hello index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# tc actions add action simple sdata hello index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
...
Fix this in the error path of tcf_simp_init(), calling tcf_idr_release()
in place of tcf_idr_cleanup().
Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit bbc09e7842a5023ba5bc0f8d559b9dd464e44006 ]
when the following command sequence is entered
# tc action add action bpf bytecode '4,40 0 0 12,31 0 1 2048,6 0 0 262144,6 0 0 0' index 100
RTNETLINK answers: Invalid argument
We have an error talking to the kernel
# tc action add action bpf bytecode '4,40 0 0 12,21 0 1 2048,6 0 0 262144,6 0 0 0' index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
act_bpf correctly refuses to install the first TC rule, because 31 is not
a valid instruction. However, it refuses to install the second TC rule,
even if the BPF code is correct. Furthermore, it's no more possible to
install any other rule having the same value of 'index' until act_bpf
module is unloaded/inserted again. After the idr has been reserved, call
tcf_idr_release() instead of tcf_idr_cleanup(), to fix this issue.
Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 1f110e7cae09e6c6a144616480d1a9dd99c5208a ]
when the following command
# tc action add action sample rate 100 group 100 index 100
is run for the first time, and psample_group_get(100) fails to create a
new group, tcf_sample_cleanup() calls psample_group_put(NULL), thus
causing the following error:
BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
IP: psample_group_put+0x15/0x71 [psample]
PGD 8000000075775067 P4D 8000000075775067 PUD 7453c067 PMD 0
Oops: 0002 [#1] SMP PTI
Modules linked in: act_sample(E) psample ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core mbcache jbd2 crct10dif_pclmul snd_hwdep crc32_pclmul snd_seq ghash_clmulni_intel pcbc snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer glue_helper snd cryptd joydev pcspkr i2c_piix4 soundcore virtio_balloon nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_net ata_piix virtio_console virtio_blk libata serio_raw crc32c_intel virtio_pci i2c_core virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_tunnel_key]
CPU: 2 PID: 5740 Comm: tc Tainted: G E 4.16.0-rc4.act_vlan.orig+ #403
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:psample_group_put+0x15/0x71 [psample]
RSP: 0018:ffffb8a80032f7d0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000024
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffffc06d93c0
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044
R10: 00000000bd003000 R11: ffff979fba04aa59 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: ffff979fbba3f22c
FS: 00007f7638112740(0000) GS:ffff979fbfd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000001c CR3: 00000000734ea001 CR4: 00000000001606e0
Call Trace:
__tcf_idr_release+0x79/0xf0
tcf_sample_init+0x125/0x1d0 [act_sample]
tcf_action_init_1+0x2cc/0x430
tcf_action_init+0xd3/0x1b0
tc_ctl_action+0x18b/0x240
rtnetlink_rcv_msg+0x29c/0x310
? _cond_resched+0x15/0x30
? __kmalloc_node_track_caller+0x1b9/0x270
? rtnl_calcit.isra.28+0x100/0x100
netlink_rcv_skb+0xd2/0x110
netlink_unicast+0x17c/0x230
netlink_sendmsg+0x2cd/0x3c0
sock_sendmsg+0x30/0x40
___sys_sendmsg+0x27a/0x290
? filemap_map_pages+0x34a/0x3a0
? __handle_mm_fault+0xbfd/0xe20
__sys_sendmsg+0x51/0x90
do_syscall_64+0x6e/0x1a0
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7f7637523ba0
RSP: 002b:00007fff0473ef58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fff0473f080 RCX: 00007f7637523ba0
RDX: 0000000000000000 RSI: 00007fff0473efd0 RDI: 0000000000000003
RBP: 000000005aaaac80 R08: 0000000000000002 R09: 0000000000000000
R10: 00007fff0473e9e0 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff0473f094 R14: 0000000000000001 R15: 0000000000669f60
Code: be 02 00 00 00 48 89 df e8 a9 fe ff ff e9 7c ff ff ff 0f 1f 40 00 0f 1f 44 00 00 53 48 89 fb 48 c7 c7 c0 93 6d c0 e8 db 20 8c ef <83> 6b 1c 01 74 10 48 c7 c7 c0 93 6d c0 ff 14 25 e8 83 83 b0 5b
RIP: psample_group_put+0x15/0x71 [psample] RSP: ffffb8a80032f7d0
CR2: 000000000000001c
Fix it in tcf_sample_cleanup(), ensuring that calls to psample_group_put(p)
are done only when p is not NULL.
Fixes: cadb9c9fdbc6 ("net/sched: act_sample: Fix error path in init")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 44a63b137f7b6e4c7bd6c9cc21615941cb36509d ]
Hangbin reported an Oops triggered by the syzkaller qdisc rules:
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN PTI
Modules linked in: sch_red
CPU: 0 PID: 28699 Comm: syz-executor5 Not tainted 4.17.0-rc4.kcov #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:qdisc_hash_add+0x26/0xa0
RSP: 0018:ffff8800589cf470 EFLAGS: 00010203
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff824ad971
RDX: 0000000000000007 RSI: ffffc9000ce9f000 RDI: 000000000000003c
RBP: 0000000000000001 R08: ffffed000b139ea2 R09: ffff8800589cf4f0
R10: ffff8800589cf50f R11: ffffed000b139ea2 R12: ffff880054019fc0
R13: ffff880054019fb4 R14: ffff88005c0af600 R15: ffff880054019fb0
FS: 00007fa6edcb1700(0000) GS:ffff88005ce00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000740 CR3: 000000000fc16000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
red_change+0x2d2/0xed0 [sch_red]
qdisc_create+0x57e/0xef0
tc_modify_qdisc+0x47f/0x14e0
rtnetlink_rcv_msg+0x6a8/0x920
netlink_rcv_skb+0x2a2/0x3c0
netlink_unicast+0x511/0x740
netlink_sendmsg+0x825/0xc30
sock_sendmsg+0xc5/0x100
___sys_sendmsg+0x778/0x8e0
__sys_sendmsg+0xf5/0x1b0
do_syscall_64+0xbd/0x3b0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x450869
RSP: 002b:00007fa6edcb0c48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fa6edcb16b4 RCX: 0000000000450869
RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000013
RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000008778 R14: 0000000000702838 R15: 00007fa6edcb1700
Code: e9 0b fe ff ff 0f 1f 44 00 00 55 53 48 89 fb 89 f5 e8 3f 07 f3 fe 48 8d 7b 3c 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 04 84 d2 75 51
RIP: qdisc_hash_add+0x26/0xa0 RSP: ffff8800589cf470
When a red qdisc is updated with a 0 limit, the child qdisc is left
unmodified, no additional scheduler is created in red_change(),
the 'child' local variable is rightfully NULL and must not add it
to the hash table.
This change addresses the above issue moving qdisc_hash_add() right
after the child qdisc creation. It additionally removes unneeded checks
for noop_qdisc.
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: 49b499718fa1 ("net: sched: make default fifo qdiscs appear in the dump")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5a4931ae0193f8a4a97e8260fd0df1d705d83299 ]
Similarly to what was done with commit a52956dfc503 ("net sched actions:
fix refcnt leak in skbmod"), fix the error path of tcf_vlan_init() to avoid
refcnt leaks when wrong value of TCA_VLAN_PUSH_VLAN_PROTOCOL is given.
Fixes: 5026c9b1bafc ("net sched: vlan action fix late binding")
CC: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit d68d75fdc34b0253c2bded7ed18cd60eb5a9599b ]
In case modules are not configured, error out when tp->ops is null
and prevent later null pointer dereference.
Fixes: 33a48927c193 ("sched: push TC filter protocol creation into a separate function")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 7df40c2673a1307c3260aab6f9d4b9bf97ca8fd7 ]
Normally, a socket can not be freed/reused unless all its TX packets
left qdisc and were TX-completed. However connect(AF_UNSPEC) allows
this to happen.
With commit fc59d5bdf1e3 ("pkt_sched: fq: clear time_next_packet for
reused flows") we cleared f->time_next_packet but took no special
action if the flow was still in the throttled rb-tree.
Since f->time_next_packet is the key used in the rb-tree searches,
blindly clearing it might break rb-tree integrity. We need to make
sure the flow is no longer in the rb-tree to avoid this problem.
Fixes: fc59d5bdf1e3 ("pkt_sched: fq: clear time_next_packet for reused flows")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit a52956dfc503f8cc5cfe6454959b7049fddb4413 ]
When application fails to pass flags in netlink TLV when replacing
existing skbmod action, the kernel will leak refcnt:
$ tc actions get action skbmod index 1
total acts 0
action order 0: skbmod pipe set smac 00:11:22:33:44:55
index 1 ref 1 bind 0
For example, at this point a buggy application replaces the action with
index 1 with new smac 00:aa:22:33:44:55, it fails because of zero flags,
however refcnt gets bumped:
$ tc actions get actions skbmod index 1
total acts 0
action order 0: skbmod pipe set smac 00:11:22:33:44:55
index 1 ref 2 bind 0
$
Tha patch fixes this by calling tcf_idr_release() on existing actions.
Fixes: 86da71b57383d ("net_sched: Introduce skbmod action")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit cc74eddd0ff325d57373cea99f642b787d7f76f5 ]
There is currently no handling to check on a invalid tlv length. This
patch adds such handling to avoid killing the kernel with a malformed
ife packet.
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Reviewed-by: Yotam Gigi <yotam.gi@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f6cd14537ff9919081be19b9c53b9b19c0d3ea97 ]
We need to record stats for received metadata that we dont know how
to process. Have find_decode_metaid() return -ENOENT to capture this.
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Reviewed-by: Yotam Gigi <yotam.gi@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 2d433610176d6569e8b3a28f67bc72235bf69efc ]
when the following command
# tc action replace action skbmod swap mac index 100
is run for the first time, and tcf_skbmod_init() fails to allocate struct
tcf_skbmod_params, tcf_skbmod_cleanup() calls kfree_rcu(NULL), thus
causing the following error:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: __call_rcu+0x23/0x2b0
PGD 8000000034057067 P4D 8000000034057067 PUD 74937067 PMD 0
Oops: 0002 [#1] SMP PTI
Modules linked in: act_skbmod(E) psample ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic snd_hda_intel snd_hda_codec crct10dif_pclmul mbcache jbd2 crc32_pclmul snd_hda_core ghash_clmulni_intel snd_hwdep pcbc snd_seq snd_seq_device snd_pcm aesni_intel snd_timer crypto_simd glue_helper snd cryptd virtio_balloon joydev soundcore pcspkr i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_console virtio_net virtio_blk ata_piix libata crc32c_intel virtio_pci serio_raw virtio_ring virtio i2c_core floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_skbmod]
CPU: 3 PID: 3144 Comm: tc Tainted: G E 4.16.0-rc4.act_vlan.orig+ #403
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:__call_rcu+0x23/0x2b0
RSP: 0018:ffffbd2e403e7798 EFLAGS: 00010246
RAX: ffffffffc0872080 RBX: ffff981d34bff780 RCX: 00000000ffffffff
RDX: ffffffff922a5f00 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000000021f
R10: 000000003d003000 R11: 0000000000aaaaaa R12: 0000000000000000
R13: ffffffff922a5f00 R14: 0000000000000001 R15: ffff981d3b698c2c
FS: 00007f3678292740(0000) GS:ffff981d3fd80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 000000007c57a006 CR4: 00000000001606e0
Call Trace:
__tcf_idr_release+0x79/0xf0
tcf_skbmod_init+0x1d1/0x210 [act_skbmod]
tcf_action_init_1+0x2cc/0x430
tcf_action_init+0xd3/0x1b0
tc_ctl_action+0x18b/0x240
rtnetlink_rcv_msg+0x29c/0x310
? _cond_resched+0x15/0x30
? __kmalloc_node_track_caller+0x1b9/0x270
? rtnl_calcit.isra.28+0x100/0x100
netlink_rcv_skb+0xd2/0x110
netlink_unicast+0x17c/0x230
netlink_sendmsg+0x2cd/0x3c0
sock_sendmsg+0x30/0x40
___sys_sendmsg+0x27a/0x290
? filemap_map_pages+0x34a/0x3a0
? __handle_mm_fault+0xbfd/0xe20
__sys_sendmsg+0x51/0x90
do_syscall_64+0x6e/0x1a0
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7f36776a3ba0
RSP: 002b:00007fff4703b618 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fff4703b740 RCX: 00007f36776a3ba0
RDX: 0000000000000000 RSI: 00007fff4703b690 RDI: 0000000000000003
RBP: 000000005aaaba36 R08: 0000000000000002 R09: 0000000000000000
R10: 00007fff4703b0a0 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff4703b754 R14: 0000000000000001 R15: 0000000000669f60
Code: 5d e9 42 da ff ff 66 90 0f 1f 44 00 00 41 57 41 56 41 55 49 89 d5 41 54 55 48 89 fd 53 48 83 ec 08 40 f6 c7 07 0f 85 19 02 00 00 <48> 89 75 08 48 c7 45 00 00 00 00 00 9c 58 0f 1f 44 00 00 49 89
RIP: __call_rcu+0x23/0x2b0 RSP: ffffbd2e403e7798
CR2: 0000000000000008
Fix it in tcf_skbmod_cleanup(), ensuring that kfree_rcu(p, ...) is called
only when p is not NULL.
Fixes: 86da71b57383 ("net_sched: Introduce skbmod action")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit abdadd3cfd3e7ea3da61ac774f84777d1f702058 ]
when the following command
# tc action add action tunnel_key unset index 100
is run for the first time, and tunnel_key_init() fails to allocate struct
tcf_tunnel_key_params, tunnel_key_release() dereferences NULL pointers.
This causes the following error:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: tunnel_key_release+0xd/0x40 [act_tunnel_key]
PGD 8000000033787067 P4D 8000000033787067 PUD 74646067 PMD 0
Oops: 0000 [#1] SMP PTI
Modules linked in: act_tunnel_key(E) act_csum ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 mbcache jbd2 crct10dif_pclmul crc32_pclmul snd_hda_codec_generic ghash_clmulni_intel snd_hda_intel pcbc snd_hda_codec snd_hda_core snd_hwdep snd_seq aesni_intel snd_seq_device crypto_simd glue_helper snd_pcm cryptd joydev snd_timer pcspkr virtio_balloon snd i2c_piix4 soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_net virtio_blk drm virtio_console crc32c_intel ata_piix serio_raw i2c_core virtio_pci libata virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod
CPU: 2 PID: 3101 Comm: tc Tainted: G E 4.16.0-rc4.act_vlan.orig+ #403
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:tunnel_key_release+0xd/0x40 [act_tunnel_key]
RSP: 0018:ffffba46803b7768 EFLAGS: 00010286
RAX: ffffffffc09010a0 RBX: 0000000000000000 RCX: 0000000000000024
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff99ee336d7480
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044
R10: 0000000000000220 R11: ffff99ee79d73131 R12: 0000000000000000
R13: ffff99ee32d67610 R14: ffff99ee7671dc38 R15: 00000000fffffff4
FS: 00007febcb2cd740(0000) GS:ffff99ee7fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 000000007c8e4005 CR4: 00000000001606e0
Call Trace:
__tcf_idr_release+0x79/0xf0
tunnel_key_init+0xd9/0x460 [act_tunnel_key]
tcf_action_init_1+0x2cc/0x430
tcf_action_init+0xd3/0x1b0
tc_ctl_action+0x18b/0x240
rtnetlink_rcv_msg+0x29c/0x310
? _cond_resched+0x15/0x30
? __kmalloc_node_track_caller+0x1b9/0x270
? rtnl_calcit.isra.28+0x100/0x100
netlink_rcv_skb+0xd2/0x110
netlink_unicast+0x17c/0x230
netlink_sendmsg+0x2cd/0x3c0
sock_sendmsg+0x30/0x40
___sys_sendmsg+0x27a/0x290
__sys_sendmsg+0x51/0x90
do_syscall_64+0x6e/0x1a0
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7febca6deba0
RSP: 002b:00007ffe7b0dd128 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007ffe7b0dd250 RCX: 00007febca6deba0
RDX: 0000000000000000 RSI: 00007ffe7b0dd1a0 RDI: 0000000000000003
RBP: 000000005aaa90cb R08: 0000000000000002 R09: 0000000000000000
R10: 00007ffe7b0dcba0 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffe7b0dd264 R14: 0000000000000001 R15: 0000000000669f60
Code: 44 00 00 8b 0d b5 23 00 00 48 8b 87 48 10 00 00 48 8b 3c c8 e9 a5 e5 d8 c3 0f 1f 44 00 00 0f 1f 44 00 00 53 48 8b 9f b0 00 00 00 <83> 7b 10 01 74 0b 48 89 df 31 f6 5b e9 f2 fa 7f c3 48 8b 7b 18
RIP: tunnel_key_release+0xd/0x40 [act_tunnel_key] RSP: ffffba46803b7768
CR2: 0000000000000010
Fix this in tunnel_key_release(), ensuring 'param' is not NULL before
dereferencing it.
Fixes: d0f6dd8a914f ("net/sched: Introduce act_tunnel_key")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 3239534a79ee6f20cffd974173a1e62e0730e8ac ]
when tcf_bpf_init_from_ops() fails (e.g. because of program having invalid
number of instructions), tcf_bpf_cfg_cleanup() calls bpf_prog_put(NULL) or
bpf_prog_destroy(NULL). Unless CONFIG_BPF_SYSCALL is unset, this causes
the following error:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
PGD 800000007345a067 P4D 800000007345a067 PUD 340e1067 PMD 0
Oops: 0000 [#1] SMP PTI
Modules linked in: act_bpf(E) ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 mbcache jbd2 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_generic pcbc snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd glue_helper cryptd joydev snd_timer snd virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_blk drm virtio_net virtio_console i2c_core crc32c_intel serio_raw virtio_pci ata_piix libata virtio_ring floppy virtio dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_bpf]
CPU: 3 PID: 5654 Comm: tc Tainted: G E 4.16.0.bpf_test+ #408
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:__bpf_prog_put+0xc/0xc0
RSP: 0018:ffff9594003ef728 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff9594003ef758 RCX: 0000000000000024
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044
R10: 0000000000000220 R11: ffff8a7ab9f17131 R12: 0000000000000000
R13: ffff8a7ab7c3c8e0 R14: 0000000000000001 R15: ffff8a7ab88f1054
FS: 00007fcb2f17c740(0000) GS:ffff8a7abfd80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000020 CR3: 000000007c888006 CR4: 00000000001606e0
Call Trace:
tcf_bpf_cfg_cleanup+0x2f/0x40 [act_bpf]
tcf_bpf_cleanup+0x4c/0x70 [act_bpf]
__tcf_idr_release+0x79/0x140
tcf_bpf_init+0x125/0x330 [act_bpf]
tcf_action_init_1+0x2cc/0x430
? get_page_from_freelist+0x3f0/0x11b0
tcf_action_init+0xd3/0x1b0
tc_ctl_action+0x18b/0x240
rtnetlink_rcv_msg+0x29c/0x310
? _cond_resched+0x15/0x30
? __kmalloc_node_track_caller+0x1b9/0x270
? rtnl_calcit.isra.29+0x100/0x100
netlink_rcv_skb+0xd2/0x110
netlink_unicast+0x17c/0x230
netlink_sendmsg+0x2cd/0x3c0
sock_sendmsg+0x30/0x40
___sys_sendmsg+0x27a/0x290
? mem_cgroup_commit_charge+0x80/0x130
? page_add_new_anon_rmap+0x73/0xc0
? do_anonymous_page+0x2a2/0x560
? __handle_mm_fault+0xc75/0xe20
__sys_sendmsg+0x58/0xa0
do_syscall_64+0x6e/0x1a0
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7fcb2e58eba0
RSP: 002b:00007ffc93c496c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007ffc93c497f0 RCX: 00007fcb2e58eba0
RDX: 0000000000000000 RSI: 00007ffc93c49740 RDI: 0000000000000003
RBP: 000000005ac6a646 R08: 0000000000000002 R09: 0000000000000000
R10: 00007ffc93c49120 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffc93c49804 R14: 0000000000000001 R15: 000000000066afa0
Code: 5f 00 48 8b 43 20 48 c7 c7 70 2f 7c b8 c7 40 10 00 00 00 00 5b e9 a5 8b 61 00 0f 1f 44 00 00 0f 1f 44 00 00 41 54 55 48 89 fd 53 <48> 8b 47 20 f0 ff 08 74 05 5b 5d 41 5c c3 41 89 f4 0f 1f 44 00
RIP: __bpf_prog_put+0xc/0xc0 RSP: ffff9594003ef728
CR2: 0000000000000020
Fix it in tcf_bpf_cfg_cleanup(), ensuring that bpf_prog_{put,destroy}(f)
is called only when f is not NULL.
Fixes: bbc09e7842a5 ("net/sched: fix idr leak on the error path of tcf_bpf_init()")
Reported-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 734549eb550c0c720bc89e50501f1b1e98cdd841 ]
Fixes a bug in the tcf_dump_walker function that can cause some actions
to not be reported when dumping a large number of actions. This issue
became more aggrevated when cookies feature was added. In particular
this issue is manifest when large cookie values are assigned to the
actions and when enough actions are created that the resulting table
must be dumped in multiple batches.
The number of actions returned in each batch is limited by the total
number of actions and the memory buffer size. With small cookies
the numeric limit is reached before the buffer size limit, which avoids
the code path triggering this bug. When large cookies are used buffer
fills before the numeric limit, and the erroneous code path is hit.
For example after creating 32 csum actions with the cookie
aaaabbbbccccdddd
$ tc actions ls action csum
total acts 26
action order 0: csum (tcp) action continue
index 1 ref 1 bind 0
cookie aaaabbbbccccdddd
.....
action order 25: csum (tcp) action continue
index 26 ref 1 bind 0
cookie aaaabbbbccccdddd
total acts 6
action order 0: csum (tcp) action continue
index 28 ref 1 bind 0
cookie aaaabbbbccccdddd
......
action order 5: csum (tcp) action continue
index 32 ref 1 bind 0
cookie aaaabbbbccccdddd
Note that the action with index 27 is omitted from the report.
Fixes: 4b3550ef530c ("[NET_SCHED]: Use nla_nest_start/nla_nest_end")"
Signed-off-by: Craig Dillabaugh <cdillaba@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 35d889d10b649fda66121891ec05eca88150059d ]
When we exceed current packets limit and we have more than one
segment in the list returned by skb_gso_segment(), netem drops
only the first one, skipping the rest, hence kmemleak reports:
unreferenced object 0xffff880b5d23b600 (size 1024):
comm "softirq", pid 0, jiffies 4384527763 (age 2770.629s)
hex dump (first 32 bytes):
00 80 23 5d 0b 88 ff ff 00 00 00 00 00 00 00 00 ..#]............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000d8a19b9d>] __alloc_skb+0xc9/0x520
[<000000001709b32f>] skb_segment+0x8c8/0x3710
[<00000000c7b9bb88>] tcp_gso_segment+0x331/0x1830
[<00000000c921cba1>] inet_gso_segment+0x476/0x1370
[<000000008b762dd4>] skb_mac_gso_segment+0x1f9/0x510
[<000000002182660a>] __skb_gso_segment+0x1dd/0x620
[<00000000412651b9>] netem_enqueue+0x1536/0x2590 [sch_netem]
[<0000000005d3b2a9>] __dev_queue_xmit+0x1167/0x2120
[<00000000fc5f7327>] ip_finish_output2+0x998/0xf00
[<00000000d309e9d3>] ip_output+0x1aa/0x2c0
[<000000007ecbd3a4>] tcp_transmit_skb+0x18db/0x3670
[<0000000042d2a45f>] tcp_write_xmit+0x4d4/0x58c0
[<0000000056a44199>] tcp_tasklet_func+0x3d9/0x540
[<0000000013d06d02>] tasklet_action+0x1ca/0x250
[<00000000fcde0b8b>] __do_softirq+0x1b4/0x5a3
[<00000000e7ed027c>] irq_exit+0x1e2/0x210
Fix it by adding the rest of the segments, if any, to skb 'to_free'
list. Add new __qdisc_drop_all() and qdisc_drop_all() functions
because they can be useful in the future if we need to drop segmented
GSO packets in other places.
Fixes: 6071bd1aa13e ("netem: Segment GSO packets on enqueue")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 51d4740f88affd85d49c04e3c9cd129c0e33bcb9 ]
If set/unset mode of the tunnel_key action is not provided, ->init() still
returns 0, and the caller proceeds with bogus 'struct tc_action *' object,
this results in crash:
% tc actions add action tunnel_key src_ip 1.1.1.1 dst_ip 2.2.2.1 id 7 index 1
[ 35.805515] general protection fault: 0000 [#1] SMP PTI
[ 35.806161] Modules linked in: act_tunnel_key kvm_intel kvm irqbypass
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64
crypto_simd glue_helper cryptd serio_raw
[ 35.808233] CPU: 1 PID: 428 Comm: tc Not tainted 4.16.0-rc4+ #286
[ 35.808929] RIP: 0010:tcf_action_init+0x90/0x190
[ 35.809457] RSP: 0018:ffffb8edc068b9a0 EFLAGS: 00010206
[ 35.810053] RAX: 1320c000000a0003 RBX: 0000000000000001 RCX: 0000000000000000
[ 35.810866] RDX: 0000000000000070 RSI: 0000000000007965 RDI: ffffb8edc068b910
[ 35.811660] RBP: ffffb8edc068b9d0 R08: 0000000000000000 R09: ffffb8edc068b808
[ 35.812463] R10: ffffffffc02bf040 R11: 0000000000000040 R12: ffffb8edc068bb38
[ 35.813235] R13: 0000000000000000 R14: 0000000000000000 R15: ffffb8edc068b910
[ 35.814006] FS: 00007f3d0d8556c0(0000) GS:ffff91d1dbc40000(0000)
knlGS:0000000000000000
[ 35.814881] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 35.815540] CR2: 000000000043f720 CR3: 0000000019248001 CR4: 00000000001606a0
[ 35.816457] Call Trace:
[ 35.817158] tc_ctl_action+0x11a/0x220
[ 35.817795] rtnetlink_rcv_msg+0x23d/0x2e0
[ 35.818457] ? __slab_alloc+0x1c/0x30
[ 35.819079] ? __kmalloc_node_track_caller+0xb1/0x2b0
[ 35.819544] ? rtnl_calcit.isra.30+0xe0/0xe0
[ 35.820231] netlink_rcv_skb+0xce/0x100
[ 35.820744] netlink_unicast+0x164/0x220
[ 35.821500] netlink_sendmsg+0x293/0x370
[ 35.822040] sock_sendmsg+0x30/0x40
[ 35.822508] ___sys_sendmsg+0x2c5/0x2e0
[ 35.823149] ? pagecache_get_page+0x27/0x220
[ 35.823714] ? filemap_fault+0xa2/0x640
[ 35.824423] ? page_add_file_rmap+0x108/0x200
[ 35.825065] ? alloc_set_pte+0x2aa/0x530
[ 35.825585] ? finish_fault+0x4e/0x70
[ 35.826140] ? __handle_mm_fault+0xbc1/0x10d0
[ 35.826723] ? __sys_sendmsg+0x41/0x70
[ 35.827230] __sys_sendmsg+0x41/0x70
[ 35.827710] do_syscall_64+0x68/0x120
[ 35.828195] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 35.828859] RIP: 0033:0x7f3d0ca4da67
[ 35.829331] RSP: 002b:00007ffc9f284338 EFLAGS: 00000246 ORIG_RAX:
000000000000002e
[ 35.830304] RAX: ffffffffffffffda RBX: 00007ffc9f284460 RCX: 00007f3d0ca4da67
[ 35.831247] RDX: 0000000000000000 RSI: 00007ffc9f2843b0 RDI: 0000000000000003
[ 35.832167] RBP: 000000005aa6a7a9 R08: 0000000000000001 R09: 0000000000000000
[ 35.833075] R10: 00000000000005f1 R11: 0000000000000246 R12: 0000000000000000
[ 35.833997] R13: 00007ffc9f2884c0 R14: 0000000000000001 R15: 0000000000674640
[ 35.834923] Code: 24 30 bb 01 00 00 00 45 31 f6 eb 5e 8b 50 08 83 c2 07 83 e2
fc 83 c2 70 49 8b 07 48 8b 40 70 48 85 c0 74 10 48 89 14 24 4c 89 ff <ff> d0 48
8b 14 24 48 01 c2 49 01 d6 45 85 ed 74 05 41 83 47 2c
[ 35.837442] RIP: tcf_action_init+0x90/0x190 RSP: ffffb8edc068b9a0
[ 35.838291] ---[ end trace a095c06ee4b97a26 ]---
Fixes: d0f6dd8a914f ("net/sched: Introduce act_tunnel_key")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 7bbde83b1860c28a1cc35516352c4e7e5172c29a ]
In qdisc_graft_qdisc a "new" qdisc is attached and the 'qdisc_destroy'
operation is called on the old qdisc. The destroy operation will wait
a rcu grace period and call qdisc_rcu_free(). At which point
gso_cpu_skb is free'd along with all stats so no need to zero stats
and gso_cpu_skb from the graft operation itself.
Further after dropping the qdisc locks we can not continue to call
qdisc_reset before waiting an rcu grace period so that the qdisc is
detached from all cpus. By removing the qdisc_reset() here we get
the correct property of waiting an rcu grace period and letting the
qdisc_destroy operation clean up the qdisc correctly.
Note, a refcnt greater than 1 would cause the destroy operation to
be aborted however if this ever happened the reference to the qdisc
would be lost and we would have a memory leak.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit d7cdee5ea8d28ae1b6922deb0c1badaa3aa0ef8c ]
Li Shuang reported an Oops with cls_u32 due to an use-after-free
in u32_destroy_key(). The use-after-free can be triggered with:
dev=lo
tc qdisc add dev $dev root handle 1: htb default 10
tc filter add dev $dev parent 1: prio 5 handle 1: protocol ip u32 divisor 256
tc filter add dev $dev protocol ip parent 1: prio 5 u32 ht 800:: match ip dst\
10.0.0.0/8 hashkey mask 0x0000ff00 at 16 link 1:
tc qdisc del dev $dev root
Which causes the following kasan splat:
==================================================================
BUG: KASAN: use-after-free in u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
Read of size 4 at addr ffff881b83dae618 by task kworker/u48:5/571
CPU: 17 PID: 571 Comm: kworker/u48:5 Not tainted 4.15.0+ #87
Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
Workqueue: tc_filter_workqueue u32_delete_key_freepf_work [cls_u32]
Call Trace:
dump_stack+0xd6/0x182
? dma_virt_map_sg+0x22e/0x22e
print_address_description+0x73/0x290
kasan_report+0x277/0x360
? u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
u32_delete_key_freepf_work+0x1c/0x30 [cls_u32]
process_one_work+0xae0/0x1c80
? sched_clock+0x5/0x10
? pwq_dec_nr_in_flight+0x3c0/0x3c0
? _raw_spin_unlock_irq+0x29/0x40
? trace_hardirqs_on_caller+0x381/0x570
? _raw_spin_unlock_irq+0x29/0x40
? finish_task_switch+0x1e5/0x760
? finish_task_switch+0x208/0x760
? preempt_notifier_dec+0x20/0x20
? __schedule+0x839/0x1ee0
? check_noncircular+0x20/0x20
? firmware_map_remove+0x73/0x73
? find_held_lock+0x39/0x1c0
? worker_thread+0x434/0x1820
? lock_contended+0xee0/0xee0
? lock_release+0x1100/0x1100
? init_rescuer.part.16+0x150/0x150
? retint_kernel+0x10/0x10
worker_thread+0x216/0x1820
? process_one_work+0x1c80/0x1c80
? lock_acquire+0x1a5/0x540
? lock_downgrade+0x6b0/0x6b0
? sched_clock+0x5/0x10
? lock_release+0x1100/0x1100
? compat_start_thread+0x80/0x80
? do_raw_spin_trylock+0x190/0x190
? _raw_spin_unlock_irq+0x29/0x40
? trace_hardirqs_on_caller+0x381/0x570
? _raw_spin_unlock_irq+0x29/0x40
? finish_task_switch+0x1e5/0x760
? finish_task_switch+0x208/0x760
? preempt_notifier_dec+0x20/0x20
? __schedule+0x839/0x1ee0
? kmem_cache_alloc_trace+0x143/0x320
? firmware_map_remove+0x73/0x73
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x170
? find_held_lock+0x39/0x1c0
? schedule+0xf3/0x3b0
? lock_downgrade+0x6b0/0x6b0
? __schedule+0x1ee0/0x1ee0
? do_wait_intr_irq+0x340/0x340
? do_raw_spin_trylock+0x190/0x190
? _raw_spin_unlock_irqrestore+0x32/0x60
? process_one_work+0x1c80/0x1c80
? process_one_work+0x1c80/0x1c80
kthread+0x312/0x3d0
? kthread_create_worker_on_cpu+0xc0/0xc0
ret_from_fork+0x3a/0x50
Allocated by task 1688:
kasan_kmalloc+0xa0/0xd0
__kmalloc+0x162/0x380
u32_change+0x1220/0x3c9e [cls_u32]
tc_ctl_tfilter+0x1ba6/0x2f80
rtnetlink_rcv_msg+0x4f0/0x9d0
netlink_rcv_skb+0x124/0x320
netlink_unicast+0x430/0x600
netlink_sendmsg+0x8fa/0xd60
sock_sendmsg+0xb1/0xe0
___sys_sendmsg+0x678/0x980
__sys_sendmsg+0xc4/0x210
do_syscall_64+0x232/0x7f0
return_from_SYSCALL_64+0x0/0x75
Freed by task 112:
kasan_slab_free+0x71/0xc0
kfree+0x114/0x320
rcu_process_callbacks+0xc3f/0x1600
__do_softirq+0x2bf/0xc06
The buggy address belongs to the object at ffff881b83dae600
which belongs to the cache kmalloc-4096 of size 4096
The buggy address is located 24 bytes inside of
4096-byte region [ffff881b83dae600, ffff881b83daf600)
The buggy address belongs to the page:
page:ffffea006e0f6a00 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0
flags: 0x17ffffc0008100(slab|head)
raw: 0017ffffc0008100 0000000000000000 0000000000000000 0000000100070007
raw: dead000000000100 dead000000000200 ffff880187c0e600 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff881b83dae500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff881b83dae580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff881b83dae600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff881b83dae680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff881b83dae700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
The problem is that the htnode is freed before the linked knodes and the
latter will try to access the first at u32_destroy_key() time.
This change addresses the issue using the htnode refcnt to guarantee
the correct free order. While at it also add a RCU annotation,
to keep sparse happy.
v1 -> v2: use rtnl_derefence() instead of RCU read locks
v2 -> v3:
- don't check refcnt in u32_destroy_hnode()
- cleaned-up u32_destroy() implementation
- cleaned-up code comment
v3 -> v4:
- dropped unneeded comment
Reported-by: Li Shuang <shuali@redhat.com>
Fixes: c0d378ef1266 ("net_sched: use tcf_queue_work() in u32 filter")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit eb53f7af6f15285e2f6ada97285395343ce9f433 ]
The following sequence is currently broken:
# tc qdisc add dev foo ingress
# tc filter replace dev foo protocol all ingress \
u32 match u8 0 0 action mirred egress mirror dev bar1
# tc filter replace dev foo protocol all ingress \
handle 800::800 pref 49152 \
u32 match u8 0 0 action mirred egress mirror dev bar2
Error: cls_u32: Key node flags do not match passed flags.
We have an error talking to the kernel, -1
The error comes from u32_change() when comparing new and
existing flags. The existing ones always contains one of
TCA_CLS_FLAGS_{,NOT}_IN_HW flag depending on offloading state.
These flags cannot be passed from userspace so the condition
(n->flags != flags) in u32_change() always fails.
Fix the condition so the flags TCA_CLS_FLAGS_NOT_IN_HW and
TCA_CLS_FLAGS_IN_HW are not taken into account.
Fixes: 24d3dc6d27ea ("net/sched: cls_u32: Reflect HW offload status")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5ae437ad5a2ed573b1ebb04e0afa70b8869f88dd ]
So far, if the filter was too large to fit in the allocated skb, the
kernel did not return any error and stopped dumping. Modify the dumper
so that it returns -EMSGSIZE when a filter fails to dump and it is the
first filter in the skb. If we are not first, we will get a next chance
with more room.
I understand this is pretty near to being an API change, but the
original design (silent truncation) can be considered a bug.
Note: The error case can happen pretty easily if you create a filter
with 32 actions and have 4kb pages. Also recent versions of iproute try
to be clever with their buffer allocation size, which in turn leads to
Signed-off-by: Roman Kapl <code@rkapl.cz>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit df45bf84e4f5a48f23d4b1a07d21d566e8b587b2 upstream.
Since the block is freed with last chain being put, once we reach the
end of iteration of list_for_each_entry_safe, the block may be
already freed. I'm hitting this only by creating and deleting clsact:
[ 202.171952] ==================================================================
[ 202.180182] BUG: KASAN: use-after-free in tcf_block_put_ext+0x240/0x390
[ 202.187590] Read of size 8 at addr ffff880225539a80 by task tc/796
[ 202.194508]
[ 202.196185] CPU: 0 PID: 796 Comm: tc Not tainted 4.15.0-rc2jiri+ #5
[ 202.203200] Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016
[ 202.213613] Call Trace:
[ 202.216369] dump_stack+0xda/0x169
[ 202.220192] ? dma_virt_map_sg+0x147/0x147
[ 202.224790] ? show_regs_print_info+0x54/0x54
[ 202.229691] ? tcf_chain_destroy+0x1dc/0x250
[ 202.234494] print_address_description+0x83/0x3d0
[ 202.239781] ? tcf_block_put_ext+0x240/0x390
[ 202.244575] kasan_report+0x1ba/0x460
[ 202.248707] ? tcf_block_put_ext+0x240/0x390
[ 202.253518] tcf_block_put_ext+0x240/0x390
[ 202.258117] ? tcf_chain_flush+0x290/0x290
[ 202.262708] ? qdisc_hash_del+0x82/0x1a0
[ 202.267111] ? qdisc_hash_add+0x50/0x50
[ 202.271411] ? __lock_is_held+0x5f/0x1a0
[ 202.275843] clsact_destroy+0x3d/0x80 [sch_ingress]
[ 202.281323] qdisc_destroy+0xcb/0x240
[ 202.285445] qdisc_graft+0x216/0x7b0
[ 202.289497] tc_get_qdisc+0x260/0x560
Fix this by holding the block also by chain 0 and put chain 0
explicitly, out of the list_for_each_entry_safe loop at the very
end of tcf_block_put_ext.
Fixes: efbf78973978 ("net_sched: get rid of rcu_barrier() in tcf_block_put_ext()")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit efbf78973978b0d25af59bc26c8013a942af6e64 upstream.
Both Eric and Paolo noticed the rcu_barrier() we use in
tcf_block_put_ext() could be a performance bottleneck when
we have a lot of tc classes.
Paolo provided the following to demonstrate the issue:
tc qdisc add dev lo root htb
for I in `seq 1 1000`; do
tc class add dev lo parent 1: classid 1:$I htb rate 100kbit
tc qdisc add dev lo parent 1:$I handle $((I + 1)): htb
for J in `seq 1 10`; do
tc filter add dev lo parent $((I + 1)): u32 match ip src 1.1.1.$J
done
done
time tc qdisc del dev root
real 0m54.764s
user 0m0.023s
sys 0m0.000s
The rcu_barrier() there is to ensure we free the block after all chains
are gone, that is, to queue tcf_block_put_final() at the tail of workqueue.
We can achieve this ordering requirement by refcnt'ing tcf block instead,
that is, the tcf block is freed only when the last chain in this block is
gone. This also simplifies the code.
Paolo reported after this patch we get:
real 0m0.017s
user 0m0.000s
sys 0m0.017s
Tested-by: Paolo Abeni <pabeni@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a60b3f515d30d0fe8537c64671926879a3548103 upstream.
tcf_block_put_ext has assumed that all filters (and thus their goto
actions) are destroyed in RCU callback and thus can not race with our
list iteration. However, that is not true during netns cleanup (see
tcf_exts_get_net comment).
Prevent the user after free by holding all chains (except 0, that one is
already held). foreach_safe is not enough in this case.
To reproduce, run the following in a netns and then delete the ns:
ip link add dtest type dummy
tc qdisc add dev dtest ingress
tc filter add dev dtest chain 1 parent ffff: handle 1 prio 1 flower action goto chain 2
Fixes: 822e86d997 ("net_sched: remove tcf_block_put_deferred()")
Signed-off-by: Roman Kapl <code@rkapl.cz>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|