summaryrefslogtreecommitdiff
path: root/net/sched/act_api.c
AgeCommit message (Collapse)AuthorFilesLines
2021-04-14net: sched: bump refcount for new action in ACT replace modeKumar Kartikeya Dwivedi1-0/+3
commit 6855e8213e06efcaf7c02a15e12b1ae64b9a7149 upstream. Currently, action creation using ACT API in replace mode is buggy. When invoking for non-existent action index 42, tc action replace action bpf obj foo.o sec <xyz> index 42 kernel creates the action, fills up the netlink response, and then just deletes the action after notifying userspace. tc action show action bpf doesn't list the action. This happens due to the following sequence when ovr = 1 (replace mode) is enabled: tcf_idr_check_alloc is used to atomically check and either obtain reference for existing action at index, or reserve the index slot using a dummy entry (ERR_PTR(-EBUSY)). This is necessary as pointers to these actions will be held after dropping the idrinfo lock, so bumping the reference count is necessary as we need to insert the actions, and notify userspace by dumping their attributes. Finally, we drop the reference we took using the tcf_action_put_many call in tcf_action_add. However, for the case where a new action is created due to free index, its refcount remains one. This when paired with the put_many call leads to the kernel setting up the action, notifying userspace of its creation, and then tearing it down. For existing actions, the refcount is still held so they remain unaffected. Fortunately due to rtnl_lock serialization requirement, such an action with refcount == 1 will not be concurrently deleted by anything else, at best CLS API can move its refcount up and down by binding to it after it has been published from tcf_idr_insert_many. Since refcount is atleast one until put_many call, CLS API cannot delete it. Also __tcf_action_put release path already ensures deterministic outcome (either new action will be created or existing action will be reused in case CLS API tries to bind to action concurrently) due to idr lock serialization. We fix this by making refcount of newly created actions as 2 in ACT API replace mode. A relaxed store will suffice as visibility is ensured only after the tcf_idr_insert_many call. Note that in case of creation or overwriting using CLS API only (i.e. bind = 1), overwriting existing action object is not allowed, and any such request is silently ignored (without error). The refcount bump that occurs in tcf_idr_check_alloc call there for existing action will pair with tcf_exts_destroy call made from the owner module for the same action. In case of action creation, there is no existing action, so no tcf_exts_destroy callback happens. This means no code changes for CLS API. Fixes: cae422f379f3 ("net: sched: use reference counting action init") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-29net: avoid potential infinite loop in tc_ctl_action()Eric Dumazet1-6/+8
[ Upstream commit 39f13ea2f61b439ebe0060393e9c39925c9ee28c ] tc_ctl_action() has the ability to loop forever if tcf_action_add() returns -EAGAIN. This special case has been done in case a module needed to be loaded, but it turns out that tcf_add_notify() could also return -EAGAIN if the socket sk_rcvbuf limit is hit. We need to separate the two cases, and only loop for the module loading case. While we are at it, add a limit of 10 attempts since unbounded loops are always scary. syzbot repro was something like : socket(PF_NETLINK, SOCK_RAW|SOCK_NONBLOCK, NETLINK_ROUTE) = 3 write(3, ..., 38) = 38 setsockopt(3, SOL_SOCKET, SO_RCVBUF, [0], 4) = 0 sendmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{..., 388}], msg_controllen=0, msg_flags=0x10}, ...) NMI backtrace for cpu 0 CPU: 0 PID: 1054 Comm: khungtaskd Not tainted 5.4.0-rc1+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101 nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62 arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38 trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline] check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline] watchdog+0x9d0/0xef0 kernel/hung_task.c:289 kthread+0x361/0x430 kernel/kthread.c:255 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352 Sending NMI from CPU 0 to CPUs 1: NMI backtrace for cpu 1 CPU: 1 PID: 8859 Comm: syz-executor910 Not tainted 5.4.0-rc1+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:arch_local_save_flags arch/x86/include/asm/paravirt.h:751 [inline] RIP: 0010:lockdep_hardirqs_off+0x1df/0x2e0 kernel/locking/lockdep.c:3453 Code: 5c 08 00 00 5b 41 5c 41 5d 5d c3 48 c7 c0 58 1d f3 88 48 ba 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 0f 85 d3 00 00 00 <48> 83 3d 21 9e 99 07 00 0f 84 b9 00 00 00 9c 58 0f 1f 44 00 00 f6 RSP: 0018:ffff8880a6f3f1b8 EFLAGS: 00000046 RAX: 1ffffffff11e63ab RBX: ffff88808c9c6080 RCX: 0000000000000000 RDX: dffffc0000000000 RSI: 0000000000000000 RDI: ffff88808c9c6914 RBP: ffff8880a6f3f1d0 R08: ffff88808c9c6080 R09: fffffbfff16be5d1 R10: fffffbfff16be5d0 R11: 0000000000000003 R12: ffffffff8746591f R13: ffff88808c9c6080 R14: ffffffff8746591f R15: 0000000000000003 FS: 00000000011e4880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffff600400 CR3: 00000000a8920000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: trace_hardirqs_off+0x62/0x240 kernel/trace/trace_preemptirq.c:45 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline] _raw_spin_lock_irqsave+0x6f/0xcd kernel/locking/spinlock.c:159 __wake_up_common_lock+0xc8/0x150 kernel/sched/wait.c:122 __wake_up+0xe/0x10 kernel/sched/wait.c:142 netlink_unlock_table net/netlink/af_netlink.c:466 [inline] netlink_unlock_table net/netlink/af_netlink.c:463 [inline] netlink_broadcast_filtered+0x705/0xb80 net/netlink/af_netlink.c:1514 netlink_broadcast+0x3a/0x50 net/netlink/af_netlink.c:1534 rtnetlink_send+0xdd/0x110 net/core/rtnetlink.c:714 tcf_add_notify net/sched/act_api.c:1343 [inline] tcf_action_add+0x243/0x370 net/sched/act_api.c:1362 tc_ctl_action+0x3b5/0x4bc net/sched/act_api.c:1410 rtnetlink_rcv_msg+0x463/0xb00 net/core/rtnetlink.c:5386 netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477 rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5404 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:657 ___sys_sendmsg+0x803/0x920 net/socket.c:2311 __sys_sendmsg+0x105/0x1d0 net/socket.c:2356 __do_sys_sendmsg net/socket.c:2365 [inline] __se_sys_sendmsg net/socket.c:2363 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x440939 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot+cf0adbb9c28c8866c788@syzkaller.appspotmail.com Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-04net: sched: don't use tc_action->order during action dumpVlad Buslov1-2/+1
[ Upstream commit 4097e9d250fb17958c1d9b94538386edd3f20144 ] Function tcf_action_dump() relies on tc_action->order field when starting nested nla to send action data to userspace. This approach breaks in several cases: - When multiple filters point to same shared action, tc_action->order field is overwritten each time it is attached to filter. This causes filter dump to output action with incorrect attribute for all filters that have the action in different position (different order) from the last set tc_action->order value. - When action data is displayed using tc action API (RTM_GETACTION), action order is overwritten by tca_action_gd() according to its position in resulting array of nl attributes, which will break filter dump for all filters attached to that shared action that expect it to have different order value. Don't rely on tc_action->order when dumping actions. Set nla according to action position in resulting array of actions instead. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-04net: sched: null actions array pointer before releasing actionVlad Buslov1-1/+1
Currently, tcf_action_delete() nulls actions array pointer after putting and deleting it. However, if tcf_idr_delete_index() returns an error, pointer to action is not set to null. That results it being released second time in error handling code of tca_action_gd(). Kasan error: [ 807.367755] ================================================================== [ 807.375844] BUG: KASAN: use-after-free in tc_setup_cb_call+0x14e/0x250 [ 807.382763] Read of size 8 at addr ffff88033e636000 by task tc/2732 [ 807.391289] CPU: 0 PID: 2732 Comm: tc Tainted: G W 4.19.0-rc1+ #799 [ 807.399542] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017 [ 807.407948] Call Trace: [ 807.410763] dump_stack+0x92/0xeb [ 807.414456] print_address_description+0x70/0x360 [ 807.419549] kasan_report+0x14d/0x300 [ 807.423582] ? tc_setup_cb_call+0x14e/0x250 [ 807.428150] tc_setup_cb_call+0x14e/0x250 [ 807.432539] ? nla_put+0x65/0xe0 [ 807.436146] fl_dump+0x394/0x3f0 [cls_flower] [ 807.440890] ? fl_tmplt_dump+0x140/0x140 [cls_flower] [ 807.446327] ? lock_downgrade+0x320/0x320 [ 807.450702] ? lock_acquire+0xe2/0x220 [ 807.454819] ? is_bpf_text_address+0x5/0x140 [ 807.459475] ? memcpy+0x34/0x50 [ 807.462980] ? nla_put+0x65/0xe0 [ 807.466582] tcf_fill_node+0x341/0x430 [ 807.470717] ? tcf_block_put+0xe0/0xe0 [ 807.474859] tcf_node_dump+0xdb/0xf0 [ 807.478821] fl_walk+0x8e/0x170 [cls_flower] [ 807.483474] tcf_chain_dump+0x35a/0x4d0 [ 807.487703] ? tfilter_notify+0x170/0x170 [ 807.492091] ? tcf_fill_node+0x430/0x430 [ 807.496411] tc_dump_tfilter+0x362/0x3f0 [ 807.500712] ? tc_del_tfilter+0x850/0x850 [ 807.505104] ? kasan_unpoison_shadow+0x30/0x40 [ 807.509940] ? __mutex_unlock_slowpath+0xcf/0x410 [ 807.515031] netlink_dump+0x263/0x4f0 [ 807.519077] __netlink_dump_start+0x2a0/0x300 [ 807.523817] ? tc_del_tfilter+0x850/0x850 [ 807.528198] rtnetlink_rcv_msg+0x46a/0x6d0 [ 807.532671] ? rtnl_fdb_del+0x3f0/0x3f0 [ 807.536878] ? tc_del_tfilter+0x850/0x850 [ 807.541280] netlink_rcv_skb+0x18d/0x200 [ 807.545570] ? rtnl_fdb_del+0x3f0/0x3f0 [ 807.549773] ? netlink_ack+0x500/0x500 [ 807.553913] netlink_unicast+0x2d0/0x370 [ 807.558212] ? netlink_attachskb+0x340/0x340 [ 807.562855] ? _copy_from_iter_full+0xe9/0x3e0 [ 807.567677] ? import_iovec+0x11e/0x1c0 [ 807.571890] netlink_sendmsg+0x3b9/0x6a0 [ 807.576192] ? netlink_unicast+0x370/0x370 [ 807.580684] ? netlink_unicast+0x370/0x370 [ 807.585154] sock_sendmsg+0x6b/0x80 [ 807.589015] ___sys_sendmsg+0x4a1/0x520 [ 807.593230] ? copy_msghdr_from_user+0x210/0x210 [ 807.598232] ? do_wp_page+0x174/0x880 [ 807.602276] ? __handle_mm_fault+0x749/0x1c10 [ 807.607021] ? __handle_mm_fault+0x1046/0x1c10 [ 807.611849] ? __pmd_alloc+0x320/0x320 [ 807.615973] ? check_chain_key+0x140/0x1f0 [ 807.620450] ? check_chain_key+0x140/0x1f0 [ 807.624929] ? __fget_light+0xbc/0xd0 [ 807.628970] ? __sys_sendmsg+0xd7/0x150 [ 807.633172] __sys_sendmsg+0xd7/0x150 [ 807.637201] ? __ia32_sys_shutdown+0x30/0x30 [ 807.641846] ? up_read+0x53/0x90 [ 807.645442] ? __do_page_fault+0x484/0x780 [ 807.649949] ? do_syscall_64+0x1e/0x2c0 [ 807.654164] do_syscall_64+0x72/0x2c0 [ 807.658198] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 807.663625] RIP: 0033:0x7f42e9870150 [ 807.667568] Code: 8b 15 3c 7d 2b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 83 3d b9 d5 2b 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be cd 00 00 48 89 04 24 [ 807.687328] RSP: 002b:00007ffdbf595b58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 807.695564] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f42e9870150 [ 807.703083] RDX: 0000000000000000 RSI: 00007ffdbf595b80 RDI: 0000000000000003 [ 807.710605] RBP: 00007ffdbf599d90 R08: 0000000000679bc0 R09: 000000000000000f [ 807.718127] R10: 00000000000005e7 R11: 0000000000000246 R12: 00007ffdbf599d88 [ 807.725651] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 807.735048] Allocated by task 2687: [ 807.738902] kasan_kmalloc+0xa0/0xd0 [ 807.742852] __kmalloc+0x118/0x2d0 [ 807.746615] tcf_idr_create+0x44/0x320 [ 807.750738] tcf_nat_init+0x41e/0x530 [act_nat] [ 807.755638] tcf_action_init_1+0x4e0/0x650 [ 807.760104] tcf_action_init+0x1ce/0x2d0 [ 807.764395] tcf_exts_validate+0x1d8/0x200 [ 807.768861] fl_change+0x55a/0x26b4 [cls_flower] [ 807.773845] tc_new_tfilter+0x748/0xa20 [ 807.778051] rtnetlink_rcv_msg+0x56a/0x6d0 [ 807.782517] netlink_rcv_skb+0x18d/0x200 [ 807.786804] netlink_unicast+0x2d0/0x370 [ 807.791095] netlink_sendmsg+0x3b9/0x6a0 [ 807.795387] sock_sendmsg+0x6b/0x80 [ 807.799240] ___sys_sendmsg+0x4a1/0x520 [ 807.803445] __sys_sendmsg+0xd7/0x150 [ 807.807473] do_syscall_64+0x72/0x2c0 [ 807.811506] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 807.818776] Freed by task 2728: [ 807.822283] __kasan_slab_free+0x122/0x180 [ 807.826752] kfree+0xf4/0x2f0 [ 807.830080] __tcf_action_put+0x5a/0xb0 [ 807.834281] tcf_action_put_many+0x46/0x70 [ 807.838747] tca_action_gd+0x232/0xc40 [ 807.842862] tc_ctl_action+0x215/0x230 [ 807.846977] rtnetlink_rcv_msg+0x56a/0x6d0 [ 807.851444] netlink_rcv_skb+0x18d/0x200 [ 807.855731] netlink_unicast+0x2d0/0x370 [ 807.860021] netlink_sendmsg+0x3b9/0x6a0 [ 807.864312] sock_sendmsg+0x6b/0x80 [ 807.868166] ___sys_sendmsg+0x4a1/0x520 [ 807.872372] __sys_sendmsg+0xd7/0x150 [ 807.876401] do_syscall_64+0x72/0x2c0 [ 807.880431] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 807.887704] The buggy address belongs to the object at ffff88033e636000 which belongs to the cache kmalloc-256 of size 256 [ 807.900909] The buggy address is located 0 bytes inside of 256-byte region [ffff88033e636000, ffff88033e636100) [ 807.913155] The buggy address belongs to the page: [ 807.918322] page:ffffea000cf98d80 count:1 mapcount:0 mapping:ffff88036f80ee00 index:0x0 compound_mapcount: 0 [ 807.928831] flags: 0x5fff8000008100(slab|head) [ 807.933647] raw: 005fff8000008100 ffffea000db44f00 0000000400000004 ffff88036f80ee00 [ 807.942050] raw: 0000000000000000 0000000080190019 00000001ffffffff 0000000000000000 [ 807.950456] page dumped because: kasan: bad access detected [ 807.958240] Memory state around the buggy address: [ 807.963405] ffff88033e635f00: fc fc fc fc fb fb fb fb fb fb fb fc fc fc fc fb [ 807.971288] ffff88033e635f80: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc [ 807.979166] >ffff88033e636000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 807.994882] ^ [ 807.998477] ffff88033e636080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 808.006352] ffff88033e636100: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb [ 808.014230] ================================================================== [ 808.022108] Disabling lock debugging due to kernel taint Fixes: edfaf94fa705 ("net_sched: improve and refactor tcf_action_put_many()") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-30net_sched: reject unknown tcfa_action valuesPaolo Abeni1-5/+11
After the commit 802bfb19152c ("net/sched: user-space can't set unknown tcfa_action values"), unknown tcfa_action values are converted to TC_ACT_UNSPEC, but the common agreement is instead rejecting such configurations. This change also introduces a helper to simplify the destruction of a single action, avoiding code duplication. v1 -> v2: - helper is now static and renamed according to act_* convention - updated extack message, according to the new behavior Fixes: 802bfb19152c ("net/sched: user-space can't set unknown tcfa_action values") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-21net_sched: remove list_head from tc_actionCong Wang1-1/+0
After commit 90b73b77d08e, list_head is no longer needed. Now we just need to convert the list iteration to array iteration for drivers. Fixes: 90b73b77d08e ("net: sched: change action API to use array of pointers to actions") Cc: Jiri Pirko <jiri@mellanox.com> Cc: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-21net_sched: remove unused tcf_idr_check()Cong Wang1-19/+3
tcf_idr_check() is replaced by tcf_idr_check_alloc(), and __tcf_idr_check() now can be folded into tcf_idr_search(). Fixes: 0190c1d452a9 ("net: sched: atomically check-allocate action") Cc: Jiri Pirko <jiri@mellanox.com> Cc: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-21net_sched: remove unused parameter for tcf_action_delete()Cong Wang1-3/+2
Fixes: 16af6067392c ("net: sched: implement reference counted action release") Cc: Jiri Pirko <jiri@mellanox.com> Cc: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-21net_sched: remove unnecessary ops->delete()Cong Wang1-8/+7
All ops->delete() wants is getting the tn->idrinfo, but we already have tc_action before calling ops->delete(), and tc_action has a pointer ->idrinfo. More importantly, each type of action does the same thing, that is, just calling tcf_idr_delete_index(). So it can be just removed. Fixes: b409074e6693 ("net: sched: add 'delete' function to action ops") Cc: Jiri Pirko <jiri@mellanox.com> Cc: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-21net_sched: improve and refactor tcf_action_put_many()Cong Wang1-16/+15
tcf_action_put_many() is mostly called to clean up actions on failure path, but tcf_action_put_many(&actions[acts_deleted]) is used in the ugliest way: it passes a slice of the array and uses an additional NULL at the end to avoid out-of-bound access. acts_deleted is completely unnecessary since we can teach tcf_action_put_many() scan the whole array and checks against NULL pointer. Which also means tcf_action_delete() should set deleted action pointers to NULL to avoid double free. Fixes: 90b73b77d08e ("net: sched: change action API to use array of pointers to actions") Cc: Jiri Pirko <jiri@mellanox.com> Cc: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30net/sched: user-space can't set unknown tcfa_action valuesPaolo Abeni1-0/+14
Currently, when initializing an action, the user-space can specify and use arbitrary values for the tcfa_action field. If the value is unknown by the kernel, is implicitly threaded as TC_ACT_UNSPEC. This change explicitly checks for unknown values at action creation time, and explicitly convert them to TC_ACT_UNSPEC. No functional changes are introduced, but this will allow introducing tcfa_action values not exposed to user-space in a later patch. Note: we can't use the above to hide TC_ACT_REDIRECT from user-space, as the latter is already part of uAPI. v3 -> v4: - use an helper to check for action validity (JiriP) - emit an extack for invalid actions (JiriP) v4 -> v5: - keep messages on a single line, drop net_warn (Marcelo) Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-27net: sched: don't dump chains only held by actionsJiri Pirko1-2/+2
In case a chain is empty and not explicitly created by a user, such chain should not exist. The only exception is if there is an action "goto chain" pointing to it. In that case, don't show the chain in the dump. Track the chain references held by actions and use them to find out if a chain should or should not be shown in chain dump. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12net: sched: fix unprotected access to rcu cookie pointerVlad Buslov1-2/+7
Fix action attribute size calculation function to take rcu read lock and access act_cookie pointer with rcu dereference. Fixes: eec94fdb0480 ("net: sched: use rcu for action cookie update") Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08net: sched: Fix warnings from xchg() on RCU'd cookie pointer.David S. Miller1-1/+1
The kbuild test robot reports: >> net/sched/act_api.c:71:15: sparse: incorrect type in initializer (different address spaces) @@ expected struct tc_cookie [noderef] <asn:4>*__ret @@ got [noderef] <asn:4>*__ret @@ net/sched/act_api.c:71:15: expected struct tc_cookie [noderef] <asn:4>*__ret net/sched/act_api.c:71:15: got struct tc_cookie *new_cookie >> net/sched/act_api.c:71:13: sparse: incorrect type in assignment (different address spaces) @@ expected struct tc_cookie *old @@ got struct tc_cookie [noderef] <struct tc_cookie *old @@ net/sched/act_api.c:71:13: expected struct tc_cookie *old net/sched/act_api.c:71:13: got struct tc_cookie [noderef] <asn:4>*[assigned] __ret >> net/sched/act_api.c:132:48: sparse: dereference of noderef expression Handle this in the usual way by force casting away the __rcu annotation when we are using xchg() on it. Fixes: eec94fdb0480 ("net: sched: use rcu for action cookie update") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08net: sched: change action API to use array of pointers to actionsVlad Buslov1-39/+50
Act API used linked list to pass set of actions to functions. It is intrusive data structure that stores list nodes inside action structure itself, which means it is not safe to modify such list concurrently. However, action API doesn't use any linked list specific operations on this set of actions, so it can be safely refactored into plain pointer array. Refactor action API to use array of pointers to tc_actions instead of linked list. Change argument 'actions' type of exported action init, destroy and dump functions. Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08net: sched: atomically check-allocate actionVlad Buslov1-20/+72
Implement function that atomically checks if action exists and either takes reference to it, or allocates idr slot for action index to prevent concurrent allocations of actions with same index. Use EBUSY error pointer to indicate that idr slot is reserved. Implement cleanup helper function that removes temporary error pointer from idr. (in case of error between idr allocation and insertion of newly created action to specified index) Refactor all action init functions to insert new action to idr using this API. Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08net: sched: use reference counting action initVlad Buslov1-18/+17
Change action API to assume that action init function always takes reference to action, even when overwriting existing action. This is necessary because action API continues to use action pointer after init function is done. At this point action becomes accessible for concurrent modifications, so user must always hold reference to it. Implement helper put list function to atomically release list of actions after action API init code is done using them. Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08net: sched: don't release reference on action overwriteVlad Buslov1-2/+0
Return from action init function with reference to action taken, even when overwriting existing action. Action init API initializes its fourth argument (pointer to pointer to tc action) to either existing action with same index or newly created action. In case of existing index(and bind argument is zero), init function returns without incrementing action reference counter. Caller of action init then proceeds working with action, without actually holding reference to it. This means that action could be deleted concurrently. Change action init behavior to always take reference to action before returning successfully, in order to protect from concurrent deletion. Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08net: sched: implement reference counted action releaseVlad Buslov1-22/+62
Implement helper delete function that uses new action ops 'delete', instead of destroying action directly. This is required so act API could delete actions by index, without holding any references to action that is being deleted. Implement function __tcf_action_put() that releases reference to action and frees it, if necessary. Refactor action deletion code to use new put function and not to rely on rtnl lock. Remove rtnl lock assertions that are no longer needed. Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08net: sched: implement action API that deletes action by indexVlad Buslov1-0/+39
Implement new action API function that atomically finds and deletes action from idr by index. Intended to be used by lockless actions that do not rely on rtnl lock. Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08net: sched: always take reference to actionVlad Buslov1-26/+20
Without rtnl lock protection it is no longer safe to use pointer to tc action without holding reference to it. (it can be destroyed concurrently) Remove unsafe action idr lookup function. Instead of it, implement safe tcf idr check function that atomically looks up action in idr and increments its reference and bind counters. Implement both action search and check using new safe function Reference taken by idr check is temporal and should not be accounted by userspace clients (both logically and to preserver current API behavior). Subtract temporal reference when dumping action to userspace using existing tca_get_fill function arguments. Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08net: sched: implement unlocked action init APIVlad Buslov1-7/+11
Add additional 'rtnl_held' argument to act API init functions. It is required to implement actions that need to release rtnl lock before loading kernel module and reacquire if afterwards. Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08net: sched: change type of reference and bind countersVlad Buslov1-10/+22
Change type of action reference counter to refcount_t. Change type of action bind counter to atomic_t. This type is used to allow decrementing bind counter without testing for 0 result. Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-08net: sched: use rcu for action cookie updateVlad Buslov1-14/+30
Implement functions to atomically update and free action cookie using rcu mechanism. Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22net: sched: don't disable bh when accessing action idrVlad Buslov1-10/+10
Initial net_device implementation used ingress_lock spinlock to synchronize ingress path of device. This lock was used in both process and bh context. In some code paths action map lock was obtained while holding ingress_lock. Commit e1e992e52faa ("[NET_SCHED] protect action config/dump from irqs") modified actions to always disable bh, while using action map lock, in order to prevent deadlock on ingress_lock in softirq. This lock was removed from net_device, so disabling bh, while accessing action map, is no longer necessary. Replace all action idr spinlock usage with regular calls that do not disable bh. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+3
Minor conflicts in drivers/net/ethernet/mellanox/mlx5/core/en_rep.c, we had some overlapping changes: 1) In 'net' MLX5E_PARAMS_LOG_{SQ,RQ}_SIZE --> MLX5E_REP_PARAMS_LOG_{SQ,RQ}_SIZE 2) In 'net-next' params->log_rq_size is renamed to be params->log_rq_mtu_frames. 3) In 'net-next' params->hard_mtu is added. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net: Drop pernet_operations::asyncKirill Tkhai1-1/+0
Synchronous pernet_operations are not allowed anymore. All are asynchronous. So, drop the structure member. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27net sched actions: fix dumping which requires several messages to user spaceCraig Dillabaugh1-1/+3
Fixes a bug in the tcf_dump_walker function that can cause some actions to not be reported when dumping a large number of actions. This issue became more aggrevated when cookies feature was added. In particular this issue is manifest when large cookie values are assigned to the actions and when enough actions are created that the resulting table must be dumped in multiple batches. The number of actions returned in each batch is limited by the total number of actions and the memory buffer size. With small cookies the numeric limit is reached before the buffer size limit, which avoids the code path triggering this bug. When large cookies are used buffer fills before the numeric limit, and the erroneous code path is hit. For example after creating 32 csum actions with the cookie aaaabbbbccccdddd $ tc actions ls action csum total acts 26 action order 0: csum (tcp) action continue index 1 ref 1 bind 0 cookie aaaabbbbccccdddd ..... action order 25: csum (tcp) action continue index 26 ref 1 bind 0 cookie aaaabbbbccccdddd total acts 6 action order 0: csum (tcp) action continue index 28 ref 1 bind 0 cookie aaaabbbbccccdddd ...... action order 5: csum (tcp) action continue index 32 ref 1 bind 0 cookie aaaabbbbccccdddd Note that the action with index 27 is omitted from the report. Fixes: 4b3550ef530c ("[NET_SCHED]: Use nla_nest_start/nla_nest_end")" Signed-off-by: Craig Dillabaugh <cdillaba@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-24net/sched: remove tcf_idr_cleanup()Davide Caratti1-8/+0
tcf_idr_cleanup() is no more used, so remove it. Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net sched actions: calculate add/delete event message sizeRoman Mashak1-0/+43
Introduce routines to calculate size of the shared tc netlink attributes and the full message size including netlink header and tc service header. Update add/delete action logic to have the size for event messages, the size is passed to tcf_add_notify() and tcf_del_notify() where the notification message is being allocated and constructed. Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09net sched actions: update Add/Delete action API with new argumentRoman Mashak1-8/+13
Introduce a new function argument to carry total attributes size for correct allocation of skb in event messages. Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-05net sched actions: corrected extack messageRoman Mashak1-1/+1
Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-17net: sched: act: handle extack in tcf_generic_walkerAlexander Aring1-2/+4
This patch adds extack handling for a common used TC act function "tcf_generic_walker()" to add an extack message on failures. The tcf_generic_walker() function can fail if get a invalid command different than DEL and GET. The naming "action" here is wrong, the correct naming would be command. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-17net: sched: act: add extack for walk callbackAlexander Aring1-2/+2
This patch adds extack support for act walker callback api. This prepares to handle extack support inside each specific act implementation. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-17net: sched: act: add extack for lookup callbackAlexander Aring1-1/+1
This patch adds extack support for act lookup callback api. This prepares to handle extack support inside each specific act implementation. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-17net: sched: act: add extack to init callbackAlexander Aring1-2/+3
This patch adds extack support for act init callback api. This prepares to handle extack support inside each specific act implementation. Based on work by David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-17net: sched: act: handle generic action errorsAlexander Aring1-32/+61
This patch adds extack support for generic act handling. The extack will be set deeper to each called function which is not part of netdev core api. Based on work by David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-17net: sched: act: add extack to initAlexander Aring1-6/+11
This patch adds extack to tcf_action_init and tcf_action_init_1 functions. These are necessary to make individual extack handling in each act implementation. Based on work by David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-17net: sched: act: fix code styleAlexander Aring1-6/+6
This patch is used by subsequent patches. It fixes code style issues caught by checkpatch. Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-17net: Revert sched action extack support series.David S. Miller1-17/+12
It was mis-applied and the changes had rejects. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16net: sched: act: add extack to initAlexander Aring1-6/+11
This patch adds extack to tcf_action_init and tcf_action_init_1 functions. These are necessary to make individual extack handling in each act implementation. Based on work by David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16net: sched: act: fix code styleAlexander Aring1-6/+6
This patch is used by subsequent patches. It fixes code style issues caught by checkpatch. Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16net: sched: fix unbalance in the error path of tca_action_flush()Davide Caratti1-1/+3
When tca_action_flush() calls the action walk() and gets an error, a successful call to nla_nest_start() is not followed by a call to nla_nest_cancel(). It's harmless, as the skb is freed in the error path - but it's worth to fix this unbalance. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-13net: Convert subsys_initcall() registered pernet_operations from net/schedKirill Tkhai1-0/+1
psched_net_ops only creates and destroyes /proc entry, and safe to be executed in parallel with any foreigh pernet_operations. tcf_action_net_ops initializes and destructs tcf_action_net::egdev_ht, which is not touched by foreign pernet_operations. So, make them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07idr: Rename idr_for_each_entry_extMatthew Wilcox1-3/+3
Most places in the kernel that we need to distinguish functions by the type of their arguments, we use '_ul' as a suffix for the unsigned long variant, not '_ext'. Also add kernel-doc. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2018-02-07net sched actions: Convert to use idr_alloc_u32Matthew Wilcox1-35/+25
Use the new helper. Also untangle the error path, and in so doing noticed that estimator generator failure would lead to us leaking an ID. Fix that bug. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2018-02-07idr: Delete idr_find_ext functionMatthew Wilcox1-1/+1
Simply changing idr_remove's 'id' argument to 'unsigned long' works for all callers. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2018-02-07idr: Delete idr_replace_ext functionMatthew Wilcox1-1/+1
Changing idr_replace's 'id' argument to 'unsigned long' works for all callers. Callers which passed a negative ID now get -ENOENT instead of -EINVAL. No callers relied on this error value. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2018-02-07idr: Delete idr_remove_ext functionMatthew Wilcox1-1/+1
Simply changing idr_remove's 'id' argument to 'unsigned long' suffices for all callers. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-12-06net_sched: remove unused parameter from act cleanup opsCong Wang1-1/+1
No one actually uses it. Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>