summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-03-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller49-499/+1383
Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-02-28 The following pull-request contains BPF updates for your *net-next* tree. We've added 41 non-merge commits during the last 7 day(s) which contain a total of 49 files changed, 1383 insertions(+), 499 deletions(-). The main changes are: 1) BPF and Real-Time nicely co-exist. 2) bpftool feature improvements. 3) retrieve bpf_sk_storage via INET_DIAG. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-28Merge branch 'net-cleanup-datagram-receive-helpers'David S. Miller6-53/+30
Paolo Abeni says: ==================== net: cleanup datagram receive helpers Several receive helpers have an optional destructor argument, which uglify the code a bit and is taxed by retpoline overhead. This series refactor the code so that we can drop such optional argument, cleaning the helpers a bit and avoiding an indirect call in fast path. The first patch refactor a bit the caller, so that the second patch actually dropping the argument is more straight-forward v1 -> v2: - call scm_stat_del() only when not peeking - Kirill - fix build issue with CONFIG_INET_ESPINTCP ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-28net: datagram: drop 'destructor' argument from several helpersPaolo Abeni5-37/+23
The only users for such argument are the UDP protocol and the UNIX socket family. We can safely reclaim the accounted memory directly from the UDP code and, after the previous patch, we can do scm stats accounting outside the datagram helpers. Overall this cleans up a bit some datagram-related helpers, and avoids an indirect call per packet in the UDP receive path. v1 -> v2: - call scm_stat_del() only when not peeking - Kirill - fix build issue with CONFIG_INET_ESPINTCP Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-28unix: uses an atomic type for scm files accountingPaolo Abeni2-16/+7
So the scm_stat_{add,del} helper can be invoked with no additional lock held. This clean-up the code a bit and will make the next patch easier. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-28af_unix: Replace zero-length array with flexible-array memberGustavo A. R. Silva1-1/+1
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-28bonding: Replace zero-length array with flexible-array memberGustavo A. R. Silva1-1/+1
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-28net: core: Replace zero-length array with flexible-array memberGustavo A. R. Silva3-3/+3
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-28ipv6: Replace zero-length array with flexible-array memberGustavo A. R. Silva2-2/+2
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-28net: dccp: Replace zero-length array with flexible-array memberGustavo A. R. Silva2-2/+2
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-28l2tp: Replace zero-length array with flexible-array memberGustavo A. R. Silva1-1/+1
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] Lastly, fix the following checkpatch warning: CHECK: Prefer kernel type 'u8' over 'uint8_t' #50: FILE: net/l2tp/l2tp_core.h:119: + uint8_t priv[]; /* private data */ This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-28net: mpls: Replace zero-length array with flexible-array memberGustavo A. R. Silva2-3/+3
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-28xdp: Replace zero-length array with flexible-array memberGustavo A. R. Silva1-2/+2
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-28Merge tag 'mlx5-updates-2020-02-27' of ↵David S. Miller30-129/+519
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2020-02-27 mlx5 misc updates and minor cleanups: 1) Use per vport tables for mirroring 2) Improve log messages for SW steering (DR) 3) Add devlink fdb_large_groups parameter 4) E-Switch, Allow goto earlier chain 5) Don't allow forwarding between uplink representors 6) Add support for devlink-port in non-representors mode 7) Minor misc cleanups ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-28Merge branch 'bpf_sk_storage_via_inet_diag'Alexei Starovoitov14-185/+599
Martin KaFai Lau says: ==================== The bpf_prog can store specific info to a sk by using bpf_sk_storage. In other words, a sk can be extended by a bpf_prog. This series is to support providing bpf_sk_storage data during inet_diag's dump. The primary target is the usage like iproute2's "ss". The first two patches are refactoring works in inet_diag to make adding bpf_sk_storage support easier. The next two patches do the actual work. Please see individual patch for details. v2: - Add commit message for u16 to u32 change in min_dump_alloc in Patch 4 (Song) - Add comment to explain the !skb->len check in __inet_diag_dump in Patch 4. - Do the map->map_type check earlier in Patch 3 for readability. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-02-28bpf: inet_diag: Dump bpf_sk_storages in inet_diag_dump()Martin KaFai Lau4-2/+82
This patch will dump out the bpf_sk_storages of a sk if the request has the INET_DIAG_REQ_SK_BPF_STORAGES nlattr. An array of SK_DIAG_BPF_STORAGE_REQ_MAP_FD can be specified in INET_DIAG_REQ_SK_BPF_STORAGES to select which bpf_sk_storage to dump. If no map_fd is specified, all bpf_sk_storages of a sk will be dumped. bpf_sk_storages can be added to the system at runtime. It is difficult to find a proper static value for cb->min_dump_alloc. This patch learns the nlattr size required to dump the bpf_sk_storages of a sk. If it happens to be the very first nlmsg of a dump and it cannot fit the needed bpf_sk_storages, it will try to expand the skb by "pskb_expand_head()". Instead of expanding it in inet_sk_diag_fill(), it is expanded at a sleepable context in __inet_diag_dump() so __GFP_DIRECT_RECLAIM can be used. In __inet_diag_dump(), it will retry as long as the skb is empty and the cb->min_dump_alloc becomes larger than before. cb->min_dump_alloc is bounded by KMALLOC_MAX_SIZE. The min_dump_alloc is also changed from 'u16' to 'u32' to accommodate a sk that may have a few large bpf_sk_storages. The updated cb->min_dump_alloc will also be used to allocate the skb in the next dump. This logic already exists in netlink_dump(). Here is the sample output of a locally modified 'ss' and it could be made more readable by using BTF later: [root@arch-fb-vm1 ~]# ss --bpf-map-id 14 --bpf-map-id 13 -t6an 'dst [::1]:8989' State Recv-Q Send-Q Local Address:Port Peer Address:PortProcess ESTAB 0 0 [::1]:51072 [::1]:8989 bpf_map_id:14 value:[ 3feb ] bpf_map_id:13 value:[ 3f ] ESTAB 0 0 [::1]:51070 [::1]:8989 bpf_map_id:14 value:[ 3feb ] bpf_map_id:13 value:[ 3f ] [root@arch-fb-vm1 ~]# ~/devshare/github/iproute2/misc/ss --bpf-maps -t6an 'dst [::1]:8989' State Recv-Q Send-Q Local Address:Port Peer Address:Port Process ESTAB 0 0 [::1]:51072 [::1]:8989 bpf_map_id:14 value:[ 3feb ] bpf_map_id:13 value:[ 3f ] bpf_map_id:12 value:[ 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000... total:65407 ] ESTAB 0 0 [::1]:51070 [::1]:8989 bpf_map_id:14 value:[ 3feb ] bpf_map_id:13 value:[ 3f ] bpf_map_id:12 value:[ 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000... total:65407 ] Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200225230427.1976129-1-kafai@fb.com
2020-02-28bpf: INET_DIAG support in bpf_sk_storageMartin KaFai Lau5-6/+346
This patch adds INET_DIAG support to bpf_sk_storage. 1. Although this series adds bpf_sk_storage diag capability to inet sk, bpf_sk_storage is in general applicable to all fullsock. Hence, the bpf_sk_storage logic will operate on SK_DIAG_* nlattr. The caller will pass in its specific nesting nlattr (e.g. INET_DIAG_*) as the argument. 2. The request will be like: INET_DIAG_REQ_SK_BPF_STORAGES (nla_nest) (defined in latter patch) SK_DIAG_BPF_STORAGE_REQ_MAP_FD (nla_put_u32) SK_DIAG_BPF_STORAGE_REQ_MAP_FD (nla_put_u32) ...... Considering there could have multiple bpf_sk_storages in a sk, instead of reusing INET_DIAG_INFO ("ss -i"), the user can select some specific bpf_sk_storage to dump by specifying an array of SK_DIAG_BPF_STORAGE_REQ_MAP_FD. If no SK_DIAG_BPF_STORAGE_REQ_MAP_FD is specified (i.e. an empty INET_DIAG_REQ_SK_BPF_STORAGES), it will dump all bpf_sk_storages of a sk. 3. The reply will be like: INET_DIAG_BPF_SK_STORAGES (nla_nest) (defined in latter patch) SK_DIAG_BPF_STORAGE (nla_nest) SK_DIAG_BPF_STORAGE_MAP_ID (nla_put_u32) SK_DIAG_BPF_STORAGE_MAP_VALUE (nla_reserve_64bit) SK_DIAG_BPF_STORAGE (nla_nest) SK_DIAG_BPF_STORAGE_MAP_ID (nla_put_u32) SK_DIAG_BPF_STORAGE_MAP_VALUE (nla_reserve_64bit) ...... 4. Unlike other INET_DIAG info of a sk which is pretty static, the size required to dump the bpf_sk_storage(s) of a sk is dynamic as the system adding more bpf_sk_storage_map. It is hard to set a static min_dump_alloc size. Hence, this series learns it at the runtime and adjust the cb->min_dump_alloc as it iterates all sk(s) of a system. The "unsigned int *res_diag_size" in bpf_sk_storage_diag_put() is for this purpose. The next patch will update the cb->min_dump_alloc as it iterates the sk(s). Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200225230421.1975729-1-kafai@fb.com
2020-02-28inet_diag: Move the INET_DIAG_REQ_BYTECODE nlattr to cb->dataMartin KaFai Lau8-64/+98
The INET_DIAG_REQ_BYTECODE nlattr is currently re-found every time when the "dump()" is re-started. In a latter patch, it will also need to parse the new INET_DIAG_REQ_SK_BPF_STORAGES nlattr to learn the map_fds. Thus, this patch takes this chance to store the parsed nlattr in cb->data during the "start" time of a dump. By doing this, the "bc" argument also becomes unnecessary and is removed. Also, the two copies of the INET_DIAG_REQ_BYTECODE parsing-audit logic between compat/current version can be consolidated to one. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200225230415.1975555-1-kafai@fb.com
2020-02-28inet_diag: Refactor inet_sk_diag_fill(), dump(), and dump_one()Martin KaFai Lau7-113/+73
In a latter patch, there is a need to update "cb->min_dump_alloc" in inet_sk_diag_fill() as it learns the diffierent bpf_sk_storages stored in a sk while dumping all sk(s) (e.g. tcp_hashinfo). The inet_sk_diag_fill() currently does not take the "cb" as an argument. One of the reason is inet_sk_diag_fill() is used by both dump_one() and dump() (which belong to the "struct inet_diag_handler". The dump_one() interface does not pass the "cb" along. This patch is to make dump_one() pass a "cb". The "cb" is created in inet_diag_cmd_exact(). The "nlh" and "in_skb" are stored in "cb" as the dump() interface does. The total number of args in inet_sk_diag_fill() is also cut from 10 to 7 and that helps many callers to pass fewer args. In particular, "struct user_namespace *user_ns", "u32 pid", and "u32 seq" can be replaced by accessing "cb->nlh" and "cb->skb". A similar argument reduction is also made to inet_twsk_diag_fill() and inet_req_diag_fill(). inet_csk_diag_dump() and inet_csk_diag_fill() are also removed. They are mostly equivalent to inet_sk_diag_fill(). Their repeated usages are very limited. Thus, inet_sk_diag_fill() is directly used in those occasions. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200225230409.1975173-1-kafai@fb.com
2020-02-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller264-1539/+3508
The mptcp conflict was overlapping additions. The SMC conflict was an additional and removal happening at the same time. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-28net/mlx5e: Remove redundant comment about goto slow pathRoi Dayan1-4/+2
The code is self explanatory and makes the comment redundant. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28net/mlx5e: Reduce number of arguments in slow path handlingEli Cohen1-23/+20
mlx5e_tc_offload_to_slow_path() and mlx5e_tc_unoffload_from_slow_path() take an extra argument allocated on the stack of the caller but not used by the caller. Avoid the extra argument and use local variable in the function itself. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28net/mlx5e: Remove unused argument from parse_tc_pedit_action()Eli Cohen1-5/+3
parse_attr is not used by parse_tc_pedit_action() so revmove it. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28net/mlx5e: Use NL_SET_ERR_MSG_MOD() extack for errorsRoi Dayan1-7/+14
This to be consistent and adds the module name to the error message. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28net/mlx5e: Use netdev_warn() instead of pr_err() for errorsRoi Dayan1-6/+11
This is for added netdev prefix that helps identify the source of the message. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28net/mlx5e: Use netdev_warn() for errors for added prefixRoi Dayan1-11/+16
This helps identify the source of the message. If netdev still doesn't exists use mlx5_core_warn(). Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28net/mlx5: DR, Improve log messagesErez Shitrit8-34/+49
Few print messages are in debug level where they should be in error, and few messages are missing. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28net/mlx5: DR, Change matcher priority parameter typeHamdan Igbaria4-5/+5
Change matcher priority parameter type from u16 to u32, this change is needed since sometimes upper levels create a matcher with priority bigger than 2^16. Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28net/mlx5e: Add devlink fdb_large_groups parameterJianbo Liu7-9/+75
Add a devlink parameter to control the number of large groups in a autogrouped flow table. The default value is 15, and the range is between 1 and 1024. The size of each large group can be calculated according to the following formula: size = 4M / (fdb_large_groups + 1). Examples: - Set the number of large groups to 20. $ devlink dev param set pci/0000:82:00.0 name fdb_large_groups \ cmode driverinit value 20 Then run devlink reload command to apply the new value. $ devlink dev reload pci/0000:82:00.0 - Read the number of large groups in flow table. $ devlink dev param show pci/0000:82:00.0 name fdb_large_groups pci/0000:82:00.0: name fdb_large_groups type driver-specific values: cmode driverinit value 20 Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28net/mlx5: Change the name of steering mode param idJianbo Liu1-3/+3
The prefix should be "MLX5_DEVLINK_PARAM_ID_" for all in mlx5_devlink_param_id enum. Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28net/mlx5e: Add support for devlink-port in non-representors modeVladyslav Tarasiuk5-1/+66
Added devlink_port field to mlx5e_priv structure and a callback to netdev ops to enable devlink to get info about the port. The port registration happens at driver initialization. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28net/mlx5e: Rename representor get devlink port functionVladyslav Tarasiuk1-3/+3
Rename representor's mlx5e_get_devlink_port() to mlx5e_rep_get_devlink_port(). The downstream patch will add a non-representor mlx5e function called mlx5e_get_devlink_phy_port(). Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28net/mlx5: E-Switch, Allow goto earlier chain if FW supports itRoi Dayan3-1/+9
Mellanox FW can support this if ignore_flow_level capability exists. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28net/mlx5e: Eswitch, Use per vport tables for mirroringEli Cohen5-18/+221
When using port mirroring, we forward the traffic to another table and use that table to forward to the mirrored vport. Since the hardware loses the values of reg c, and in particular reg c0, we fail the match on the input vport which previously existed in reg c0. To overcome this situation, we use a set of per vport tables, positioned at the lowest priority, and forward traffic to those tables. Since these tables are per vport, we can avoid matching on reg c0. Fixes: c01cfd0f1115 ("net/mlx5: E-Switch, Add match on vport metadata for rule in fast path") Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28net/mlx5: Eswitch, avoid redundant maskEli Cohen1-2/+2
misc_params.source_port is a 16 bit field already so no need for redundant masking against 0xffff. Also change local variables type to u16. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28net/mlx5e: Don't allow forwarding between uplinkTonghao Zhang3-0/+23
We can install forwarding packets rule between uplink in switchdev mode, as show below. But the hardware does not do that as expected (mlnx_perf -i $PF1, we can't get the counter of the PF1). By the way, if we add the uplink PF0, PF1 to Open vSwitch and enable hw-offload, the rules can be offloaded but not work fine too. This patch add a check and if so return -EOPNOTSUPP. $ tc filter add dev $PF0 protocol all parent ffff: prio 1 handle 1 \ flower skip_sw action mirred egress redirect dev $PF1 $ tc -d -s filter show dev $PF0 ingress skip_sw in_hw in_hw_count 1 action order 1: mirred (Egress Redirect to device enp130s0f1) stolen ... Sent hardware 408954 bytes 4173 pkt ... Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds70-536/+1062
Pull networking fixes from David Miller: 1) Fix leak in nl80211 AP start where we leak the ACL memory, from Johannes Berg. 2) Fix double mutex unlock in mac80211, from Andrei Otcheretianski. 3) Fix RCU stall in ipset, from Jozsef Kadlecsik. 4) Fix devlink locking in devlink_dpipe_table_register, from Madhuparna Bhowmik. 5) Fix race causing TX hang in ll_temac, from Esben Haabendal. 6) Stale eth hdr pointer in br_dev_xmit(), from Nikolay Aleksandrov. 7) Fix TX hash calculation bounds checking wrt. tc rules, from Amritha Nambiar. 8) Size netlink responses properly in schedule action code to take into consideration TCA_ACT_FLAGS. From Jiri Pirko. 9) Fix firmware paths for mscc PHY driver, from Antoine Tenart. 10) Don't register stmmac notifier multiple times, from Aaro Koskinen. 11) Various rmnet bug fixes, from Taehee Yoo. 12) Fix vsock deadlock in vsock transport release, from Stefano Garzarella. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits) net: dsa: mv88e6xxx: Fix masking of egress port mlxsw: pci: Wait longer before accessing the device after reset sfc: fix timestamp reconstruction at 16-bit rollover points vsock: fix potential deadlock in transport->release() unix: It's CONFIG_PROC_FS not CONFIG_PROCFS net: rmnet: fix packet forwarding in rmnet bridge mode net: rmnet: fix bridge mode bugs net: rmnet: use upper/lower device infrastructure net: rmnet: do not allow to change mux id if mux id is duplicated net: rmnet: remove rcu_read_lock in rmnet_force_unassociate_device() net: rmnet: fix suspicious RCU usage net: rmnet: fix NULL pointer dereference in rmnet_changelink() net: rmnet: fix NULL pointer dereference in rmnet_newlink() net: phy: marvell: don't interpret PHY status unless resolved mlx5: register lag notifier for init network namespace only unix: define and set show_fdinfo only if procfs is enabled hinic: fix a bug of rss configuration hinic: fix a bug of setting hw_ioctxt hinic: fix a irq affinity bug net/smc: check for valid ib_client_data ...
2020-02-28bpf: Replace zero-length array with flexible-array memberGustavo A. R. Silva6-6/+6
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200227001744.GA3317@embeddedor
2020-02-27net: dsa: mv88e6xxx: Fix masking of egress portAndrew Lunn1-2/+2
Add missing ~ to the usage of the mask. Reported-by: Kevin Benson <Kevin.Benson@zii.aero> Reported-by: Chris Healy <Chris.Healy@zii.aero> Fixes: 5c74c54ce6ff ("net: dsa: mv88e6xxx: Split monitor port configuration") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27mlxsw: pci: Wait longer before accessing the device after resetAmit Cohen1-1/+1
During initialization the driver issues a reset to the device and waits for 100ms before checking if the firmware is ready. The waiting is necessary because before that the device is irresponsive and the first read can result in a completion timeout. While 100ms is sufficient for Spectrum-1 and Spectrum-2, it is insufficient for Spectrum-3. Fix this by increasing the timeout to 200ms. Fixes: da382875c616 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC") Signed-off-by: Amit Cohen <amitc@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27mlxsw: reg: Update module_type values in PMTM register and map them to widthJiri Pirko2-8/+31
There are couple new values that PMTM register can return in module_type field. Add them and map them to module width in mlxsw_core_module_max_width(). Fix the existing names on the way. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27WAN: Replace zero-length array with flexible-array memberGustavo A. R. Silva2-2/+2
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27NFC: Replace zero-length array with flexible-array memberGustavo A. R. Silva4-11/+11
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27sfc: fix timestamp reconstruction at 16-bit rollover pointsAlex Maftei (amaftei)1-3/+35
We can't just use the top bits of the last sync event as they could be off-by-one every 65,536 seconds, giving an error in reconstruction of 65,536 seconds. This patch uses the difference in the bottom 16 bits (mod 2^16) to calculate an offset that needs to be applied to the last sync event to get to the current time. Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com> Acked-by: Martin Habets <mhabets@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27vsock: fix potential deadlock in transport->release()Stefano Garzarella3-13/+12
Some transports (hyperv, virtio) acquire the sock lock during the .release() callback. In the vsock_stream_connect() we call vsock_assign_transport(); if the socket was previously assigned to another transport, the vsk->transport->release() is called, but the sock lock is already held in the vsock_stream_connect(), causing a deadlock reported by syzbot: INFO: task syz-executor280:9768 blocked for more than 143 seconds. Not tainted 5.6.0-rc1-syzkaller #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. syz-executor280 D27912 9768 9766 0x00000000 Call Trace: context_switch kernel/sched/core.c:3386 [inline] __schedule+0x934/0x1f90 kernel/sched/core.c:4082 schedule+0xdc/0x2b0 kernel/sched/core.c:4156 __lock_sock+0x165/0x290 net/core/sock.c:2413 lock_sock_nested+0xfe/0x120 net/core/sock.c:2938 virtio_transport_release+0xc4/0xd60 net/vmw_vsock/virtio_transport_common.c:832 vsock_assign_transport+0xf3/0x3b0 net/vmw_vsock/af_vsock.c:454 vsock_stream_connect+0x2b3/0xc70 net/vmw_vsock/af_vsock.c:1288 __sys_connect_file+0x161/0x1c0 net/socket.c:1857 __sys_connect+0x174/0x1b0 net/socket.c:1874 __do_sys_connect net/socket.c:1885 [inline] __se_sys_connect net/socket.c:1882 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1882 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe To avoid this issue, this patch remove the lock acquiring in the .release() callback of hyperv and virtio transports, and it holds the lock when we call vsk->transport->release() in the vsock core. Reported-by: syzbot+731710996d79d0d58fbc@syzkaller.appspotmail.com Fixes: 408624af4c89 ("vsock: use local transport when it is loaded") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27Merge branch 'rework-phylink-interface-for-split-MAC-PCS-support'David S. Miller23-180/+358
Russell King says: ==================== rework phylink interface for split MAC/PCS support The following series changes the phylink interface to allow us to better support split MAC / MAC PCS setups. The fundamental change required for this turns out to be quite simple. Today, mac_config() is used for everything to do with setting the parameters for the MAC, and mac_link_up() is used to inform the MAC driver that the link is now up (and so to allow packet flow.) mac_config() also has had a few implementation issues, with folk who believe that members such as "speed" and "duplex" are always valid, where "link" gets used inappropriately, etc. With the proposed patches, all this changes subtly - but in a backwards compatible way at this stage. We pass the the full resolved link state (speed, duplex, pause) to mac_link_up(), and it is now guaranteed that these parameters to this function will always be valid (no more SPEED_UNKNOWN or DUPLEX_UNKNOWN here - unless phylink is fed with such things.) Drivers should convert over to using the state in mac_link_up() rather than configuring the speed, duplex and pause in the mac_config() method. The patch series includes a number of MAC drivers which I've thought have been easy targets - I've left the remainder as I think they need maintainer input. However, *all* drivers will need conversion for future phylink development. v2: add ocelot/felix and qca/ar9331 DSA drivers to patch 2, add received tested-by so far. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27net: mvpp2: use resolved link config in mac_link_up()Russell King1-36/+47
Convert the Marvell mvpp2 ethernet driver to use the finalised link parameters in mac_link_up() rather than the parameters in mac_config(). Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27net: mvneta: use resolved link config in mac_link_up()Russell King1-17/+38
Convert the Marvell mvneta ethernet driver to use the finalised link parameters in mac_link_up() rather than the parameters in mac_config(). Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27net: macb: use resolved link config in mac_link_up()Russell King2-22/+29
Convert the macb ethernet driver to use the finalised link parameters in mac_link_up() rather than the parameters in mac_config(). Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27net: dpaa2-mac: use resolved link config in mac_link_up()Russell King2-22/+33
Convert the DPAA2 ethernet driver to use the finalised link parameters in mac_link_up() rather than the parameters in mac_config(), which are more suited to the needs of the DPAA2 MC firmware than those available via mac_config(). Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27net: axienet: use resolved link config in mac_link_up()Russell King1-19/+19
Convert the Xilinx AXI ethernet driver to use the finalised link parameters in mac_link_up() rather than the parameters in mac_config(). Tested-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>