Age  Commit message  Author  Files  Lines
2020-04-30rionet: Fix use correct return type for ndo_start_xmit()Yunjian Wang1-1/+2
The method ndo_start_xmit() returns a value of type netdev_tx_t. Fix the ndo function to use the correct type. Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30net: caif: Fix use correct return type for ndo_start_xmit()Yunjian Wang1-1/+2
The method ndo_start_xmit() returns a value of type netdev_tx_t. Fix the ndo function to use the correct type. Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
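A minimal user-space sketch of what the two return-type fixes above amount to. The struct definitions are reduced stand-ins written for this example, the enum values are assumed to mirror the kernel header, and demo_start_xmit is a hypothetical driver routine. The point is only that the transmit hook is declared with netdev_tx_t so it matches the function pointer type in the ops table.

```c
#include <stddef.h>

/* Reduced stand-ins for kernel types, declared only so the sketch compiles. */
struct sk_buff { size_t len; };
struct net_device { unsigned long tx_packets; };

/* Assumed to mirror the kernel enum: OK is 0x00, BUSY is 0x10. */
typedef enum netdev_tx {
	NETDEV_TX_OK   = 0x00,
	NETDEV_TX_BUSY = 0x10,
} netdev_tx_t;

/* Reduced ops table: the hook is typed to return netdev_tx_t, not int. */
struct net_device_ops {
	netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb,
				      struct net_device *dev);
};

/* Hypothetical driver transmit routine using the matching return type. */
static netdev_tx_t demo_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
	if (skb->len == 0)
		return NETDEV_TX_BUSY;	/* arbitrary condition for the demo */

	dev->tx_packets++;
	return NETDEV_TX_OK;
}

static const struct net_device_ops demo_netdev_ops = {
	.ndo_start_xmit = demo_start_xmit,
};
```

With the function declared as int instead, the initializer above would assign a pointer of a mismatched function type, which is the mismatch these patches remove.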
2020-04-30Merge branch 'net-phy-mdio-add-IPQ40xx-MDIO-support'David S. Miller5-0/+257
Robert Marko says: ==================== net: phy: mdio: add IPQ40xx MDIO support This patch series provides support for the IPQ40xx built-in MDIO interface. Included are driver, devicetree bindings for it and devicetree node. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30ARM: dts: qcom: ipq4019: add MDIO nodeRobert Marko1-0/+28
This patch adds the necessary MDIO interface node to the Qualcomm IPQ4019 DTSI. Built-in QCA8337N switch is managed using it, and since we have a driver for it let's add it. Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Robert Marko <robert.marko@sartura.hr> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Cc: Luka Perkov <luka.perkov@sartura.hr> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30dt-bindings: add Qualcomm IPQ4019 MDIO bindingsRobert Marko1-0/+61
This patch adds the binding document for the IPQ40xx MDIO driver. Signed-off-by: Robert Marko <robert.marko@sartura.hr> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Cc: Luka Perkov <luka.perkov@sartura.hr> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30net: phy: mdio: add IPQ4019 MDIO driverRobert Marko3-0/+168
This patch adds the driver for the MDIO interface inside of Qualcomm IPQ40xx series SoC-s. Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Robert Marko <robert.marko@sartura.hr> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Cc: Luka Perkov <luka.perkov@sartura.hr> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30hinic: Use ARRAY_SIZE for nic_vf_cmd_msg_handlerZou Wei1-5/+3
fix coccinelle warning, use ARRAY_SIZE drivers/net/ethernet/huawei/hinic/hinic_sriov.c:713:43-44: WARNING: Use ARRAY_SIZE v1-->v2: remove cmd_number v2-->v3: preserve the reverse christmas tree ordering of local variables Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
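The ARRAY_SIZE pattern behind this warning can be sketched in user space as follows. The macro here is a simplified equivalent of the kernel one (which adds a compile-time type check), and the handler table, its entries, and demo_dispatch are hypothetical names chosen for the example, not the hinic code.

```c
#include <stddef.h>

/* Simplified user-space equivalent of the kernel macro. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

typedef int (*demo_handler_fn)(int arg);

struct demo_cmd_entry {
	unsigned int cmd;
	demo_handler_fn handler;
};

static int demo_handle_inc(int arg) { return arg + 1; }
static int demo_handle_dbl(int arg) { return arg * 2; }

/* Hypothetical command table; its length is never stored separately. */
static const struct demo_cmd_entry demo_handlers[] = {
	{ 1, demo_handle_inc },
	{ 2, demo_handle_dbl },
};

/* Look up cmd in the table; returns 0 and fills *out, or -1 if unknown. */
static int demo_dispatch(unsigned int cmd, int arg, int *out)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(demo_handlers); i++) {
		if (demo_handlers[i].cmd == cmd) {
			*out = demo_handlers[i].handler(arg);
			return 0;
		}
	}
	return -1;
}
```

Because the loop bound is derived from the array itself, adding a table entry cannot leave a stale element count behind.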
2020-04-30hinic: make a bunch of functions staticYueHaibing1-43/+48
These functions are used only in hinic_sriov.c, so make them static to fix sparse warnings. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30net/mlx5e: Unify reserving space for WQEsMaxim Mikityanskiy6-69/+88
In our fast-path design, a WQE (Work Queue Element) must not cross the page boundary. To enforce that, for WQEs consisting of more than one BB (Basic Block), the driver checks the available contiguous space in the WQ in advance, and if it's not enough, it pads it with NOPs. This patch modifies the code that calculates the position of next WQE, considering the padding, and prepares the WQE. This code is common for all SQ types. In this patch it's reorganized in a way that makes the usage pattern unified for all SQ types, and makes the implementations self-contained and look almost the same, preparing the repeating code to further attempts to deduplicate it. One place is left as is: mlx5e_sq_xmit and mlx5e_fill_sq_frag_edge call inside, because it is special in a way that it may also copy WQE's cseg and eseg when reserving space. This will be eliminated in one of the following patches, and this place will be converted to the new approach, too. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
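The reserve-with-padding rule described above can be sketched as a small model. Everything here is an assumption made for illustration: the queue size, the number of BBs per page, the struct, and the function names are hypothetical and do not reproduce the mlx5e helpers. The model only counts NOP BBs rather than writing real NOP WQEs.

```c
#include <stdint.h>

#define DEMO_WQ_SIZE_BBS  64u	/* hypothetical queue size, power of two */
#define DEMO_BBS_PER_PAGE 16u	/* hypothetical BBs per page */

struct demo_sq {
	uint16_t pc;		/* producer counter, in BBs */
	unsigned int nop_bbs;	/* how many BBs were filled with NOPs */
};

/* Contiguous BBs left before the next page boundary, starting at pi. */
static uint16_t demo_contig_bbs(uint16_t pi)
{
	return (uint16_t)(DEMO_BBS_PER_PAGE - (pi % DEMO_BBS_PER_PAGE));
}

/*
 * Reserve num_bbs contiguous BBs (num_bbs <= DEMO_BBS_PER_PAGE).
 * If the WQE would cross a page boundary, pad up to the boundary
 * with NOPs first. Returns the producer index of the reserved WQE.
 */
static uint16_t demo_reserve_wqe(struct demo_sq *sq, uint16_t num_bbs)
{
	uint16_t pi = sq->pc & (DEMO_WQ_SIZE_BBS - 1);
	uint16_t contig = demo_contig_bbs(pi);

	if (contig < num_bbs) {
		sq->nop_bbs += contig;	/* one NOP per remaining BB */
		sq->pc += contig;
		pi = sq->pc & (DEMO_WQ_SIZE_BBS - 1);
	}

	sq->pc += num_bbs;
	return pi;
}
```

For example, with the producer at BB 14 and a 4-BB WQE, only 2 BBs remain in the page, so two NOPs are posted and the WQE starts at BB 16.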
2020-04-30net/mlx5e: Rename ICOSQ WQE info struct and fieldMaxim Mikityanskiy4-15/+15
Structs mlx5e_txqsq and mlx5e_xdpsq contain wqe_info arrays to store supplementary information corresponding to WQEs in the queue. Struct mlx5e_icosq also has such an array, but it's called differently - ico_wqe. This patch renames it to unify with the other SQs. In addition, rename the struct to emphasize its specific usage. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30net/mlx5e: Fetch WQE: reuse code and enforce typingMaxim Mikityanskiy8-40/+38
There are multiple functions mlx5{e,i}_*_fetch_wqe that contain the same code, that is repeated, because they operate on different SQ struct types. mlx5e_sq_fetch_wqe also returns void *, instead of the concrete WQE type. This commit generalizes the fetch WQE operation by putting this code into a single function. To simplify calls of the generic function in concrete use cases, macros are provided that substitute the right WQE size and cast the return type. Before this patch, fetch_wqe used to calculate pi itself, but the value was often known to the caller. This calculation is moved outside to eliminate this unnecessary step and prepare for the fill_frag_edge refactoring in the next patch. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
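One way to picture the generic fetch plus typed macro described above is the sketch below. The buffer layout, the BB size, the WQE struct, and all demo_ names are hypothetical and chosen for this example; the zeroing of the fetched WQE is likewise an assumption of the sketch. As in the commit message, the producer index pi is computed by the caller and passed in.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BB_SIZE 64u	/* hypothetical basic block size in bytes */
#define DEMO_WQ_BBS  8u		/* hypothetical queue length in BBs */

struct demo_wq {
	uint8_t buf[DEMO_BB_SIZE * DEMO_WQ_BBS];
};

/* Hypothetical concrete WQE type, one BB long. */
struct demo_tx_wqe {
	uint8_t opcode;
	uint8_t pad[DEMO_BB_SIZE - 1];
};

/* Generic fetch: the caller supplies pi and the WQE size. */
static inline void *demo_fetch_wqe(struct demo_wq *wq, uint16_t pi,
				   size_t wqe_size)
{
	void *wqe = wq->buf + (size_t)pi * DEMO_BB_SIZE;

	memset(wqe, 0, wqe_size);	/* hand out a zeroed WQE */
	return wqe;
}

/* Typed wrapper: substitutes the size and casts the result. */
#define DEMO_TX_FETCH_WQE(wq, pi) \
	((struct demo_tx_wqe *)demo_fetch_wqe((wq), (pi), \
					      sizeof(struct demo_tx_wqe)))
```

Callers then get a pointer of the concrete WQE type instead of void *, so passing it to code that expects a different WQE layout is flagged by the compiler.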
2020-04-30net/mlx5e: XDP, Print the offending TX descriptor on error completionTariq Toukan1-4/+3
Upon an error completion on an XDP SQ, print the offending WQE to ease the debug process. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30net/mlx5e: TX, Generalise code and usage of error CQE dumpTariq Toukan4-22/+27
Error CQE was dumped only for TXQ SQs. Generalise the function, and add usage for error completions on ICO SQs and XDP SQs. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30net/mlx5e: Use proper name field for the UMR keyTariq Toukan1-1/+1
Even though some of the WQE control segment's fields share the same memory bits (a union of fields), prefer having the right field name for every different usage. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30net/mlx5: Add support for release all pages eventEran Ben Elisha2-3/+41
If FW sets release_all_pages bit in MLX5_EVENT_TYPE_PAGE_REQUEST, driver shall release all pages of a given function id, with no further pages reclaim negotiation with FW nor MANAGE_PAGES commands from driver towards FW. Upon receiving this bit as part of pages reclaim event, driver will initiate release all flow, in which it will iterate and release all function's pages. As part of driver <-> FW capabilities handshake, FW will report release_all_pages max HCA cap bit, and driver will set the release_all_pages bit in HCA cap. NIC: ConnectX-4 Lx CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz Test case: Simultaneously FLR 4 VFs, and measure FW release pages by driver. Before: 3.18 Sec After: 0.31 Sec Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30net/mlx5: Rate limit page not found error messagesEran Ben Elisha1-1/+1
Thousands of pages are released with free_addr() function. In case of buggy sync between FW and driver on released address, the log will be flooded with error messages. Use mlx5_core_warn_rl() to limit it. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30net/mlx5: Add helper function to release fw pageEran Ben Elisha1-13/+17
Factor out the fwp address page release into a helper function, which will be used in a downstream patch. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30net/mlx5: Remove unused field in EQTariq Toukan1-1/+0
The size field in EQ is not in use. Remove it. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Tal Gilboa <talgi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30net/mlx5: CT: Remove unused variablesPaul Blakey1-3/+0
Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30net/mlx5e: CT: Avoid false warning about rule may be used uninitializedRoi Dayan1-1/+1
Avoid the gcc warning by presetting rule to an invalid pointer. Fixes: 4c3844d9e97e ("net/mlx5e: CT: Introduce connection tracking") Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30net/mlx5e: Remove unneeded semicolonZheng Bin1-1/+1
Fixes coccicheck warning: drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c:690:2-3: Unneeded semicolon Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30net/mlx5e: Use helper API to get devlink port index for all port flavoursParav Pandit1-8/+4
Use existing helper API to get unique devlink port index for all devlink port flavours. Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30net/mlx5: IPsec, Fix coverity issueRaed Salem1-2/+2
The cited commit introduced the following coverity issue at functions mlx5_fpga_is_ipsec_device() and mlx5_fpga_ipsec_release_sa_ctx(): - bit_and_with_zero: accel_xfrm->attrs.action & MLX5_ACCEL_ESP_ACTION_DECRYPT is always 0. As MLX5_ACCEL_ESP_ACTION_DECRYPT is not a bitwise flag and was wrongly used with bitwise operation, the above expression is always zero value as MLX5_ACCEL_ESP_ACTION_DECRYPT is zero. Fix by using "==" comparison operator instead. Fixes: 7dfee4b1d79e ("net/mlx5: IPsec, Refactor SA handle creation and destruction") Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
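The bit_and_with_zero problem is easy to reproduce in isolation. In the sketch below the enum and function names are hypothetical stand-ins; the only property taken from the commit message is that the decrypt action is a plain enumerator with value zero rather than a bit flag.

```c
#include <stdbool.h>

/* Plain enumeration, not bit flags: DECRYPT has the value 0. */
enum demo_esp_action {
	DEMO_ESP_ACTION_DECRYPT,
	DEMO_ESP_ACTION_ENCRYPT,
};

/* Buggy form: x & 0 is 0 for every x, so this never reports decrypt. */
static bool demo_is_decrypt_buggy(enum demo_esp_action action)
{
	return (action & DEMO_ESP_ACTION_DECRYPT) != 0;
}

/* Fixed form: compare the enumerator values directly. */
static bool demo_is_decrypt_fixed(enum demo_esp_action action)
{
	return action == DEMO_ESP_ACTION_DECRYPT;
}
```

The buggy helper returns false for both actions, while the fixed one distinguishes them.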
2020-04-30Merge branch 'mlx5-next' of ↵Saeed Mahameed72-1393/+1144
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux mlx5 updates for both net-next and rdma-next: 1) HW bits and definitions for TLS and IPsec offloads 2) Release all pages capability bits 3) New command interface helpers and some code cleanup as a result 4) Move qp.c out of mlx5 core driver into mlx5_ib rdma driver Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30selftests/bpf: Test allowed maps for bpf_sk_select_reuseportJakub Sitnicki2-1/+56
Check that verifier allows passing a map of type: BPF_MAP_TYPE_REUSEPORT_SOCKARRAY, or BPF_MAP_TYPE_SOCKMAP, or BPF_MAP_TYPE_SOCKHASH ... to bpf_sk_select_reuseport helper. Suggested-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200430104738.494180-1-jakub@cloudflare.com
2020-04-30libbpf: Fix false uninitialized variable warningAndrii Nakryiko1-1/+1
Some versions of GCC falsely detect that vi might not be initialized. That's not true, but let's silence it with NULL initialization. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200430021436.1522502-1-andriin@fb.com
2020-04-30bpf, riscv: Fix stack layout of JITed code on RV32Luke Nelson1-33/+65
This patch fixes issues with stackframe unwinding and alignment in the current stack layout for BPF programs on RV32. In the current layout, RV32 fp points to the JIT scratch registers, rather than to the callee-saved registers. This breaks stackframe unwinding, which expects fp to point just above the saved ra and fp registers. This patch fixes the issue by moving the callee-saved registers to be stored on the top of the stack, pointed to by fp. This satisfies the assumptions of stackframe unwinding. This patch also fixes an issue with the old layout that the stack was not aligned to 16 bytes. Stacktrace from JITed code using the old stack layout: [ 12.196249 ] [<c0402200>] walk_stackframe+0x0/0x96 Stacktrace using the new stack layout: [ 13.062888 ] [<c0402200>] walk_stackframe+0x0/0x96 [ 13.063028 ] [<c04023c6>] show_stack+0x28/0x32 [ 13.063253 ] [<a403e778>] bpf_prog_82b916b2dfa00464+0x80/0x908 [ 13.063417 ] [<c09270b2>] bpf_test_run+0x124/0x39a [ 13.063553 ] [<c09276c0>] bpf_prog_test_run_skb+0x234/0x448 [ 13.063704 ] [<c048510e>] __do_sys_bpf+0x766/0x13b4 [ 13.063840 ] [<c0485d82>] sys_bpf+0xc/0x14 [ 13.063961 ] [<c04010f0>] ret_from_syscall+0x0/0x2 The new code is also simpler to understand and includes an ASCII diagram of the stack layout. Tested on riscv32 QEMU virt machine. Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Xi Wang <xi.wang@gmail.com> Link: https://lore.kernel.org/bpf/20200430005127.2205-1-luke.r.nels@gmail.com
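The 16-byte alignment requirement mentioned above comes down to rounding the frame size up to a multiple of 16. The sketch below shows only that arithmetic; the three-part split of the frame and every name in it are assumptions made for illustration, not the layout computed by the RV32 JIT.

```c
#include <stdint.h>

#define DEMO_STACK_ALIGN 16u	/* alignment named in the commit message */

/* Round x up to the next multiple of a, where a is a power of two. */
static uint32_t demo_round_up(uint32_t x, uint32_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical frame size: callee-saved registers at the top
 * (pointed to by fp), then JIT scratch space, then the BPF stack.
 */
static uint32_t demo_frame_size(uint32_t callee_saved_bytes,
				uint32_t scratch_bytes,
				uint32_t bpf_stack_bytes)
{
	return demo_round_up(callee_saved_bytes + scratch_bytes +
			     bpf_stack_bytes, DEMO_STACK_ALIGN);
}
```

A frame that needs 68 bytes is therefore allocated as 80, and one that already needs 128 stays at 128.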
2020-04-30Merge branch 'net-bcmgenet-add-support-for-Wake-on-Filter'David S. Miller3-76/+708
Doug Berger says: ==================== net: bcmgenet: add support for Wake on Filter Changes in v2: Corrected Signed-off-by for commit 3/7. This commit set adds support for waking from 'standby' using a Rx Network Flow Classification filter specified with ethtool. The first two commits are bug fixes that should be applied to the stable branches, but are included in this patch set to reduce merge conflicts that might occur if not applied before the other commits in this set. The next commit consolidates WoL clock management as a part of the overall WoL configuration. The next commit restores a set of functions that were removed from the driver just prior to the 4.9 kernel release. The following commit relocates the functions in the file to prevent the need for additional forward declarations. Next, support for the Rx Network Flow Classification interface of ethtool is added. Finally, support for the WAKE_FILTER wol method is added. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30net: bcmgenet: add WAKE_FILTER supportDoug Berger2-11/+37
This commit enables support for the WAKE_FILTER method of Wake on LAN for the GENET driver. The method can be enabled by adding 'f' to the interface 'wol' setting specified by ethtool. Rx network flow rules can be specified using ethtool. Rules that define a flow-type with the RX_CLS_FLOW_WAKE action (i.e. -2) can wake the system from the 'standby' power state when the WAKE_FILTER WoL method is enabled. Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30net: bcmgenet: add support for ethtool rxnfc flowsDoug Berger2-11/+488
This commit enables driver support for ethtool commands of this form: ethtool -N|-U|--config-nfc|--config-ntuple devname flow-type ether|ip4 [src xx:yy:zz:aa:bb:cc [m xx:yy:zz:aa:bb:cc]] [dst xx:yy:zz:aa:bb:cc [m xx:yy:zz:aa:bb:cc]] [proto N [m N]] [src-ip x.x.x.x [m x.x.x.x]] [dst-ip x.x.x.x [m x.x.x.x]] [tos N [m N]] [l4proto N [m N]] [src-port N [m N]] [dst-port N [m N]] [spi N [m N]] [l4data N [m N]] [vlan-etype N [m N]] [vlan N [m N]] [dst-mac xx:yy:zz:aa:bb:cc [m xx:yy:zz:aa:bb:cc]] [action 0] [loc N] | delete N Since there is only one Rx Ring in this implementation action 0 behaves no differently from not specifying a rule. The rules can be seen with ethtool commands of this form: ethtool -n|-u|--show-nfc|--show-ntuple devname [rule N] Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30net: bcmgenet: code movementDoug Berger1-154/+154
The Hardware Filter Block code will be used by ethtool functions when defining flow types so this commit moves the functions in the file to prevent the need for prototype declarations. This is broken out to facilitate review. Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30Revert "net: bcmgenet: remove unused function in bcmgenet.c"Doug Berger1-0/+122
This reverts commit e2072600a24161b7ddcfb26814f69f5fbc8ef85a. This commit restores the previous implementation of Hardware Filter Block functions to the file for use in subsequent commits. Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30net: bcmgenet: move clk_wol management to bcmgenet_wolDoug Berger3-16/+20
The GENET_POWER_WOL_MAGIC power up and power down code configures the device for WoL when suspending and disables the WoL logic when resuming. It makes sense that this code should also manage the WoL clocking. This commit consolidates the logic and moves it earlier in the resume sequence. Since the clock is now only enabled if WoL is successfully entered the wol_active flag is introduced to track that state to keep the clock enables and disables balanced in case a suspend is aborted. The MPD_EN hardware bit can't be used because it can be cleared when the MAC is reset by a deep sleep. Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
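The role of the wol_active flag can be modelled with a simple enable counter. This is a sketch under stated assumptions: the struct, the wol_entry_ok field that simulates whether WoL mode was entered, and the function names are hypothetical, and a plain integer stands in for the real clock enable and disable calls.

```c
#include <stdbool.h>

struct demo_priv {
	bool wol_active;	/* true only while the WoL clock is enabled */
	bool wol_entry_ok;	/* demo input: does entering WoL succeed? */
	int clk_wol_enables;	/* stand-in for the clock enable count */
};

/* Suspend path: enable the clock only if WoL was successfully entered. */
static int demo_wol_power_down(struct demo_priv *priv)
{
	if (!priv->wol_entry_ok)
		return -1;		/* aborted: clock left untouched */

	priv->clk_wol_enables++;
	priv->wol_active = true;
	return 0;
}

/* Resume path: disable the clock only if this driver enabled it. */
static void demo_wol_power_up(struct demo_priv *priv)
{
	if (!priv->wol_active)
		return;

	priv->clk_wol_enables--;
	priv->wol_active = false;
}
```

Tracking the state in software keeps enables and disables paired even when a suspend is aborted, which a hardware bit that may be cleared by a MAC reset cannot guarantee.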
2020-04-30net: bcmgenet: Fix WoL with password after deep sleepDoug Berger2-21/+20
Broadcom STB chips support a deep sleep mode where all register contents are lost. Because we were stashing the MagicPacket password into some of these registers a suspend into that deep sleep then a resumption would not lead to being able to wake-up from MagicPacket with password again. Fix this by keeping a software copy of the password and program it during suspend. Fixes: c51de7f3976b ("net: bcmgenet: add Wake-on-LAN support code") Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30net: bcmgenet: set Rx mode before starting netifDoug Berger1-0/+4
This commit explicitly calls the bcmgenet_set_rx_mode() function when the network interface is started. This function is normally called by ndo_set_rx_mode when the flags are changed, but apparently not when the driver is suspended and resumed. This change ensures that address filtering or promiscuous mode are properly restored by the driver after the MAC may have been reset. Fixes: b6e978e50444 ("net: bcmgenet: add suspend/resume callbacks") Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-30bpf: Fix unused variable warningArnd Bergmann1-1/+1
Hiding the only use of bpf_link_type_strs[] in an #ifdef causes an unused-variable warning: kernel/bpf/syscall.c:2280:20: error: 'bpf_link_type_strs' defined but not used [-Werror=unused-variable] 2280 | static const char *bpf_link_type_strs[] = { Move the definition into the same #ifdef. Fixes: f2e10bff16a0 ("bpf: Add support for BPF_OBJ_GET_INFO_BY_FD for bpf_link") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200429132217.1294289-1-arnd@arndb.de
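The shape of this fix can be sketched in user space. DEMO_CONFIG_PROC_FS, the table contents, and the helper name are hypothetical; the idea illustrated is only that a static table and its single user live under the same conditional, so that disabling the option removes both and leaves nothing unused.

```c
#include <stddef.h>

/* Hypothetical config switch; remove this line to compile both out. */
#define DEMO_CONFIG_PROC_FS 1

#ifdef DEMO_CONFIG_PROC_FS
/* The table sits inside the same #ifdef as its only user. */
static const char *demo_link_type_strs[] = {
	"unspec",
	"raw_tracepoint",
	"tracing",
};

/* Only user of the table; returns "unknown" for out-of-range types. */
static const char *demo_link_type_name(size_t type)
{
	size_t n = sizeof(demo_link_type_strs) /
		   sizeof(demo_link_type_strs[0]);

	return type < n ? demo_link_type_strs[type] : "unknown";
}
#endif
```

Had the table been defined outside the #ifdef, a build without the option would keep the array but drop its only reader, which is what triggers the unused-variable warning.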
2020-04-30selftests/bpf: Use SOCKMAP for server sockets in bpf_sk_assign testJakub Sitnicki3-52/+53
Update bpf_sk_assign test to fetch the server socket from SOCKMAP, now that map lookup from BPF in SOCKMAP is enabled. This way the test TC BPF program doesn't need to know what address server socket is bound to. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200429181154.479310-4-jakub@cloudflare.com
2020-04-30selftests/bpf: Test that lookup on SOCKMAP/SOCKHASH is allowedJakub Sitnicki2-30/+70
Now that bpf_map_lookup_elem() is white-listed for SOCKMAP/SOCKHASH, replace the tests which check that verifier prevents lookup on these map types with ones that ensure that lookup operation is permitted, but only with a release of acquired socket reference. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200429181154.479310-3-jakub@cloudflare.com
2020-04-30bpf: Allow bpf_map_lookup_elem for SOCKMAP and SOCKHASHJakub Sitnicki3-12/+55
White-list map lookup for SOCKMAP/SOCKHASH from BPF. Lookup returns a pointer to a full socket and acquires a reference if necessary. To support it we need to extend the verifier to know that: (1) register storing the lookup result holds a pointer to socket, if lookup was done on SOCKMAP/SOCKHASH, and that (2) map lookup on SOCKMAP/SOCKHASH is a reference acquiring operation, which needs a corresponding reference release with bpf_sk_release. On sock_map side, lookup handlers exposed via bpf_map_ops now bump sk_refcnt if socket is reference counted. In turn, bpf_sk_select_reuseport, the only in-kernel user of SOCKMAP/SOCKHASH ops->map_lookup_elem, was updated to release the reference. Sockets fetched from a map can be used in the same way as ones returned by BPF socket lookup helpers, such as bpf_sk_lookup_tcp. In particular, they can be used with bpf_sk_assign to direct packets toward a socket on TC ingress path. Suggested-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200429181154.479310-2-jakub@cloudflare.com
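The acquire and release pairing described above can be modelled in plain C. This is a user-space model only, not BPF code: the socket and map structs, the slot count, and the demo_ functions are hypothetical, with demo_sk_release standing in for the role that bpf_sk_release plays in a real program.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAP_SLOTS 4u	/* hypothetical map size */

struct demo_sock {
	int refcnt;
	bool refcounted;	/* some sockets are not reference counted */
};

struct demo_sockmap {
	struct demo_sock *slots[DEMO_MAP_SLOTS];
};

/* Lookup acquires a reference when the socket is reference counted. */
static struct demo_sock *demo_map_lookup(struct demo_sockmap *map, size_t key)
{
	struct demo_sock *sk;

	if (key >= DEMO_MAP_SLOTS)
		return NULL;

	sk = map->slots[key];
	if (sk && sk->refcounted)
		sk->refcnt++;
	return sk;
}

/* Every successful lookup must be paired with a release. */
static void demo_sk_release(struct demo_sock *sk)
{
	if (sk->refcounted)
		sk->refcnt--;
}
```

In the real verifier the pairing is enforced statically: a program that returns without releasing a socket obtained from the lookup is rejected at load time.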
2020-04-30tools: bpftool: Make libcap dependency optionalQuentin Monnet3-5/+38
The new libcap dependency is not used for an essential feature of bpftool, and we could imagine building the tool without checks on CAP_SYS_ADMIN by disabling probing features as an unprivileged user. Make it so, in order to avoid a hard dependency on libcap, and to ease packaging/embedding of bpftool. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200429144506.8999-4-quentin@isovalent.com
2020-04-30tools: bpftool: Allow unprivileged users to probe featuresQuentin Monnet4-16/+100
There is demand for a way to identify what BPF helper functions are available to unprivileged users. To do so, allow unprivileged users to run "bpftool feature probe" to list BPF-related features. This will only show features accessible to those users, and may not reflect the full list of features available (to administrators) on the system. To avoid the case where bpftool is inadvertently run as non-root and would list only a subset of the features supported by the system when it would be expected to list all of them, running as unprivileged is gated behind the "unprivileged" keyword passed to the command line. When used by a privileged user, this keyword allows to drop the CAP_SYS_ADMIN and to list the features available to unprivileged users. Note that this adds a dependency on libcap for compiling bpftool. Note that there is no particular reason why the probes were restricted to root, other than the fact I did not need them for unprivileged and did not bother with the additional checks at the time probes were added. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200429144506.8999-3-quentin@isovalent.com
2020-04-30tools: bpftool: For "feature probe" define "full_mode" bool as globalQuentin Monnet1-8/+7
The "full_mode" variable used for switching between full or partial feature probing (i.e. with or without probing helpers that will log warnings in kernel logs) was piped from the main do_probe() function down to probe_helpers_for_progtype(), where it is needed. Define it as a global variable: the calls will be more readable, and if other similar flags were to be used in the future, we could use global variables as well instead of extending again the list of arguments with new flags. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200429144506.8999-2-quentin@isovalent.com
2020-04-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller7-42/+129
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for nf-next: 1) Add IPS_HW_OFFLOAD status bit, from Bodong Wang. 2) Remove 128-bit limit on the set element data area, raise it to 64 bytes. 3) Report EOPNOTSUPP for unsupported NAT types and flags. 4) Set up nft_nat flags from the control plane path. 5) Add helper functions to set up the nf_nat_range2 structure. 6) Add netmap support for nft_nat. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-29Merge branch 'net-smc-preparations-for-SMC-R-link-failover'David S. Miller16-702/+1063
Karsten Graul says: ==================== net/smc: preparations for SMC-R link failover This patch series prepares the SMC code for the implementation of SMC-R link failover capabilities which are still missing to reach full compliance with RFC 7609. The code changes are separated into 65 patches which together form the new functionality. I tried to create meaningful patches which allow to follow the implementation. Question: how to handle the remaining 52 patches? All of them are needed for link failover to work and should make it into the same merge window. Can I send them all together? The SMC-R implementation will transparently make use of the link failover feature when matching RoCE devices are available, no special setup is required. All RoCE devices with the same PNET ID as the TCP device (hardware-defined or user-defined via the smc_pnet tool) are candidates to get used to form a link in a link group. When at least 2 RoCE devices are available on both communication endpoints then a symmetric link group is formed, meaning the link group has 2 independent links. If one RoCE device goes down then all connections on this link are moved to the surviving link. Upon recovery of the failing device or availability of a new one, the symmetric link group will be restored. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-29net/smc: move llc layer related init and clear into smc_llc.cKarsten Graul5-14/+31
Introduce smc_llc_lgr_init() and smc_llc_lgr_clear() to implement all llc layer specific initialization and cleanup in module smc_llc.c. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-29net/smc: use mutex instead of rwlock_t to protect buffersKarsten Graul2-13/+13
The locks for sndbufs and rmbs are never used from atomic context. Using a mutex for these locks allows nesting them with other mutexes. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-29net/smc: process llc responses in tasklet contextKarsten Graul2-108/+116
When llc responses are received, possible waiters for the response must be notified. This can be done in tasklet context, without using a work item in the llc work queue. Move all code that handles llc responses into smc_llc_rx_response(). Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-29net/smc: use worker to process incoming llc messagesKarsten Graul4-58/+96
Incoming llc messages are processed in irq tasklet context, and a worker is used to send outgoing messages. The worker is needed because getting a send buffer could result in a wait for a free buffer. To make sure all incoming llc messages are processed in a serialized way introduce an event queue and create a new queue entry for each message which is queued to this event queue. A new worker processes the event queue entries in order. And remove the use of a separate worker to send outgoing llc messages because the messages are processed in worker context already. With this event queue the serialized llc_wq work queue is obsolete, remove it. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
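The serialization scheme described above is a single FIFO drained by one worker. The sketch below models it with a fixed-size ring in user space; the queue capacity, the message struct, and all demo_ names are hypothetical, and the locking that real interrupt-context producers would need is left out on purpose.

```c
#include <stddef.h>

#define DEMO_QLEN 8u	/* hypothetical queue capacity */

struct demo_llc_msg {
	int type;
};

struct demo_event_queue {
	struct demo_llc_msg slots[DEMO_QLEN];
	size_t head;	/* next entry to process */
	size_t tail;	/* next free slot */
};

/* Receive path: queue one entry per incoming message, in arrival order. */
static int demo_enqueue(struct demo_event_queue *q, struct demo_llc_msg msg)
{
	if (q->tail - q->head == DEMO_QLEN)
		return -1;	/* queue full */

	q->slots[q->tail % DEMO_QLEN] = msg;
	q->tail++;
	return 0;
}

/*
 * Worker: drain the queue strictly in order. The message types are
 * recorded in order[] so that the processing sequence is observable.
 */
static size_t demo_event_work(struct demo_event_queue *q, int *order)
{
	size_t n = 0;

	while (q->head != q->tail) {
		order[n++] = q->slots[q->head % DEMO_QLEN].type;
		q->head++;
	}
	return n;
}
```

Since one worker consumes all entries, each message is fully handled before the next one starts, and any reply can be sent from that same worker context.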
2020-04-29net/smc: simplify link deactivationKarsten Graul3-18/+6
Cancel the testlink worker during link clear processing and remove the extra function smc_llc_link_inactive(). Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-29net/smc: move testlink work to system work queueKarsten Graul1-5/+6
The testlink work waits for a response to the testlink request and blocks the single threaded llc_wq. This type of work does not have to be serialized and can be moved to the system work queue. Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>