summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-05-22ice: remove unnecessary backslashBruce Allan1-1/+1
Self-explanatory. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22ice: remove unnecessary checkBruce Allan1-1/+1
The variable status cannot be zero due to a prior check of it; remove this check. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22ice: remove unnecessary expression that is always trueBruce Allan1-1/+2
The else conditional expression is always true due to the if conditional expression; remove it and add a comment to make it obvious still. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22ice: Fix check for removing/adding mac filtersLihong Yang1-5/+10
In function ice_set_mac_address, we will remove old dev_addr before adding the new MAC. In the removing and adding process of the MAC, there is no need to return error if the check finds the to-be-removed dev_addr does not exist in the MAC filter list or the to-be-added mac already exists, keep going or return success accordingly. Signed-off-by: Lihong Yang <lihong.yang@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22ice: refactor filter functionsMichal Swiatkowski8-264/+494
Move filter functions to separate file. Add functions that prepare suitable ice_fltr_info struct depending on the filter type and add this struct to earlier created list: - ice_fltr_add_mac_to_list - ice_fltr_add_vlan_to_list - ice_fltr_add_eth_to_list This functions are used in adding and removing filters. Create wrappers for functions mentioned above that alloc list, add suitable ice_fltr_info to it and call add or remove function. - ice_fltr_prepare_mac - ice_fltr_prepare_mac_and_broadcast - ice_fltr_prepare_vlan - ice_fltr_prepare_eth Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22ice: Fix resource leak on early exit from functionEric Joyner1-1/+3
Memory allocated in the ice_add_prof_id_vsig() function wasn't being properly freed if an error occurred inside the for-loop in the function. In particular, 'p' wasn't being freed if an error occurred before it was added to the resource list at the end of the for-loop. Signed-off-by: Eric Joyner <eric.joyner@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22ice: cleanup vf_id signednessJesse Brandeburg3-10/+11
The vf_id variable is dealt with in the code in inconsistent ways of sign usage, preventing compilation with -Werror=sign-compare. Fix this problem in the code by always treating vf_id as unsigned, since there are no valid values of vf_id that are negative. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22ice: Fix casting issuesKarol Kolacinski13-64/+74
Change min() macros to min_t() which has compare type specified and it helps avoid precision loss. In some cases there was precision loss during calls or assignments. Some fields in structs were unnecessarily large and gave multiple warnings. There were also some minor type differences which are now fixed as well as some cases where a simple cast was needed. Callers were were passing data that is a u16 to ice_sched_cfg_node_bw_alloc() but the function was truncating that to a u8. Fix that by changing the function to take a u16. Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22ice: Provide more meaningful error messageLihong Yang6-113/+247
When printing the ice status or AQ error codes, instead of printing out the numerical value, provide the description of the error code. This provides more info about the issue than a number. Signed-off-by: Lihong Yang <lihong.yang@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22ice: Fix probe/open race conditionAnirudh Venkataramanan1-10/+14
As soon as the driver registers the PF netdev, userspace utilities like NetworkManager try to bring up the associated interface. When this happens, the driver may not have finished initializing fully, resulting in a bunch of errors in the interface up flow. The driver already has a mechanism to indicate if it's not up yet; by setting the __ICE_DOWN bit in pf->state, but this bit gets cleared too early in the current flow. So clear this bit only when the driver is fully up. Also check for the same bit in the ice_open flow, and return -EBUSY if the bit is set. Also in ice_open, replace references of vsi->back with a local variable. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22ice: only drop link once when setting pauseparamsDave Ertman1-12/+0
Currently, the ice driver is setting a PHY configuration, which causes a link drop, and then additionally it calls for a nway_reset, which restarts auto-negotiation on the link, which also causes a link drop. These two link events in such close timing is causing the FW to not be able to generate a link interrupt for the driver to respond to. Remove the unnecessary auto-negotiation restart from the set pauseparams flow. Also remove error path that would have performed an ice_down/ice_up as that is also unnecessary. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22ice: Fix check for contiguous TCsDave Ertman1-7/+12
The current implementation for contiguous TC check is assuming that the UPs will be mapped to TCs in a linear progressing fashion. This is obviously not always true. Change the check to allow for various UP2TC mapping configurations. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22ice: Don't reset and rebuild for Tx timeout on PFC enabled queueAvinash JD5-0/+94
When there's a Tx timeout for a queue which belongs to a PFC enabled TC, then it's not because the queue is hung but because PFC is in action. In PFC, peer sends a pause frame for a specified period of time when its buffer threshold is exceeded (due to congestion). Netdev on the other hand checks if ACK is received within a specified time for a TX packet, if not, it'll invoke the tx_timeout routine. Signed-off-by: Avinash JD <avinash.dayanand@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22ice: Add VF promiscuous supportBrett Creeley4-2/+223
Implement promiscuous support for VF VSIs. Behaviour of promiscuous support is based on VF trust as well as the, introduced, vf-true-promisc flag. A trusted VF with vf-true-promisc disabled will be the default VSI, which means that all traffic without a matching destination MAC address in the device's internal switch will be forwarded to this VF VSI. A trusted VF with vf-true-promisc enabled will go into "true promiscuous mode". This amounts to the VF receiving all ingress and egress traffic that hits the device's internal switch. An untrusted VF will only receive traffic destined for that VF. The vf-true-promisc-support flag cannot be toggled while any VF is in promiscuous mode. This flag should be set prior to loading the iavf driver or spawning VF(s). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22ice: Add support for tunnel offloadsTony Nguyen14-14/+867
Create a boost TCAM entry for each tunnel port in order to get a tunnel PTYPE. Update netdev feature flags and implement the appropriate logic to get and set values for hardware offloads. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22ice: report netlist version in .info_getJacob Keller5-0/+158
The flash memory for the ice hardware contains a block of information used for link management called the Netlist module. As this essentially represents another section of firmware, add its version information to the output of the driver's .info_get handler. This includes both a version and the first few bytes of a hash of the module contents. fw.netlist -> the version information extracted from the netlist module fw.netlist.build-> first 4 bytes of the hash of the contents, similar to fw.mgmt.build Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22flow_dissector: Drop BPF flow dissector prog ref on netns cleanupJakub Sitnicki1-5/+21
When attaching a flow dissector program to a network namespace with bpf(BPF_PROG_ATTACH, ...) we grab a reference to bpf_prog. If netns gets destroyed while a flow dissector is still attached, and there are no other references to the prog, we leak the reference and the program remains loaded. Leak can be reproduced by running flow dissector tests from selftests/bpf: # bpftool prog list # ./test_flow_dissector.sh ... selftests: test_flow_dissector [PASS] # bpftool prog list 4: flow_dissector name _dissect tag e314084d332a5338 gpl loaded_at 2020-05-20T18:50:53+0200 uid 0 xlated 552B jited 355B memlock 4096B map_ids 3,4 btf_id 4 # Fix it by detaching the flow dissector program when netns is going away. Fixes: d58e468b1112 ("flow_dissector: implements flow dissector BPF hook") Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20200521083435.560256-1-jakub@cloudflare.com
2020-05-22Merge branch 'improve-branch_taken'Alexei Starovoitov4-3/+86
John Fastabend says: ==================== This series adds logic to the verifier to handle the case where a pointer is known to be non-null but then the verifier encountesr a instruction, such as 'if ptr == 0 goto X' or 'if ptr != 0 goto X', where the pointer is compared against 0. Because the verifier tracks if a pointer may be null in many cases we can improve the branch tracking by following the case known to be true. The first patch adds the verifier logic and patches 2-4 add the test cases. v1->v2: fix verifier logic to return -1 indicating both paths must still be walked if value is not zero. Move mark_precision skip for this case into caller of mark_precision to ensure mark_precision can still catch other misuses. And add PTR_TYPE_BTF_ID to our list of supported types. Finally add one more test to catch the value not equal zero case. Thanks to Andrii for original review. Also fixed up commit messages hopefully its better now. ==================== Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-05-22bpf: Selftests, add printk to test_sk_lookup_kern to encode null ptr checkJohn Fastabend1-0/+1
Adding a printk to test_sk_lookup_kern created the reported failure where a pointer type is checked twice for NULL. Lets add it to the progs test test_sk_lookup_kern.c so we test the case from C all the way into the verifier. We already have printk's in selftests so seems OK to add another one. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/159009170603.6313.1715279795045285176.stgit@john-Precision-5820-Tower
2020-05-22bpf: Selftests, verifier case for non null pointer map value branchJohn Fastabend1-0/+19
When we have pointer type that is known to be non-null we only follow the non-null branch. This adds tests to cover the map_value pointer returned from a map lookup. To force an error if both branches are followed we do an ALU op on R10. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/159009168650.6313.7434084136067263554.stgit@john-Precision-5820-Tower
2020-05-22bpf: Selftests, verifier case for non null pointer check branch takenJohn Fastabend1-0/+33
When we have pointer type that is known to be non-null and comparing against zero we only follow the non-null branch. This adds tests to cover this case for reference tracking. Also add the other case when comparison against a non-zero value and ensure we still fail with unreleased reference. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/159009166599.6313.1593680633787453767.stgit@john-Precision-5820-Tower
2020-05-22bpf: Verifier track null pointer branch_taken with JNE and JEQJohn Fastabend1-3/+33
Currently, when considering the branches that may be taken for a jump instruction if the register being compared is a pointer the verifier assumes both branches may be taken. But, if the jump instruction is comparing if a pointer is NULL we have this information in the verifier encoded in the reg->type so we can do better in these cases. Specifically, these two common cases can be handled. * If the instruction is BPF_JEQ and we are comparing against a zero value. This test is 'if ptr == 0 goto +X' then using the type information in reg->type we can decide if the ptr is not null. This allows us to avoid pushing both branches onto the stack and instead only use the != 0 case. For example PTR_TO_SOCK and PTR_TO_SOCK_OR_NULL encode the null pointer. Note if the type is PTR_TO_SOCK_OR_NULL we can not learn anything. And also if the value is non-zero we learn nothing because it could be any arbitrary value a different pointer for example * If the instruction is BPF_JNE and ware comparing against a zero value then a similar analysis as above can be done. The test in asm looks like 'if ptr != 0 goto +X'. Again using the type information if the non null type is set (from above PTR_TO_SOCK) we know the jump is taken. In this patch we extend is_branch_taken() to consider this extra information and to return only the branch that will be taken. This resolves a verifier issue reported with C code like the following. See progs/test_sk_lookup_kern.c in selftests. sk = bpf_sk_lookup_tcp(skb, tuple, tuple_len, BPF_F_CURRENT_NETNS, 0); bpf_printk("sk=%d\n", sk ? 1 : 0); if (sk) bpf_sk_release(sk); return sk ? TC_ACT_OK : TC_ACT_UNSPEC; In the above the bpf_printk() will resolve the pointer from PTR_TO_SOCK_OR_NULL to PTR_TO_SOCK. Then the second test guarding the release will cause the verifier to walk both paths resulting in the an unreleased sock reference. See verifier/ref_tracking.c in selftests for an assembly version of the above. After the above additional logic is added the C code above passes as expected. Reported-by: Andrey Ignatov <rdna@fb.com> Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/159009164651.6313.380418298578070501.stgit@john-Precision-5820-Tower
2020-05-22Merge branch 'af_xdp-common-alloc'Alexei Starovoitov47-1940/+1279
Björn Töpel says: ==================== Overview ======== Driver adoption for AF_XDP has been slow. The amount of code required to proper support AF_XDP is substantial and the driver/core APIs are vague or even non-existing. Drivers have to manually adjust data offsets, updating AF_XDP handles differently for different modes (aligned/unaligned). This series attempts to improve the situation by introducing an AF_XDP buffer allocation API. The implementation is based on a single core (single producer/consumer) buffer pool for the AF_XDP UMEM. A buffer is allocated using the xsk_buff_alloc() function, and returned using xsk_buff_free(). If a buffer is disassociated with the pool, e.g. when a buffer is passed to an AF_XDP socket, a buffer is said to be released. Currently, the release function is only used by the AF_XDP internals and not visible to the driver. Drivers using this API should register the XDP memory model with the new MEM_TYPE_XSK_BUFF_POOL type, which will supersede the MEM_TYPE_ZERO_COPY type. The buffer type is struct xdp_buff, and follows the lifetime of regular xdp_buffs, i.e. the lifetime of an xdp_buff is restricted to a NAPI context. In other words, the API is not replacing xdp_frames. DMA mapping/synching is folded into the buffer handling as well. @JeffK The Intel drivers changes should go through the bpf-next tree, and not your regular Intel tree, since multiple (non-Intel) drivers are affected. The outline of the series is as following: Patch 1 is a fix for xsk_umem_xdp_frame_sz(). Patch 2 to 4 are restructures/clean ups. The XSKMAP implementation is moved to net/xdp/. Functions/defines/enums that are only used by the AF_XDP internals are moved from the global include/net/xdp_sock.h to net/xdp/xsk.h. We are also introducing a new "driver include file", include/net/xdp_sock_drv.h, which is the only file NIC driver developers adding AF_XDP zero-copy support should care about. Patch 5 adds the new API, and migrates the "copy-mode"/skb-mode AF_XDP path to the new API. Patch 6 to 11 migrates the existing zero-copy drivers to the new API. Patch 12 removes the MEM_TYPE_ZERO_COPY memory type, and the "handle" member of struct xdp_buff. Patch 13 simplifies the xdp_return_{frame,frame_rx_napi,buff} functions. Patch 14 is a performance patch, where some functions are inlined. Finally, patch 15 updates the MAINTAINERS file to correctly mirror the new file layout. Note that this series removes the "handle" member from struct xdp_buff, which reduces the xdp_buff size. After this series, the diff stat of drivers/net/ is: 27 files changed, 419 insertions(+), 1288 deletions(-) This series is a first step of simplifying the driver side of AF_XDP. I think more of the AF_XDP logic can be moved from the drivers to the AF_XDP core, e.g. the "need wakeup" set/clear functionality. Statistics when allocation fails can now be added to the socket statistics via the XDP_STATISTICS getsockopt(). This will be added in a follow up series. Performance =========== As a nice side effect, performance is up a bit as well. * i40e: 3% higher pps for rxdrop, zero-copy, aligned and unaligned (40 GbE, 64B packets). * mlx5: RX +0.8 Mpps, TX +0.4 Mpps Changelog ========= v4->v5: * Fix various kdoc and GCC warnings (W=1). (Jakub) v3->v4: * mlx5: Remove unused variable num_xsk_frames. (Jakub) * i40e: Made i40e_fd_handle_status() static. (kbuild test robot) v2->v3: * Added xsk_umem_xdp_frame_sz() fix to the series. (Björn) * Initialize struct xdp_buff member frame_sz. (Björn) * Add API to query the DMA address of a frame. (Maxim) * Do DMA sync for CPU till the end of the frame to handle possible growth (frame_sz). (Maxim) * mlx5: Handle frame_sz, use xsk_buff_xdp_get_frame_dma, use xsk_buff API for DMA sync on TX, add performance numbers. (Maxim) v1->v2: * mlx5: Fix DMA address handling, set XDP metadata to invalid. (Maxim) * ixgbe: Fixed xdp_buff data_end update. (Björn) * Swapped SoBs in patch 4. (Maxim) rfc->v1: * Fixed build errors/warnings for m68k and riscv. (kbuild test robot) * Added headroom/chunk size getter. (Maxim/Björn) * mlx5: Put back the sanity check for XSK params, use XSK API to get the total headroom size. (Maxim) * Fixed spelling in commit message. (Björn) * Make sure xp_validate_desc() is inlined for Tx perf. (Maxim) * Sorted file entries. (Joe) * Added xdp_return_{frame,frame_rx_napi,buff} simplification (Björn) Thanks for all the comments/input/help! ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-05-22MAINTAINERS, xsk: Update AF_XDP section after moves/addsBjörn Töpel1-1/+5
Update MAINTAINERS to correctly mirror the current AF_XDP socket file layout. Also, add the AF_XDP files of libbpf. rfc->v1: Sorted file entries. (Joe) Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: Joe Perches <joe@perches.com> Link: https://lore.kernel.org/bpf/20200520192103.355233-16-bjorn.topel@gmail.com
2020-05-22xsk: Explicitly inline functions and move definitionsBjörn Töpel4-150/+156
In order to reduce the number of function calls, the struct xsk_buff_pool definition is moved to xsk_buff_pool.h. The functions xp_get_dma(), xp_dma_sync_for_cpu(), xp_dma_sync_for_device(), xp_validate_desc() and various helper functions are explicitly inlined. Further, move xp_get_handle() and xp_release() to xsk.c, to allow for the compiler to perform inlining. rfc->v1: Make sure xp_validate_desc() is inlined for Tx perf. (Maxim) Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200520192103.355233-15-bjorn.topel@gmail.com
2020-05-22xdp: Simplify xdp_return_{frame, frame_rx_napi, buff}Björn Töpel1-12/+9
The xdp_return_{frame,frame_rx_napi,buff} function are never used, except in xdp_convert_zc_to_xdp_frame(), by the MEM_TYPE_XSK_BUFF_POOL memory type. To simplify and reduce code, change so that xdp_convert_zc_to_xdp_frame() calls xsk_buff_free() directly since the type is know, and remove MEM_TYPE_XSK_BUFF_POOL from the switch statement in __xdp_return() function. Suggested-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200520192103.355233-14-bjorn.topel@gmail.com
2020-05-22xsk: Remove MEM_TYPE_ZERO_COPY and corresponding codeBjörn Töpel11-510/+15
There are no users of MEM_TYPE_ZERO_COPY. Remove all corresponding code, including the "handle" member of struct xdp_buff. rfc->v1: Fixed spelling in commit message. (Björn) Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200520192103.355233-13-bjorn.topel@gmail.com
2020-05-22mlx5, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOLBjörn Töpel10-210/+96
Use the new MEM_TYPE_XSK_BUFF_POOL API in lieu of MEM_TYPE_ZERO_COPY in mlx5e. It allows to drop a lot of code from the driver (which is now common in AF_XDP core and was related to XSK RX frame allocation, DMA mapping, etc.) and slightly improve performance (RX +0.8 Mpps, TX +0.4 Mpps). rfc->v1: Put back the sanity check for XSK params, use XSK API to get the total headroom size. (Maxim) v1->v2: Fix DMA address handling, set XDP metadata to invalid. (Maxim) v2->v3: Handle frame_sz, use xsk_buff_xdp_get_frame_dma, use xsk_buff API for DMA sync on TX, add performance numbers. (Maxim) v3->v4: Remove unused variable num_xsk_frames. (Jakub) Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200520192103.355233-12-bjorn.topel@gmail.com
2020-05-22ixgbe, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOLBjörn Töpel4-271/+62
Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL APIs. v1->v2: Fixed xdp_buff data_end update. (Björn) Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: intel-wired-lan@lists.osuosl.org Link: https://lore.kernel.org/bpf/20200520192103.355233-11-bjorn.topel@gmail.com
2020-05-22ice, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOLBjörn Töpel4-359/+54
Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL APIs. v4->v5: Fixed "warning: Excess function parameter 'alloc' description in 'ice_alloc_rx_bufs_zc'" and "warning: Excess function parameter 'xdp' description in 'ice_construct_skb_zc'". (Jakub) Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: intel-wired-lan@lists.osuosl.org Link: https://lore.kernel.org/bpf/20200520192103.355233-10-bjorn.topel@gmail.com
2020-05-22i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOLBjörn Töpel4-335/+47
Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL APIs. The AF_XDP zero-copy rx_bi ring is now simply a struct xdp_buff pointer. v4->v5: Fixed "warning: Excess function parameter 'bi' description in 'i40e_construct_skb_zc'". (Jakub) Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: intel-wired-lan@lists.osuosl.org Link: https://lore.kernel.org/bpf/20200520192103.355233-9-bjorn.topel@gmail.com
2020-05-22i40e: Separate kernel allocated rx_bi rings from AF_XDP ringsBjörn Töpel7-127/+142
Continuing the path to support MEM_TYPE_XSK_BUFF_POOL, the AF_XDP zero-copy/sk_buff rx_bi rings are now separate. Functions to properly allocate the different rings are added as well. v3->v4: Made i40e_fd_handle_status() static. (kbuild test robot) v4->v5: Fix kdoc for i40e_clean_programming_status(). (Jakub) Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: intel-wired-lan@lists.osuosl.org Link: https://lore.kernel.org/bpf/20200520192103.355233-8-bjorn.topel@gmail.com
2020-05-22i40e: Refactor rx_bi accessesBjörn Töpel2-12/+23
As a first step to migrate i40e to the new MEM_TYPE_XSK_BUFF_POOL APIs, code that accesses the rx_bi (SW/shadow ring) is refactored to use an accessor function. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: intel-wired-lan@lists.osuosl.org Link: https://lore.kernel.org/bpf/20200520192103.355233-7-bjorn.topel@gmail.com
2020-05-22xsk: Introduce AF_XDP buffer allocation APIBjörn Töpel12-119/+819
In order to simplify AF_XDP zero-copy enablement for NIC driver developers, a new AF_XDP buffer allocation API is added. The implementation is based on a single core (single producer/consumer) buffer pool for the AF_XDP UMEM. A buffer is allocated using the xsk_buff_alloc() function, and returned using xsk_buff_free(). If a buffer is disassociated with the pool, e.g. when a buffer is passed to an AF_XDP socket, a buffer is said to be released. Currently, the release function is only used by the AF_XDP internals and not visible to the driver. Drivers using this API should register the XDP memory model with the new MEM_TYPE_XSK_BUFF_POOL type. The API is defined in net/xdp_sock_drv.h. The buffer type is struct xdp_buff, and follows the lifetime of regular xdp_buffs, i.e. the lifetime of an xdp_buff is restricted to a NAPI context. In other words, the API is not replacing xdp_frames. In addition to introducing the API and implementations, the AF_XDP core is migrated to use the new APIs. rfc->v1: Fixed build errors/warnings for m68k and riscv. (kbuild test robot) Added headroom/chunk size getter. (Maxim/Björn) v1->v2: Swapped SoBs. (Maxim) v2->v3: Initialize struct xdp_buff member frame_sz. (Björn) Add API to query the DMA address of a frame. (Maxim) Do DMA sync for CPU till the end of the frame to handle possible growth (frame_sz). (Maxim) Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200520192103.355233-6-bjorn.topel@gmail.com
2020-05-22xsk: Move defines only used by AF_XDP internals to xsk.hBjörn Töpel3-14/+16
Move the XSK_NEXT_PG_CONTIG_{MASK,SHIFT}, and XDP_UMEM_USES_NEED_WAKEUP defines from xdp_sock.h to the AF_XDP internal xsk.h file. Also, start using the BIT{,_ULL} macro instead of explicit shifts. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200520192103.355233-5-bjorn.topel@gmail.com
2020-05-22xsk: Move driver interface to xdp_sock_drv.hMagnus Karlsson15-218/+238
Move the AF_XDP zero-copy driver interface to its own include file called xdp_sock_drv.h. This, hopefully, will make it more clear for NIC driver implementors to know what functions to use for zero-copy support. v4->v5: Fix -Wmissing-prototypes by include header file. (Jakub) Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200520192103.355233-4-bjorn.topel@gmail.com
2020-05-22xsk: Move xskmap.c to net/xdp/Björn Töpel5-24/+19
The XSKMAP is partly implemented by net/xdp/xsk.c. Move xskmap.c from kernel/bpf/ to net/xdp/, which is the logical place for AF_XDP related code. Also, move AF_XDP struct definitions, and function declarations only used by AF_XDP internals into net/xdp/xsk.h. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200520192103.355233-3-bjorn.topel@gmail.com
2020-05-22xsk: Fix xsk_umem_xdp_frame_sz()Björn Töpel1-1/+1
Calculating the "data_hard_end" for an XDP buffer coming from AF_XDP zero-copy mode, the return value of xsk_umem_xdp_frame_sz() is added to "data_hard_start". Currently, the chunk size of the UMEM is returned by xsk_umem_xdp_frame_sz(). This is not correct, if the fixed UMEM headroom is non-zero. Fix this by returning the chunk_size without the UMEM headroom. Fixes: 2a637c5b1aaf ("xdp: For Intel AF_XDP drivers add XDP frame_sz") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200520192103.355233-2-bjorn.topel@gmail.com
2020-05-22Merge tag 'amd-drm-fixes-5.7-2020-05-21' of ↵Dave Airlie18-95/+126
git://people.freedesktop.org/~agd5f/linux into drm-fixes amd-drm-fixes-5.7-2020-05-21: amdgpu: - DP fix - Floating point fix - Fix cursor stutter issue Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200521222415.4122-1-alexander.deucher@amd.com
2020-05-22net: sgi: ioc3-eth: Fix return value check in ioc3eth_probe()Tang Bin1-4/+4
In the function devm_platform_ioremap_resource(), if get resource failed, the return value is ERR_PTR() not NULL. Thus it must be replaced by IS_ERR(), or else it may result in crashes if a critical error path is encountered. Fixes: 0ce5ebd24d25 ("mfd: ioc3: Add driver for SGI IOC3 chip") Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22net: don't return invalid table id error when we fall back to PF_UNSPECSabrina Dubroca5-6/+4
In case we can't find a ->dumpit callback for the requested (family,type) pair, we fall back to (PF_UNSPEC,type). In effect, we're in the same situation as if userspace had requested a PF_UNSPEC dump. For RTM_GETROUTE, that handler is rtnl_dump_all, which calls all the registered RTM_GETROUTE handlers. The requested table id may or may not exist for all of those families. commit ae677bbb4441 ("net: Don't return invalid table id error when dumping all families") fixed the problem when userspace explicitly requests a PF_UNSPEC dump, but missed the fallback case. For example, when we pass ipv6.disable=1 to a kernel with CONFIG_IP_MROUTE=y and CONFIG_IP_MROUTE_MULTIPLE_TABLES=y, the (PF_INET6, RTM_GETROUTE) handler isn't registered, so we end up in rtnl_dump_all, and listing IPv6 routes will unexpectedly print: # ip -6 r Error: ipv4: MR table does not exist. Dump terminated commit ae677bbb4441 introduced the dump_all_families variable, which gets set when userspace requests a PF_UNSPEC dump. However, we can't simply set the family to PF_UNSPEC in rtnetlink_rcv_msg in the fallback case to get dump_all_families == true, because some messages types (for example RTM_GETRULE and RTM_GETNEIGH) only register the PF_UNSPEC handler and use the family to filter in the kernel what is dumped to userspace. We would then export more entries, that userspace would have to filter. iproute does that, but other programs may not. Instead, this patch removes dump_all_families and updates the RTM_GETROUTE handlers to check if the family that is being dumped is their own. When it's not, which covers both the intentional PF_UNSPEC dumps (as dump_all_families did) and the fallback case, ignore the missing table id error. Fixes: cb167893f41e ("net: Plumb support for filtering ipv4 and ipv6 multicast route dumps") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22Merge branch 'etnaviv/fixes' of https://git.pengutronix.de/git/lst/linux ↵Dave Airlie2-2/+4
into drm-fixes two fixes: - memory leak fix when userspace passes a invalid softpin address - off-by-one crashing the kernel in the perfmon domain iteration when the GPU core has both 2D and 3D capabilities Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas Stach <l.stach@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/ca7bce5c8aff0fcbbdc3bf2b9723df5f687c8924.camel@pengutronix.de
2020-05-22net: ipip: fix wrong address family in init error pathVadim Fedorenko1-1/+1
In case of error with MPLS support the code is misusing AF_INET instead of AF_MPLS. Fixes: 1b69e7e6c4da ("ipip: support MPLS over IPv4") Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22Merge branch 'net-tls-fix-encryption-error-path'David S. Miller1-7/+10
Vadim Fedorenko says: ==================== net/tls: fix encryption error path The problem with data stream corruption was found in KTLS transmit path with small socket send buffers and large amount of data. bpf_exec_tx_verdict() frees open record on any type of error including EAGAIN, ENOMEM and ENOSPC while callers are able to recover this transient errors. Also wrong error code was returned to user space in that case. This patchset fixes the problems. ==================== Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22net/tls: free record only on encryption errorVadim Fedorenko1-2/+4
We cannot free record on any transient error because it leads to losing previos data. Check socket error to know whether record must be freed or not. Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error") Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22net/tls: fix encryption error checkingVadim Fedorenko1-5/+6
bpf_exec_tx_verdict() can return negative value for copied variable. In that case this value will be pushed back to caller and the real error code will be lost. Fix it using signed type and checking for positive value. Fixes: d10523d0b3d7 ("net/tls: free the record on encryption error") Fixes: d3b18ad31f93 ("tls: add bpf support to sk_msg handling") Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22Merge branch 'provide-KAPI-for-SQI'David S. Miller5-3/+108
Oleksij Rempel says: ==================== provide KAPI for SQI This patches are extending ethtool netlink interface to export Signal Quality Index (SQI). SQI provided by 100Base-T1 PHYs and can be used for cable diagnostic. Compared to a typical cable tests, this value can be only used after link is established. changes v3: - rename __ethtool_get_sqi* to linkstate_get_sqi*. And move this functions to the net/ethtool/linkstate.c - protect linkstate_get_sqi* with locking changes v2: - use u32 instead of u8 for SQI - add SQI_MAX field and callbacks - some style fixes in the rst. - do not convert index to shifted index. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22net: phy: tja11xx: add SQI supportOleksij Rempel1-0/+26
This patch implements reading of the Signal Quality Index for better cable/link troubleshooting. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22ethtool: provide UAPI for PHY Signal Quality Index (SQI)Oleksij Rempel4-3/+82
Signal Quality Index is a mandatory value required by "OPEN Alliance SIG" for the 100Base-T1 PHYs [1]. This indicator can be used for cable integrity diagnostic and investigating other noise sources and implement by at least two vendors: NXP[2] and TI[3]. [1] http://www.opensig.org/download/document/218/Advanced_PHY_features_for_automotive_Ethernet_V1.0.pdf [2] https://www.nxp.com/docs/en/data-sheet/TJA1100.pdf [3] https://www.ti.com/product/DP83TC811R-Q1 Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22Merge branch 'net-ethernet-ti-fix-some-return-value-check'David S. Miller4-6/+7
Wei Yongjun says: ==================== net: ethernet: ti: fix some return value check This patchset convert cpsw_ale_create() to return PTR_ERR() only, and changed all the caller to check IS_ERR() instead of NULL. Since v2: 1) rebased on net.git, as Jakub's suggest 2) split am65-cpsw-nuss.c changes, as Grygorii's suggest ==================== Signed-off-by: David S. Miller <davem@davemloft.net>