Age | Commit message | Author | Files | Lines
2018-01-06 | ixgbe: setup xdp_rxq_info | Jesper Dangaard Brouer | 3 | -1/+15
Driver hook points for xdp_rxq_info: * reg : ixgbe_setup_rx_resources() * unreg: ixgbe_free_rx_resources() Tested on actual hardware. V2: Fix ixgbe_set_ringparam, clear xdp_rxq_info in temp_ring Cc: intel-wired-lan@lists.osuosl.org Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-06 | i40e: setup xdp_rxq_info | Jesper Dangaard Brouer | 3 | -2/+21
The i40e driver has a special "FDIR" RX-ring (I40E_VSI_FDIR) which is a sideband channel for configuring/updating the flow director tables. This (i40e_vsi_)type does not invoke XDP-ebpf code. As suggested by Björn (V2): Instead of marking this I40E_VSI_FDIR RX-ring a special case, reverse the logic and only select RX-rings of type I40E_VSI_MAIN to register xdp_rxq_info's for. Driver hook points for xdp_rxq_info: * reg : i40e_setup_rx_descriptors (via i40e_vsi_setup_rx_resources) * unreg: i40e_free_rx_resources (via i40e_vsi_free_rx_resources) Tested on actual hardware with samples/bpf program. V2: Fixed bug in i40e_set_ringparam (memset zero) + match on I40E_VSI_MAIN. V4: Update patch desc that got out-of-sync with code. Cc: intel-wired-lan@lists.osuosl.org Cc: Björn Töpel <bjorn.topel@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-06 | xdp/mlx5: setup xdp_rxq_info | Jesper Dangaard Brouer | 3 | -0/+14
The mlx5 driver has a special drop-RQ queue (one per interface) that simply drops all incoming traffic. It helps the driver keep other HW objects (flow steering) alive upon down/up operations. It is temporarily pointed to by flow steering objects during interface setup, and when the interface is down. It lacks many fields that are set in a regular RQ (for example its state is never switched to MLX5_RQC_STATE_RDY). (Thanks to Tariq Toukan for explanation). The XDP RX-queue info for this drop-RQ is marked as unused, which allows us to use the same takedown/free code path as other RX-queues. Driver hook points for xdp_rxq_info: * reg : mlx5e_alloc_rq() * unused: mlx5e_alloc_drop_rq() * unreg : mlx5e_free_rq() Tested on actual hardware with samples/bpf program Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Matan Barak <matanb@mellanox.com> Cc: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-06 | xdp: base API for new XDP rx-queue info concept | Jesper Dangaard Brouer | 4 | -1/+117
This patch only introduces the core data structures and API functions. All XDP enabled drivers must use the API before this info can be used. There is a need for XDP to know more about the RX-queue a given XDP frame has arrived on, for both the XDP bpf-prog and kernel side. Instead of extending xdp_buff each time new info is needed, the patch creates a separate read-mostly struct xdp_rxq_info that contains this info. We stress this data/cache-line is for read-only info. This is NOT for dynamic per packet info, use the data_meta for such use-cases. The performance advantage is this info can be set up at RX-ring init time, instead of updating N-members in xdp_buff. A possible (driver level) micro optimization is that xdp_buff->rxq assignment could be done once per XDP/NAPI loop. The extra pointer deref only happens for programs needing access to this info (thus, no slowdown to existing use-cases). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
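The register/unused/unregister lifecycle described by these xdp_rxq_info commits can be sketched as a small userspace model. This is only an illustration in Python, not kernel code: the class names, state strings, and helper functions below are hypothetical stand-ins, and the real API lives in the kernel's C sources.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical state names for this model (not the kernel's definitions).
UNREGISTERED = "unregistered"
REGISTERED = "registered"
UNUSED = "unused"


@dataclass
class RxqInfo:
    """Read-mostly per-RX-queue info, filled in once at ring setup time."""
    dev: Optional[str] = None
    queue_index: int = 0
    state: str = UNREGISTERED


@dataclass
class XdpBuff:
    """Per-packet context: it only points at the shared queue info."""
    data: bytes
    rxq: RxqInfo


def rxq_info_reg(info: RxqInfo, dev: str, queue_index: int) -> None:
    """Called from the driver's RX-ring setup path in this model."""
    if info.state == REGISTERED:
        raise RuntimeError("RX-queue info is already registered")
    info.dev = dev
    info.queue_index = queue_index
    info.state = REGISTERED


def rxq_info_unused(info: RxqInfo) -> None:
    """Mark a queue that never runs XDP (e.g. a drop queue) as unused."""
    info.state = UNUSED


def rxq_info_unreg(info: RxqInfo) -> None:
    """Called from the RX-ring free path; unused queues share this path."""
    if info.state == UNUSED:
        return  # nothing was registered, so there is nothing to undo
    if info.state != REGISTERED:
        raise RuntimeError("RX-queue info was never registered")
    info.dev = None
    info.state = UNREGISTERED
```

In this model a driver would call `rxq_info_reg()` once per ring and then hand every packet an `XdpBuff` whose `rxq` field references the same object, so no per-packet copying of queue details is needed.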
2018-01-04 | bpf: only build sockmap with CONFIG_INET | John Fastabend | 3 | -2/+4
The sockmap infrastructure is only aware of TCP sockets at the moment. In the future we plan to add UDP. In both cases CONFIG_NET should be built-in. So let's only build sockmap if CONFIG_INET is enabled. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-04 | bpf: sockmap remove unused function | John Fastabend | 1 | -8/+0
This was added for some work that was eventually factored out but the helper call was missed. Remove it now and add it back later if needed. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-04 | Merge branch 'bpf-bpftool-misc-fixes' | Daniel Borkmann | 8 | -37/+40
Jakub Kicinski says: ==================== This series addresses small issues that snuck through the review of cgroup code. "list" and "show" are now made aliases to satisfy all users. Small fix to errors printed is needed, errors can't contain new line characters, otherwise JSON will break. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-04 | tools: bpftool: remove new lines from errors | Jakub Kicinski | 2 | -11/+11
It's a little bit unusual for kernel style, but we add the new line character to error strings inside the p_err() function. We do this because new lines at the end of error strings will break JSON output. Fix a few p_err("..\n") which snuck in recently. Fixes: 5ccda64d38cc ("bpftool: implement cgroup bpf operations") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
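Why a trailing new line matters can be shown with a toy error printer. This is a Python sketch under stated assumptions, not bpftool's implementation: `p_err` here is a hypothetical stand-in, and the model assumes the JSON path pastes the message into a string without escaping it.

```python
import json


def p_err(json_output: bool, msg: str) -> str:
    """Format an error message; the helper owns the trailing new line."""
    if json_output:
        # Assumption for this model: the message is inserted verbatim,
        # so a raw new line would end up inside the JSON string.
        return '{"error":"%s"}' % msg
    return "Error: %s\n" % msg


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON."""
    try:
        json.loads(text)
    except ValueError:
        return False
    return True
```

With this model, `p_err(True, "can't open file")` yields parseable JSON, while passing `"can't open file\n"` does not, because a raw control character is not allowed inside a JSON string. Keeping the new line inside the helper's plain-text branch avoids the problem for every caller.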
2018-01-04 | tools: bpftool: alias show and list commands | Jakub Kicinski | 8 | -19/+22
iproute2 seems to accept show and list as aliases. Let's do the same thing, and by allowing both bring cgroup syntax back in line with maps and progs. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
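One common way to make two keywords behave identically is to point both at the same handler in a command table. The Python sketch below illustrates that idea only; the handler names and the table are hypothetical and do not reproduce bpftool's C code.

```python
def do_show(args):
    """Hypothetical handler: pretend to display the objects named by args."""
    return "showing " + " ".join(args)


def do_help(args):
    """Hypothetical handler: return a short usage string."""
    return "usage: OBJECT { show | list | help }"


# Both spellings map to the same function, so they cannot drift apart.
COMMANDS = {
    "show": do_show,
    "list": do_show,
    "help": do_help,
}


def dispatch(argv):
    """Run the handler for argv[0]; unknown or missing commands get help."""
    if not argv:
        return do_help([])
    handler = COMMANDS.get(argv[0], do_help)
    return handler(argv[1:])
```

Because the alias is a second key rather than a second implementation, any later change to the show path automatically applies to list as well.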
2018-01-04 | tools: bpftool: rename cgroup list -> show in the code | Jakub Kicinski | 1 | -9/+9
So far we have used "show" as a keyword for listing programs and maps. Use the word "show" in the code for cgroups too; the next commit will alias show and list. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 | Merge branch 'bpf-offload-report-dev' | Daniel Borkmann | 17 | -89/+346
Jakub Kicinski says: ==================== This series is a redo of reporting offload device information to user space after the first attempt did not take into account name spaces. As requested by Kirill offloads are now protected by an r/w sem. This allows us to remove the workqueue and free the offload state fully when device is removed (suggested by Alexei). Net namespace is reported with a device/inode pair. The accompanying bpftool support is placed in common code because maps will have very similar info. Note that the UAPI information can't be nicely encapsulated into a struct, because in case we need to grow the device information the new fields will have to be added at the end of struct bpf_prog_info, we can't grow structures in the middle of bpf_prog_info. v3: - use dev_get_by_index(); - redo ns code (new patch 6). v2: - rework the locking in patch 1 (use RCU instead of locking dependencies); - grab RTNL for a short time in patch 6; - minor update to the test in patch 8. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 | selftests/bpf: test device info reporting for bound progs | Jakub Kicinski | 1 | -11/+101
Check if bound programs report correct device info. Test in local namespace, in remote one, back to the local ns, remove the device and check that information is cleared. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 | tools: bpftool: report device information for offloaded programs | Jakub Kicinski | 3 | -0/+57
Print the just-exposed device information about device to which program is bound. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 | bpf: offload: report device information for offloaded programs | Jakub Kicinski | 5 | -0/+73
Report to the user ifindex and namespace information of offloaded programs. If device has disappeared return -ENODEV. Specify the namespace using dev/inode combination. CC: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 | nsfs: generalize ns_get_path() for path resolution with a task | Jakub Kicinski | 2 | -3/+29
ns_get_path() takes struct task_struct and proc_ns_ops as its parameters. For path resolution directly from a namespace, e.g. based on a networking device's net name space, we need more flexibility. Add a ns_get_path_cb() helper which will allow callers to use any method of obtaining the name space reference. Convert ns_get_path() to use ns_get_path_cb(). Following patches will bring a networking user. CC: Eric W. Biederman <ebiederm@xmission.com> Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 | bpf: offload: free program id when device disappears | Jakub Kicinski | 3 | -2/+12
Bound programs are quite useless after their device disappears. They are simply waiting for their reference count to go to zero. Free their ID early so that BPF_PROG_GET_NEXT_ID does not list them. Note that orphaned offload programs will return -ENODEV on BPF_OBJ_GET_INFO_BY_FD so user will never see ID 0. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 | bpf: offload: free prog->aux->offload when device disappears | Jakub Kicinski | 1 | -14/+9
All bpf offload operations should now be under bpf_devs_lock, it's safe to free and clear the entire offload structure, not only the netdev pointer. __bpf_prog_offload_destroy() will no longer be called multiple times. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 | bpf: offload: allow netdev to disappear while verifier is running | Jakub Kicinski | 8 | -48/+37
To allow verifier instruction callbacks without any extra locking NETDEV_UNREGISTER notification would wait on a waitqueue for verifier to finish. This design decision was made when rtnl lock was providing all the locking. Use the read/write lock instead and remove the workqueue. Verifier will now call into the offload code, so dev_ops are moved to offload structure. Since verifier calls are all under bpf_prog_is_dev_bound() we no longer need static inline implementations to please builds with CONFIG_NET=n. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 | bpf: offload: don't use prog->aux->offload as boolean | Jakub Kicinski | 2 | -2/+5
We currently use aux->offload to indicate that program is bound to a specific device. This forces us to keep the offload structure around even after the device is gone. Add a bool member to struct bpf_prog_aux to indicate if offload was requested. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-31 | bpf: offload: don't require rtnl for dev list manipulation | Jakub Kicinski | 1 | -10/+24
We don't need the RTNL lock for all operations on offload state. We only need to hold it around ndo calls. The device offload initialization doesn't require it. The soon-to-come querying of the offload info will only need it partially. We will also be able to remove the waitqueue in following patches. Use struct rw_semaphore because map offload will require sleeping with the semaphore held for read. Suggested-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-30 | tools/bpftool: fix bpftool build with bintutils >= 2.9 | Roman Gushchin | 6 | -0/+86
Bpftool build is broken with binutils version 2.29 and later. The cause is commit 003ca0fd2286 ("Refactor disassembler selection") in the binutils repo, which changed the disassembler() function signature. Fix this by adding a new "feature" to the tools/build/features infrastructure and make it responsible for deciding which disassembler() function signature to use. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-30 | tools/bpftool: use version from the kernel source tree | Roman Gushchin | 2 | -11/+5
Bpftool determines its own version based on the kernel version, which is picked from the linux/version.h header. It's strange to use the version of the installed kernel headers, and makes much more sense to use the version of the actual source tree, where bpftool sources are. Fix this by building the kernelversion target and using the resulting string as the bpftool version.
Example:
before:
  $ bpftool version
  bpftool v4.14.6
after:
  $ bpftool version
  bpftool v4.15.0-rc3
  $ bpftool version --json
  {"version":"4.15.0-rc3"}
Signed-off-by: Roman Gushchin <guro@fb.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
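The difference between the two version sources can be sketched in Python. This is an illustration under assumptions, not the build logic itself: it assumes the header-based version is a packed integer of the form (major << 16) + (minor << 8) + patch, and that the source-tree version is assembled from the VERSION, PATCHLEVEL, SUBLEVEL and EXTRAVERSION variables of the top-level Makefile. Both function names are hypothetical.

```python
import re


def version_from_code(code: int) -> str:
    """Decode a packed (major << 16) + (minor << 8) + patch version code."""
    return "%d.%d.%d" % (code >> 16, (code >> 8) & 0xFF, code & 0xFF)


def version_from_makefile(text: str) -> str:
    """Assemble a version string from top-level Makefile variables.

    Simplified: all four variables are expected to be present.
    """
    pattern = r"^(VERSION|PATCHLEVEL|SUBLEVEL|EXTRAVERSION)[ \t]*=[ \t]*(.*)$"
    fields = {name: value.strip()
              for name, value in re.findall(pattern, text, re.M)}
    return "{VERSION}.{PATCHLEVEL}.{SUBLEVEL}{EXTRAVERSION}".format(**fields)
```

A packed integer has no room for a suffix such as `-rc3`, which is one reason a string taken from the source tree can be more precise than one decoded from installed headers.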
2017-12-29 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | David S. Miller | 212 | -1204/+2378
net/ipv6/ip6_gre.c is a case of parallel adds. include/trace/events/tcp.h is a little bit more tricky. The removal of in-trace-macro ifdefs in 'net' paralleled with moving show_tcp_state_name and friends over to include/trace/events/sock.h in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-29 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | Linus Torvalds | 30 | -121/+298
Pull networking fixes from David Miller:
1) IPv6 gre tunnels end up with different default features enabled depending upon whether netlink or ioctls are used to bring them up. Fix from Alexey Kodanev.
2) Fix read past end of user control message in RDS, from Avinash Repaka.
3) Missing RCU barrier in mini qdisc code, from Cong Wang.
4) Missing policy put when reusing per-cpu route entries, from Florian Westphal.
5) Handle nested PCI errors properly in bnx2x driver, from Guilherme G. Piccoli.
6) Run nested transport mode IPSEC packets via tasklet, from Herbert Xu.
7) Fix handling poll() for stream sockets in tipc, from Parthasarathy Bhuvaragan.
8) Fix two stack-out-of-bounds issues in IPSEC, from Steffen Klassert.
9) Another zerocopy ubuf handling fix, from Willem de Bruijn.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
  strparser: Call sock_owned_by_user_nocheck
  sock: Add sock_owned_by_user_nocheck
  skbuff: in skb_copy_ubufs unclone before releasing zerocopy
  tipc: fix hanging poll() for stream sockets
  sctp: Replace use of sockets_allocated with specified macro.
  bnx2x: Improve reliability in case of nested PCI errors
  tg3: Enable PHY reset in MTU change path for 5720
  tg3: Add workaround to restrict 5762 MRRS to 2048
  tg3: Update copyright
  net: fec: unmap the xmit buffer that are not transferred by DMA
  tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error path
  tipc: error path leak fixes in tipc_enable_bearer()
  RDS: Check cmsg_len before dereferencing CMSG_DATA
  tcp: Avoid preprocessor directives in tracepoint macro args
  tipc: fix memory leak of group member when peer node is lost
  net: sched: fix possible null pointer deref in tcf_block_put
  tipc: base group replicast ack counter on number of actual receivers
  net_sched: fix a missing rcu barrier in mini_qdisc_pair_swap()
  net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround
  ip6_gre: fix device features for ioctl setup
  ...
2017-12-29 | Merge tag 'drm-fixes-for-v4.15-rc6' of git://people.freedesktop.org/~airlied/linux | Linus Torvalds | 3 | -4/+5
Pull drm fixes from Dave Airlie: "nouveau and i915 regression fixes"
* tag 'drm-fixes-for-v4.15-rc6' of git://people.freedesktop.org/~airlied/linux:
  drm/nouveau: fix race when adding delayed work items
  i915: Reject CCS modifiers for pipe C on Geminilake
  drm/i915/gvt: Fix pipe A enable as default for vgpu
2017-12-29 | Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux | Linus Torvalds | 1 | -1/+2
Pull clk fix from Stephen Boyd: "One more fix for the runtime PM clk patches. We're calling a runtime PM API that may schedule from somewhere that we can't do that. We change to the async version of pm_runtime_put() to fix it"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: use atomic runtime pm api in clk_core_is_enabled
2017-12-29 | Merge tag 'led_fixes_for_4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds | Linus Torvalds | 1 | -1/+1
Pull LED fix from Jacek Anaszewski: "A single LED fix for brightness setting when delay_off is 0"
* tag 'led_fixes_for_4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
  led: core: Fix brightness setting when setting delay_off=0
2017-12-29 | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma | Linus Torvalds | 22 | -105/+208
Pull rdma fixes from Jason Gunthorpe: "This is the next batch of for-rc patches from RDMA. It includes the fix for the ipoib regression I mentioned last time, and the result of a fairly major debugging effort to get iser working reliably on cxgb4 hardware - it turns out the cxgb4 driver was not handling QP error flushing properly causing iser to fail.
- cxgb4 fix for an iser testing failure as debugged by Steve and Sagi. The problem was a driver bug in the handling of shutting down a QP.
- Various vmw_pvrdma fixes for bogus WARN_ON, missed resource free on error unwind and a use after free bug
- Improper congestion counter values on mlx5 when link aggregation is enabled
- ipoib lockdep regression introduced in this merge window
- hfi1 regression supporting the device in a VM introduced in a recent patch
- Typo that breaks future uAPI compatibility in the verbs core
- More SELinux related oops fixing
- Fix an oops during error unwind in mlx5"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/mlx5: Fix mlx5_ib_alloc_mr error flow
  IB/core: Verify that QP is security enabled in create and destroy
  IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp()
  IB/mlx5: Serialize access to the VMA list
  IB/hfi: Only read capability registers if the capability exists
  IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush
  IB/mlx5: Fix congestion counters in LAG mode
  RDMA/vmw_pvrdma: Avoid use after free due to QP/CQ/SRQ destroy
  RDMA/vmw_pvrdma: Use refcount_dec_and_test to avoid warning
  RDMA/vmw_pvrdma: Call ib_umem_release on destroy QP path
  iw_cxgb4: when flushing, complete all wrs in a chain
  iw_cxgb4: reflect the original WR opcode in drain cqes
  iw_cxgb4: Only validate the MSN for successful completions
2017-12-29 | Merge tag 'mlx5-shared-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux | David S. Miller | 9 | -214/+424
Saeed Mahameed says: ==================== Mellanox, mlx5 E-Switch updates 2017-12-19 This series includes updates for mlx5 E-Switch infrastructures, to be merged into net-next and rdma-next trees. Mark's patches provide E-Switch refactoring that generalizes the mlx5 E-Switch vf representors interfaces and data structures. The series is mainly focused on moving ethernet (netdev) specific representors logic out of E-Switch (eswitch.c) into mlx5e representor module (en_rep.c), which provides better separation and allows future support for other types of vf representors (e.g. RDMA). Gal's patches at the end of this series provide a simple syntax fix and two other patches that handle vport ingress/egress ACL steering name spaces to be aligned with the Firmware/Hardware specs. V1->V2: - Addressed coding style comments in patches #1 and #7 - The series is still based on rc4, as now I see net-next is also @rc4. V2->V3: - Fixed compilation warning, reported by Dave. Please pull and let me know if there's any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-29 | net/mlx5: Separate ingress/egress namespaces for each vport | Gal Pressman | 4 | -30/+133
Each vport has its own root flow table for the ACL flow tables and root flow table is per namespace, therefore we should create a namespace for each vport. Fixes: efdc810ba39d ("net/mlx5: Flow steering, Add vport ACL support") Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-12-29 | net/mlx5: Fix ingress/egress naming mistake | Gal Pressman | 1 | -2/+2
The function names do not represent their actions, switch the mistaken ingress/egress naming. Fixes: fba53f7b5719 ("net/mlx5: Introduce mlx5_flow_steering structure") Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-12-29 | net/mlx5e: E-Switch, Use the name of static array instead of its address | Gal Pressman | 1 | -13/+13
Using the address of a static array is the same as using its name (in this specific use-case), but it's confusing and makes the code less readable. Fixes: 1bd27b11c1df ("net/mlx5: Introduce E-switch QoS management") Fixes: bd77bf1cb595 ("net/mlx5: Add SRIOV VF max rate configuration support") Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-12-29 | net/mlx5e: E-Switch, Move send-to-vport rule struct to en_rep | Mark Bloch | 3 | -16/+16
Move struct mlx5_esw_sq, which keeps the send-to-vport rule, from the eswitch code to mlx5e and rename it to better reflect where it belongs. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-12-29 | net/mlx5: E-Switch, Create generic header struct to be used by representors | Mark Bloch | 5 | -44/+88
Now that we don't store type dependent data in struct mlx5_eswitch_rep we can create a generic interface, and representor type. struct mlx5_eswitch_rep will store an array of interfaces, each interface is used by a different representor type. Once we moved to a more generic interface, rdma driver representors can be added and utilize the same mechanism as the Ethernet driver representors use. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-12-28 | Merge branch 'strparser-Fix-lockdep-issue' | David S. Miller | 2 | -1/+6
Tom Herbert says: ==================== strparser: Fix lockdep issue When sock_owned_by_user returns true in strparser. Fix is to add and call sock_owned_by_user_nocheck since the check for owned by user is not an error condition in this case. ==================== Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-and-tested-by: <syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28 | strparser: Call sock_owned_by_user_nocheck | Tom Herbert | 1 | -1/+1
strparser wants to check socket ownership without producing any warnings. As indicated by the comment in the code, it is permissible for owned_by_user to return true. Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-and-tested-by: <syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com> Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28 | sock: Add sock_owned_by_user_nocheck | Tom Herbert | 1 | -0/+5
This allows checking socket lock ownership without producing lockdep warnings. Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28 | skbuff: in skb_copy_ubufs unclone before releasing zerocopy | Willem de Bruijn | 1 | -3/+3
skb_copy_ubufs must unclone before it is safe to modify its skb_shared_info with skb_zcopy_clear. Commit b90ddd568792 ("skbuff: skb_copy_ubufs must release uarg even without user frags") ensures that all skbs release their zerocopy state, even those without frags. But I forgot an edge case where such an skb arrives that is cloned. The stack does not build such packets. Vhost/tun skbs have their frags orphaned before cloning. TCP skbs only attach zerocopy state when a frag is added. But if TCP packets can be trimmed or linearized, this might occur. Tracing the code I found no instance so far (e.g., skb_linearize ends up calling skb_zcopy_clear if !skb->data_len). Still, it is non-obvious that no path exists. And it is fragile to rely on this. Fixes: b90ddd568792 ("skbuff: skb_copy_ubufs must release uarg even without user frags") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28 | Merge branch 'mlx4-misc-for-4.16' | David S. Miller | 5 | -46/+63
Tariq Toukan says: ==================== mlx4 misc for 4.16 This patchset contains misc cleanups and improvements to the mlx4 Core and Eth drivers. In patches 1 and 2 I reduce and reorder the branches in the RX csum flow. In patch 3 I align the FMR unmapping flow with the device spec, to allow a remapping afterwards. Patch 4 by Moni changes the default QoS settings so that a pause frame stops all traffic regardless of its prio. Series generated against net-next commit: 836df24a7062 net: hns3: hns3_get_channels() can be static ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28 | net/mlx4_en: Change default QoS settings | Moni Shoua | 3 | -0/+13
Change the default mapping between TC and TCG as follows:

Prio | TC/TCG from (set by FW) | TC/TCG to (set by SW)
-----+-------------------------+----------------------
0    | 0/0                     | 0/7
1    | 1/0                     | 0/6
2    | 2/0                     | 0/5
3    | 3/0                     | 0/4
4    | 4/0                     | 0/3
5    | 5/0                     | 0/2
6    | 6/0                     | 0/1
7    | 7/0                     | 0/0

With these new settings, a pause frame for any prio stops traffic for all prios. Fixes: 564c274c3df0 ("net/mlx4_en: DCB QoS support") Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
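The priority mapping listed in this commit follows a simple rule, which can be written down as a short Python sketch. The function names are hypothetical and exist only to restate the listed values; this is not driver code.

```python
NUM_PRIOS = 8  # priorities 0..7, as listed in the commit message


def fw_default(prio: int) -> tuple:
    """Old default (set by FW): each priority gets its own TC, all in TCG 0."""
    return (prio, 0)


def sw_default(prio: int) -> tuple:
    """New default (set by SW): every priority uses TC 0, TCG is 7 - prio."""
    return (0, NUM_PRIOS - 1 - prio)


def mapping_table():
    """Return (prio, old TC/TCG, new TC/TCG) rows for all priorities."""
    return [(p, fw_default(p), sw_default(p)) for p in range(NUM_PRIOS)]
```

Written this way, it is easy to see that all eight priorities now share TC 0 and differ only in their TCG.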
2017-12-28 | net/mlx4_core: Cleanup FMR unmapping flow | Tariq Toukan | 1 | -19/+21
Remove redundant and not essential operations in fmr unmap/free. According to device spec, in FMR unmap it is sufficient to set ownership bit to SW. This allows remapping afterwards. Fixes: 8ad11fb6b073 ("IB/mlx4: Implement FMRs") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28 | net/mlx4_en: RX csum, reorder branches | Tariq Toukan | 1 | -25/+21
Use early goto commands, and save else branches. This uses fewer indentations and brackets, making the code more readable. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28 | net/mlx4_en: RX csum, remove redundant branches and checks | Tariq Toukan | 1 | -3/+9
Do not check IPv6 bit in cqe status if CONFIG_IPV6 is not enabled. Function check_csum() is reached only with IPv4 or IPv6 set (if enabled), if IPv6 is not set (or is not enabled) it is redundant to test the IPv4 bit. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28 | net: sched: don't set extack message in case the qdisc will be created | Jiri Pirko | 1 | -3/+1
If the qdisc is not found here, it is going to be created. Therefore, this is not an error path. Remove the extack message set and don't confuse user with error message in case the qdisc was created successfully. Fixes: 09215598119e ("net: sched: sch_api: handle generic qdisc errors") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28 | tipc: fix hanging poll() for stream sockets | Parthasarathy Bhuvaragan | 1 | -1/+1
In commit 42b531de17d2f6 ("tipc: Fix missing connection request handling"), we replaced unconditional wakeup() with conditional wakeup for clients with flags POLLIN | POLLRDNORM | POLLRDBAND. This breaks the applications which do a connect followed by poll with POLLOUT flag. These applications are not woken when the connection is ESTABLISHED and hence sleep forever. In this commit, we fix it by including the POLLOUT event for sockets in TIPC_CONNECTING state. Fixes: 42b531de17d2f6 ("tipc: Fix missing connection request handling") Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28 | Merge branch 'AVE-ethernet' | David S. Miller | 6 | -0/+1813
Kunihiko Hayashi says:
====================
net: add UniPhier AVE ethernet support

This series adds support for Socionext AVE ethernet controller implemented on UniPhier SoCs. This driver supports RGMII/RMII modes.

v8: https://www.spinics.net/lists/netdev/msg474374.html

The PHY patch included in v1 has already been separated in: http://www.spinics.net/lists/netdev/msg454595.html

Changes since v8:
- move operators at the beginning of the line to the end of the previous line
- dt-bindings: add blank lines before mdio and phy subnodes

Changes since v7:
- dt-bindings: fix mdio subnode description

Changes since v6:
- sort the order of local variables from longest to shortest line
- fix ave_probe() which calls register_netdev() at the end of initialization
- dt-bindings: remove phy node descriptions in mdio node

Changes since v5:
- replace license boilerplate with SPDX Identifier
- remove inline directives and an unused function

Changes since v4:
- fix larger integer warning on AVE_PFMBYTE_MASK0

Changes since v3:
- remove checking dma address and use dma_set_mask() to restrict address
- replace ave_mdio_busywait() with read_poll_timeout()
- replace functions to access to registers with readl/writel() directly
- replace a function to access to macaddr with ave_hw_write_macaddr()
- change return value of ave_dma_map() to error value
- move mdiobus_unregister() from ave_remove() to ave_uninit()
- eliminate else block at the end of ave_dma_map()
- add mask definitions for packet filter
- sort bitmap definitions in descending order
- add error check to some functions
- rename and sort functions to clear sub-categories
- fix error value consistency
- remove unneeded initializers
- change type of constant arrays

Changes since v2:
- replace clk_get() with devm_clk_get()
- replace reset_control_get() with devm_reset_control_get_optional_shared()
- add error return when the error occurs on the above *_get functions
- sort soc data and compatible strings
- remove clearly obvious comments
- modify dt-bindings document consistent with these modifications

Changes since v1:
- add/remove devicetree properties and sub-node
- remove "internal-phy-interrupt" and "desc-bits" property
- add SoC data structures based on compatible strings
- add node operation to apply "mdio" sub-node
- add support for features
- add support for {get,set}_pauseparam and pause frame operations
- add support for ndo_get_stats64 instead of ndo_get_stats
- replace with desirable functions
- replace check for valid phy_mode with phy_interface{_mode}_is_rgmii()
- replace phy attach message with phy_attached_info()
- replace 32bit operation with {upper,lower}_32_bits() on ave_wdesc_addr()
- replace nway_reset and get_link with generic functions
- move operations to proper functions
- move phy_start_aneg() to ndo_open, and remove unnecessary PHY interrupt operations. See http://www.spinics.net/lists/netdev/msg454590.html
- move irq initialization and descriptor memory allocation to ndo_open
- move initialization of reset and clock and mdiobus to ndo_init
- fix skbuffer operations
- fix skb alignment operations and add Rx buffer adjustment for descriptor. See http://www.spinics.net/lists/netdev/msg456014.html
- add error returns when dma_map_single() failed
- clean up code structures
- clean up wait-loop and wake-queue conditions
- add ave_wdesc_addr() and offset definitions
- add ave_macaddr_init() to clean up mac-address operation
- fix checking whether Tx entry is not enough
- fix supported features of phydev
- add necessary free/disable operations
- add phydev check on ave_{get,set}_wol()
- remove netif_carrier functions, phydev initializer, and Tx budget check
- change obsolete codes
- replace ndev->{base_addr,irq} with the members of ave_private
- rename goto labels and mask definitions, and remove unused codes
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28 | net: ethernet: socionext: add AVE ethernet driver | Kunihiko Hayashi | 5 | -0/+1765
The UniPhier platform from Socionext provides the AVE ethernet controller that includes MAC and MDIO bus supporting RGMII/RMII modes. The controller is named AVE. Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28 | dt-bindings: net: add DT bindings for Socionext UniPhier AVE | Kunihiko Hayashi | 1 | -0/+48
DT bindings for the AVE ethernet controller found on Socionext's UniPhier platforms. Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org> Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28 | cxgb4/cxgb4vf: support for XLAUI Port Type | Ganesh Goudar | 4 | -2/+20
Add support for new Backplane XLAUI port type. Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-28 | cxgb4: display VNI correctly | Ganesh Goudar | 1 | -3/+5
Fix incorrect VNI display in mps_tcam Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>