Age  Commit message  Author  Files  Lines
2016-01-12  net/mlx5_core: Set priority attributes  Maor Gottlieb  2  -19/+55
Each priority has two attributes:
1. max_ft - maximum allowed flow tables under this priority.
2. start_level - start level range of the flow tables in the priority.
These attributes are set by traversing the tree nodes with a DFS that assigns the start level and max flow tables of each priority. The start level depends on the max flow tables of the prior priorities in the tree. The leaves of the tree have max_ft set in them. Each node accumulates the max_ft of its children and sets its own max_ft accordingly.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
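The DFS described in this commit can be sketched in userspace C as follows. This is a minimal illustration and not the driver's code: struct prio_node and set_prio_attrs() are hypothetical names, and the children are assumed to be stored in priority order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tree node; not the driver's actual structure. */
struct prio_node {
    unsigned int max_ft;       /* leaves: preset; inner nodes: sum of children */
    unsigned int start_level;  /* first flow-table level available to this node */
    struct prio_node **children;
    size_t num_children;
};

/*
 * Depth-first walk: record the start level of the node, then give each
 * child a start level that follows the max_ft of its earlier siblings.
 * Returns the max_ft of the node.
 */
static unsigned int set_prio_attrs(struct prio_node *node, unsigned int start_level)
{
    unsigned int acc = 0;
    size_t i;

    assert(node != NULL);
    node->start_level = start_level;
    if (node->num_children == 0)
        return node->max_ft;   /* a leaf keeps its preset max_ft */

    for (i = 0; i < node->num_children; i++)
        acc += set_prio_attrs(node->children[i], start_level + acc);

    node->max_ft = acc;        /* an inner node accumulates its children */
    return acc;
}
```

With three leaves whose max_ft values are 4, 2 and 3, the leaves start at levels 0, 4 and 6, and the parent ends up with max_ft 9.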
2016-01-12  net/mlx5_core: Connect flow tables  Maor Gottlieb  3  -10/+104
Flow tables from different priorities should be chained together. When a packet arrives we search for a match in the by-pass flow tables (first we search for a match in priority 0 and if we don't find a match we move to the next priority). If we can't find a match in any of the bypass flow-tables, we continue searching in the flow-tables of the next priority, which are the kernel's flow tables. Setting the miss flow table in a new flow table to be the next one in the list is performed via create flow table API. If we want to change an existing flow table, for example in order to point from an existing flow table to the new next-in-list flow table, we use the modify flow table API. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
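The chaining described above can be pictured with a small userspace model. This is only a sketch under stated assumptions: struct flow_table, connect_tables() and chain_lookup() are hypothetical names, match criteria are reduced to integer keys, and connect_tables() merely stands in for the create/modify flow table commands.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flow table: a set of keys plus a pointer used on a miss. */
struct flow_table {
    const int *keys;
    size_t num_keys;
    struct flow_table *next;   /* table searched on a miss, or NULL */
};

/* Point the miss path of prev at next (stands in for create/modify). */
static void connect_tables(struct flow_table *prev, struct flow_table *next)
{
    assert(prev != NULL);
    prev->next = next;
}

/* Walk the chain from ft and return the first table holding key. */
static const struct flow_table *chain_lookup(const struct flow_table *ft, int key)
{
    size_t i;

    for (; ft != NULL; ft = ft->next)
        for (i = 0; i < ft->num_keys; i++)
            if (ft->keys[i] == key)
                return ft;
    return NULL;               /* missed in every table of the chain */
}
```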
2016-01-12  net/mlx5_core: Introduce modify flow table command  Maor Gottlieb  3  -4/+83
Introduce the modify flow table command. This command is used when we want to change the next flow table of an existing flow table. The next flow table is defined as the table we search (in order to find a match), if we couldn't find a match in any of the flow table entries in the current flow table. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-12  net/mlx5_core: Managing root flow table  Maor Gottlieb  5  -10/+144
The root Flow Table for each Flow Table Type is defined, by default, as the Flow Table with level 0. In order not to use empty flow tables and introduce new hops, but still preserve space for flow tables that have a greater priority (lower number) than the current flow table, we introduce this new set root flow table command. This command tells the HW to start matching packets from the assigned root flow table. This command is used when we create a new flow table with a level lower than the current lowest flow table, or when it is the first flow table. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
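The rule for when the root changes can be written down as a short check. This is a sketch only: struct ft, struct root_ns and update_root_ft() are hypothetical names, and the place where the real driver would issue the set root command is marked with a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ft { unsigned int level; };              /* hypothetical flow table */
struct root_ns { const struct ft *root_ft; };   /* hypothetical namespace state */

/* Returns true when new_ft has to become the root flow table. */
static bool update_root_ft(struct root_ns *ns, const struct ft *new_ft)
{
    assert(ns != NULL && new_ft != NULL);

    /* Keep the current root unless the new table sits at a lower level. */
    if (ns->root_ft != NULL && new_ft->level >= ns->root_ft->level)
        return false;

    ns->root_ft = new_ft;   /* a driver would send the set root command here */
    return true;
}
```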
2016-01-12  net/mlx5_core: Add utilities to find next and prev flow-tables  Maor Gottlieb  1  -0/+67
Add two utility functions for finding the next and previous flow tables. The find-next function takes a priority and returns the first flow table of the next priority in the tree. The find-prev function returns the last flow table of the previous priority in the tree. These utility functions are used for chaining flow tables from different priorities. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
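A flattened model of the two lookups is sketched below. It assumes the priorities are laid out in an array rather than a tree, and it assumes that priorities without flow tables are skipped; the commit message does not say either. struct prio, find_next_ft() and find_prev_ft() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical priority: flow table ids kept in order of creation. */
struct prio {
    const int *fts;
    size_t num_fts;
};

/* First flow table of the next non-empty priority after cur, or NULL. */
static const int *find_next_ft(const struct prio *prios, size_t n, size_t cur)
{
    size_t i;

    assert(cur < n);
    for (i = cur + 1; i < n; i++)
        if (prios[i].num_fts > 0)
            return &prios[i].fts[0];
    return NULL;
}

/* Last flow table of the closest non-empty priority before cur, or NULL. */
static const int *find_prev_ft(const struct prio *prios, size_t n, size_t cur)
{
    size_t i;

    assert(cur < n);
    for (i = cur; i-- > 0; )
        if (prios[i].num_fts > 0)
            return &prios[i].fts[prios[i].num_fts - 1];
    return NULL;
}
```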
2016-01-12  net/mlx5_core: Introduce flow steering autogrouped flow table  Maor Gottlieb  3  -19/+158
When a user adds a rule to an autogrouped flow table, we search for a flow group with the same match criteria. If we don't find such a group, we create a new flow group with the required match criteria and insert the rule into this group. We divide the flow table into required_groups + 1 parts, in order to reserve a part of the flow table for rules which don't match any existing group. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
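The search-or-create step and the required_groups + 1 split can be sketched as below. This is a simplified userspace model: struct autogroup_ft, group_size() and find_or_create_group() are hypothetical names, match criteria are reduced to a bitmask, and equal-sized parts are an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GROUPS 8   /* arbitrary limit for this sketch */

struct autogroup_ft {
    unsigned int max_fte;               /* table size in entries */
    unsigned int required_groups;       /* groups requested at creation */
    unsigned int num_groups;            /* groups created so far */
    unsigned int criteria[MAX_GROUPS];  /* match criteria, modelled as a bitmask */
};

/* Entries per part when the table is split into required_groups + 1 parts. */
static unsigned int group_size(const struct autogroup_ft *ft)
{
    assert(ft != NULL);
    return ft->max_fte / (ft->required_groups + 1);
}

/* Index of the group with these criteria, creating it if needed; -1 if full. */
static int find_or_create_group(struct autogroup_ft *ft, unsigned int criteria)
{
    unsigned int i;

    assert(ft != NULL);
    for (i = 0; i < ft->num_groups; i++)
        if (ft->criteria[i] == criteria)
            return (int)i;

    if (ft->num_groups == MAX_GROUPS)
        return -1;

    ft->criteria[ft->num_groups] = criteria;
    return (int)ft->num_groups++;
}
```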
2016-01-12  Merge branch 'bpf-next'  David S. Miller  2  -22/+110
Daniel Borkmann says: ==================== BPF update This set adds IPv6 support for bpf_skb_{set,get}_tunnel_key() helper. It also exports flags to user space that are being used in helpers and weren't exported thus far. For more details, please see the individual patches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-12  bpf: support ipv6 for bpf_skb_{set,get}_tunnel_key  Daniel Borkmann  2  -8/+71
After IPv6 support has recently been added to metadata dst and related encaps, add support for populating/reading it from an eBPF program. Commit d3aa45ce6b ("bpf: add helpers to access tunnel metadata") started with initial IPv4-only support back then (due to IPv6 metadata support not being available yet). To stay compatible with older programs, we need to test for the passed structure size. Also TOS and TTL support from the ip_tunnel_info key has been added. Tested with vxlan devs in collect meta data mode with IPv4, IPv6 and in compat mode over different network namespaces. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
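The idea of testing the passed structure size to stay compatible with older programs can be sketched as follows. The layout here is hypothetical and is not the UAPI struct bpf_tunnel_key; it only assumes that new fields were appended after an IPv4-only prefix. get_tunnel_key() and TUNNEL_KEY_OLD_SIZE are illustrative names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key layout: old prefix first, newer fields appended. */
struct tunnel_key {
    uint32_t tunnel_id;
    uint32_t remote_ipv4;
    /* everything below is unknown to old callers */
    uint32_t remote_ipv6[4];
    uint8_t tos;
    uint8_t ttl;
};

#define TUNNEL_KEY_OLD_SIZE offsetof(struct tunnel_key, remote_ipv6)

/* Accept only the two known sizes; old callers get the prefix they know. */
static int get_tunnel_key(const struct tunnel_key *src, void *to, size_t size)
{
    assert(src != NULL && to != NULL);
    if (size != sizeof(*src) && size != TUNNEL_KEY_OLD_SIZE)
        return -EINVAL;
    memcpy(to, src, size);
    return 0;
}
```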
2016-01-12  bpf: export helper function flags and reject invalid ones  Daniel Borkmann  2  -14/+39
Export flags used by eBPF helper functions through UAPI, so they can be used by programs (instead of them redefining all flags each time or just using the hard-coded values). It also gives a better overview what flags are used where and we can further get rid of the extra macros defined in filter.c. Moreover, reject invalid flags. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
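Rejecting invalid flags usually comes down to masking against the set of known bits. The sketch below shows that pattern with made-up flag names (EX_F_*) that are not the exported UAPI definitions; helper_check_flags() is likewise hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag names for illustration; not the UAPI definitions. */
enum {
    EX_F_RECOMPUTE_CSUM = 1 << 0,
    EX_F_INGRESS        = 1 << 1,
};

#define EX_F_ALL (EX_F_RECOMPUTE_CSUM | EX_F_INGRESS)

/* Return -EINVAL if any bit outside the known set is passed. */
static int helper_check_flags(uint64_t flags)
{
    if (flags & ~(uint64_t)EX_F_ALL)
        return -EINVAL;
    return 0;
}
```

Sharing the flag definitions through one header lets a program pass named flags instead of hard-coded values, and the mask keeps unknown bits free for later use.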
2016-01-12  bonding: make mii_status sysfs node consistent  Jarod Wilson  1  -1/+1
The spew in /proc/net/bonding/bond0 uses netif_carrier_ok() to determine mii_status, while /sys/class/net/bond0/bonding/mii_status looks at curr_active_slave, which doesn't actually seem to be set sometimes when the bond actually is up. A mode 4 bond configured via ifcfg-foo files on a Red Hat Enterprise Linux system, after boot, comes up clean and functional, but the sysfs node shows mii_status of down, while proc shows up. A simple enough fix here seems to be to use the same method for determining up or down in both places, and I'd opt for the one that seems to match reality. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <gospo@cumulusnetworks.com> CC: netdev@vger.kernel.org Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  net/rtnetlink: remove unused sz_idx variable  Alexander Kuleshov  1  -2/+1
The sz_idx variable is defined in the rtnetlink_rcv_msg(), but not used anywhere. Let's remove it. Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  net: bfin_mac: Use phy_find_first() instead of open-coding it  Guenter Roeck  1  -15/+2
Use phy_find_first() to find the first phy device instead of open-coding it. Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  Merge branch 'ovs-cleanups'  David S. Miller  2  -24/+5
Jean Sacren says:
====================
Trivial fix-ups for openvswitch

This series does trivial fix-ups for openvswitch as follows:
1) Clean up the leftover of the unused function.
2) Fix up the twisted struct geneve_port member name.
3) Update the kernel doc to reflect the changes in struct vport.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  openvswitch: update kernel doc for struct vport  Jean Sacren  1  -1/+2
commit be4ace6e6b1b ("openvswitch: Move dev pointer into vport itself") The commit above added @dev and moved @rcu to the bottom of struct vport, but the change was not reflected in the kernel doc. So let's update the kernel doc as well. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Cc: Thomas Graf <tgraf@suug.ch> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  openvswitch: fix struct geneve_port member name  Jean Sacren  1  -3/+3
commit 6b001e682e90 ("openvswitch: Use Geneve device.") The commit above introduced 'port_no' as the name for the member of struct geneve_port. The correct name should be 'dst_port' as described in the kernel doc. Let's fix that member name and all the pertinent instances so that both doc and code would be consistent. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  openvswitch: clean up unused function  Jean Sacren  1  -20/+0
commit 6b001e682e90 ("openvswitch: Use Geneve device.") The commit above deleted the only call site of ovs_tunnel_route_lookup() and now that function is not used any more. So let's delete the function definition as well. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  net: ti: cpmac: Fix build error due to missed API change  Guenter Roeck  1  -1/+1
Commit 7f854420fbfe ("phy: Add API for {un}registering an mdio device to a bus") introduces an API to access mii_bus structures, but the TI cpmac driver was not updated. This results in the following error message.
drivers/net/ethernet/ti/cpmac.c: In function 'cpmac_probe':
drivers/net/ethernet/ti/cpmac.c:1119:18: error: 'struct mii_bus' has no member named 'phy_map'
Fixes: 7f854420fbfe ("phy: Add API for {un}registering an mdio device to a bus") Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  net: tc35815: Drop unused variable  Guenter Roeck  1  -1/+0
Commit e7f4dc3536a4 ("mdio: Move allocation of interrupts into core") removes some code from tc_mii_init(), but does not remove a now unused variable. This results in the following build warning. drivers/net/ethernet/toshiba/tc35815.c: In function 'tc_mii_init': drivers/net/ethernet/toshiba/tc35815.c:670:6: warning: unused variable 'i' Fixes: e7f4dc3536a4 ("mdio: Move allocation of interrupts into core") Cc: Andrew Lunn <andrew@lunn.ch> Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  net: tc35815: Fix build error due to missed API change  Guenter Roeck  1  -15/+2
Commit 7f854420fbfe ("phy: Add API for {un}registering an mdio device to a bus") introduces an API to access mii_bus structures, but missed to update the tc35815 driver. This results in the following error message. drivers/net/ethernet/toshiba/tc35815.c: In function 'tc_mii_probe': drivers/net/ethernet/toshiba/tc35815.c:617:18: error: 'struct mii_bus' has no member named 'phy_map' drivers/net/ethernet/toshiba/tc35815.c:623:24: error: 'struct mii_bus' has no member named 'phy_map' Instead of looping over the list of phy addresses to find a phy chip, use phy_find_first(). While the intent of the original code was to return an error if more than one phy was specified, this code path was never executed because the loop aborted after finding the first phy. The original code is therefore semantically identical to phy_find_first(), thus it is simpler and more straightforward to use phy_find_first() directly. Fixes: 7f854420fbfe ("phy: Add API for {un}registering an mdio device to a bus") Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
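The open-coded pattern that these drivers drop, scanning the bus addresses and stopping at the first PHY found, looks roughly like the userspace model below. It is not the kernel's phy_find_first(): struct ex_phy, struct ex_bus, EX_PHY_MAX_ADDR and ex_find_first() are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

#define EX_PHY_MAX_ADDR 32   /* number of bus addresses assumed for the sketch */

struct ex_phy { int addr; };
struct ex_bus { struct ex_phy *phys[EX_PHY_MAX_ADDR]; };

/* Return the PHY at the lowest populated address, or NULL if there is none. */
static struct ex_phy *ex_find_first(const struct ex_bus *bus)
{
    int addr;

    assert(bus != NULL);
    for (addr = 0; addr < EX_PHY_MAX_ADDR; addr++)
        if (bus->phys[addr] != NULL)
            return bus->phys[addr];
    return NULL;
}
```

Because the scan returns on the first hit, a later check for a second PHY can never trigger, which matches the observation in the commit message above.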
2016-01-11  net: phy: Add support for SMSC LAN8740 PHY  Joshua Henderson  1  -0/+22
LAN8740 has a different phy_id than LAN8710/LAN8720. Signed-off-by: Joshua Henderson <joshua.henderson@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  Merge tag 'wireless-drivers-next-for-davem-2016-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next  David S. Miller  64  -627/+1587
Kalle Valo says:
====================
brcmfmac
* query features through firmware command
* ARP offload through inet notifier
* force probe to succeed for debugging purposes
* random mac support for scheduled scan
* support wowl upon net detect

iwlwifi
* bug fixes and improvements for firmware debug system
* advertise support for Rx A-MSDU in A-MPDU
* support -20.ucode
* fix WoWLAN for iwldvm
* preparations towards multiple Rx queues
* platform power improvements for GO mode when no clients are associated
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  net: add scheduling point in recvmmsg/sendmmsg  Eric Dumazet  1  -0/+2
Applications often have to reduce the number of datagrams they receive or send per system call to avoid starvation problems. Really the kernel should take care of this by using cond_resched(), so that applications can experiment with bigger batch sizes. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  ipv6: always add flag an address that failed DAD with DADFAILED  Lubomir Rintel  1  -2/+3
Userspace needs to know why the address is being removed so that it can perhaps obtain a new address. Without the DADFAILED flag it's impossible to distinguish removal of a temporary and tentative address due to DAD failure from other reasons (device removed, manual address removal). Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  net: lpc_eth: Remove unused variables  Fabio Estevam  1  -2/+1
Commit e7f4dc3536a400 ("mdio: Move allocation of interrupts into core") introduced the following build warnings: drivers/net/ethernet/nxp/lpc_eth.c: In function 'lpc_mii_init': drivers/net/ethernet/nxp/lpc_eth.c:865:1: warning: label 'err_out_1' defined but not used [-Wunused-label] drivers/net/ethernet/nxp/lpc_eth.c:826:20: warning: unused variable 'i' [-Wunused-variable] Remove the unused variables to fix them. Reported-by: Olof's autobuilder <build@lixom.net> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  bfin_mac: fix error path  Sudip Mukherjee  1  -1/+1
While building blackfin defconfig we were getting a build warning: warning: label 'out_err_irq_alloc' defined but not used. Commit e7f4dc3536a4 ("mdio: Move allocation of interrupts into core") removed the label out_err_mdiobus_register but then mistakenly jumped to out_err_alloc. But it was actually supposed to jump to out_err_irq_alloc. Fixes: e7f4dc3536a4 ("mdio: Move allocation of interrupts into core") Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  phy: fix blackfin build failure  Sudip Mukherjee  1  -1/+2
The build of blackfin defconfig is failing with the error: error: 'struct mii_bus' has no member named 'phy_map' A new API mdiobus_get_phy() was introduced and phy_map was removed but it was not changed here. Fixes: 7f854420fbfe ("phy: Add API for {un}registering an mdio device to a bus.") Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  cxgb4: Fixes static checker warning in mps_tcam_show()  Hariprasad Shenai  1  -2/+2
The commit 115b56af88b5 ("cxgb4: Update mps_tcam output to include T6 fields") from Dec 23, 2015, leads to the following static checker warning: drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c:1735 mps_tcam_show() warn: we tested 'lookup_type' before and it was 'true' Fixing it. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  Merge branch 'emac-RK3036'  David S. Miller  2  -31/+61
Xing Zheng says:
====================
Add support emac for the RK3036 SoC platform

We already support the emac on the RK3066/RK3188, but the RK3036 has a somewhat different configuration. The emac_rockchip driver should be made compatible with other Rockchip SoCs.

Changes in v2:
- Separate DTS from patch series.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  net: ethernet: arc: Add support emac for RK3036  Xing Zheng  2  -3/+9
The RK3036's GRF offsets differ from those of the RK3066/RK3188, and the MAC TX/RX clock needs to be set before probing the emac. Signed-off-by: Xing Zheng <zhengxing@rock-chips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  net: ethernet: arc: Keep emac compatibility for more Rockchip SoCs  Xing Zheng  1  -24/+45
On the RK3066/RK3188, the GRF offset used to configure the emac and the DIV2 MAC TX/RX clock were both fixed. These need to be easy to set so that the driver fits other SoCs (such as the RK3036), which may have a different GRF offset and need the MAC TX/RX clock adjusted. Signed-off-by: Xing Zheng <zhengxing@rock-chips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  net: ethernet: arc: Probe emac after set RMII clock  Xing Zheng  1  -4/+7
Once arc_emac_probe is entered, the emac calls get_phy_id, phy_poll_reset and other routines that talk to the PHY via mdiobus_read, so we need to set the correct ref clock rate for the emac before probing it. Signed-off-by: Xing Zheng <zhengxing@rock-chips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  Merge branch 'bnxt_en-zeropad-fw-and-reset'  David S. Miller  2  -4/+45
Michael Chan says: ==================== bnxt_en: Zero pad fw messages and add fw reset. 2 patches related to firmware for net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  bnxt_en: Reset embedded processor after applying firmware upgrade  Rob Swindell  1  -4/+42
Use HWRM_FW_RESET command to request a self-reset of the embedded processor(s) after successfully applying a firmware update. For boot processor, the self-reset is currently deferred until the next PCIe reset. Signed-off-by: Rob Swindell <swindell@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  bnxt_en: Zero pad firmware messages to 128 bytes.  Michael Chan  1  -0/+3
For future compatibility, zero pad all messages that the driver sends to the firmware to 128 bytes. If these messages are extended in the future with new byte enables, zero padding these messages now will guarantee future compatibility. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
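Zero padding a short message up to a fixed length can be sketched as below. Only the 128-byte figure comes from the commit message; fw_msg_pad() and FW_MSG_MIN_LEN are hypothetical names, and the real driver may pad in place rather than copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FW_MSG_MIN_LEN 128   /* padded length named in the commit message */

/* Copy msg into out, zero-fill up to 128 bytes, return the length to send. */
static size_t fw_msg_pad(unsigned char *out, size_t out_len,
                         const void *msg, size_t msg_len)
{
    size_t send_len = msg_len < FW_MSG_MIN_LEN ? FW_MSG_MIN_LEN : msg_len;

    assert(out != NULL && msg != NULL && out_len >= send_len);
    memcpy(out, msg, msg_len);
    if (msg_len < send_len)
        memset(out + msg_len, 0, send_len - msg_len);
    return send_len;
}
```

If a later firmware version reads fields beyond the current message length, it then sees zeros instead of stale buffer contents.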
2016-01-11  net, sched: add clsact qdisc  Daniel Borkmann  8  -16/+186
This work adds a generalization of the ingress qdisc as a qdisc holding only classifiers. The clsact qdisc works on ingress, but also on egress. In both cases, it's execution happens without taking the qdisc lock, and the main difference for the egress part compared to prior version of [1] is that this can be applied with _any_ underlying real egress qdisc (also classless ones). Besides solving the use-case of [1], that is, allowing for more programmability on assigning skb->priority for the mqprio case that is supported by most popular 10G+ NICs, it also opens up a lot more flexibility for other tc applications. The main work on classification can already be done at clsact egress time if the use-case allows and state stored for later retrieval f.e. again in skb->priority with major/minors (which is checked by most classful qdiscs before consulting tc_classify()) and/or in other skb fields like skb->tc_index for some light-weight post-processing to get to the eventual classid in case of a classful qdisc. Another use case is that the clsact egress part allows to have a central egress counterpart to the ingress classifiers, so that classifiers can easily share state (e.g. in cls_bpf via eBPF maps) for ingress and egress. Currently, default setups like mq + pfifo_fast would require for this to use, for example, prio qdisc instead (to get a tc_classify() run) and to duplicate the egress classifier for each queue. With clsact, it allows for leaving the setup as is, it can additionally assign skb->priority to put the skb in one of pfifo_fast's bands and it can share state with maps. Moreover, we can access the skb's dst entry (f.e. to retrieve tclassid) w/o the need to perform a skb_dst_force() to hold on to it any longer. In lwt case, we can also use this facility to setup dst metadata via cls_bpf (bpf_skb_set_tunnel_key()) without needing a real egress qdisc just for that (case of IFF_NO_QUEUE devices, for example). 
The realization can be done without any changes to the scheduler core framework. All it takes is that we have two a-priori defined minors/child classes, where we can mux between ingress and egress classifier list (dev->ingress_cl_list and dev->egress_cl_list, latter stored close to dev->_tx to avoid extra cacheline miss for moderate loads). The egress part is a bit similar modelled to handle_ing() and patched to a noop in case the functionality is not used. Both handlers are now called sch_handle_ingress() and sch_handle_egress(), code sharing among the two doesn't seem practical as there are various minor differences in both paths, so that making them conditional in a single handler would rather slow things down. Full compatibility to ingress qdisc is provided as well. Since both piggyback on TC_H_CLSACT, only one of them (ingress/clsact) can exist per netdevice, and thus ingress qdisc specific behaviour can be retained for user space. This means, either a user does 'tc qdisc add dev foo ingress' and configures ingress qdisc as usual, or the 'tc qdisc add dev foo clsact' alternative, where both, ingress and egress classifier can be configured as in the below example. ingress qdisc supports attaching classifier to any minor number whereas clsact has two fixed minors for muxing between the lists, therefore to not break user space setups, they are better done as two separate qdiscs. I decided to extend the sch_ingress module with clsact functionality so that commonly used code can be reused, the module is being aliased with sch_clsact so that it can be auto-loaded properly. Alternative would have been to add a flag when initializing ingress to alter its behaviour plus aliasing to a different name (as it's more than just ingress). However, the first would end up, based on the flag, choosing the new/old behaviour by calling different function implementations to handle each anyway, the latter would require to register ingress qdisc once again under different alias. 
So, this really begs to provide a minimal, cleaner approach to have Qdisc_ops and Qdisc_class_ops by its own that share callbacks used by both.

Example, adding qdisc:

   # tc qdisc add dev foo clsact
   # tc qdisc show dev foo
   qdisc mq 0: root
   qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
   qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
   qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
   qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
   qdisc clsact ffff: parent ffff:fff1

Adding filters (deleting, etc works analogous by specifying ingress/egress):

   # tc filter add dev foo ingress bpf da obj bar.o sec ingress
   # tc filter add dev foo egress bpf da obj bar.o sec egress
   # tc filter show dev foo ingress
   filter protocol all pref 49152 bpf
   filter protocol all pref 49152 bpf handle 0x1 bar.o:[ingress] direct-action
   # tc filter show dev foo egress
   filter protocol all pref 49152 bpf
   filter protocol all pref 49152 bpf handle 0x1 bar.o:[egress] direct-action

A 'tc filter show dev foo' or 'tc filter show dev foo parent ffff:' will show an empty list for clsact. Either using the parent names (ingress/egress) or specifying the full major/minor will then show the related filter lists.

Prior work on a mqprio prequeue() facility [1] was done mainly by John Fastabend.

[1] http://patchwork.ozlabs.org/patch/512949/

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  ethernet: amd: au1000: Remove pointless warning  Andrew Lunn  1  -3/+0
The warning about being able to read any MDIO device, not just the attached ethernet devices PHY applies to all MDIO drivers. So remove it. This also removes a reference to a member in phy_device which has moved. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  staging: netlogic: Fix build error due to missed API change  Andrew Lunn  1  -16/+13
Fix a number of build errors due to moving the phy_map and centralizing interrupt allocation. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  net: ethernet: faraday: Use phy_find_first() instead of open coding it  Guenter Roeck  1  -13/+2
Use phy_find_first() to find the first phy device instead of open coding it. Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  net: ethernet: broadcom: Fix build errors  Guenter Roeck  1  -7/+2
Commit 7f854420fbfe ("phy: Add API for {un}registering an mdio device to a bus") introduces an API to access mii_bus structures, but missed to update the sb1250 driver. This results in the following build error. drivers/net/ethernet/broadcom/sb1250-mac.c: In function 'sbmac_mii_probe': drivers/net/ethernet/broadcom/sb1250-mac.c:2360:24: error: 'struct mii_bus' has no member named 'phy_map' Use phy_find_first() instead of open coding it. Commit 2220943a21e2 ("phy: Centralise print about attached phy") introduces the following build error. drivers/net/ethernet/broadcom/sb1250-mac.c: In function 'sbmac_mii_probe': drivers/net/ethernet/broadcom/sb1250-mac.c:2383:20: error: 'phydev' undeclared Fixes: 7f854420fbfe ("phy: Add API for {un}registering an mdio device to a bus") Fixes: 2220943a21e2 ("phy: Centralise print about attached phy") Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  Merge branch 'mdio-device-fixes'  David S. Miller  2  -11/+5
Andrew Lunn says: ==================== Fix breakage from mdio device These two patches fix MIPS platforms which got broken by the recent mdio device patchset. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  net: ethernet-rgmii.c: Fix breakage from moving phdev bus  Andrew Lunn  1  -3/+3
The mdio device patches moved the bus member in phy_device into a substructure. This driver got missed. Fix it. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  net: lantiq_etop.c: Use helper to find first phy  Andrew Lunn  1  -8/+2
Make use of the helper to find the first phy device. This also fixes the compile breakage. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  stmmac: Don't exit mdio registration when mdio subnode is not found in the DTS  Romain Perier  1  -3/+5
Originally, most of the platforms using this driver did not define an mdio subnode in the devicetree. Commit e34d65 ("stmmac: create of compatible mdio bus for stmmac driver") introduced a backward compatibility issue by using of_mdiobus_register explicitly with an mdio subnode. This patch fixes the issue by calling the function mdiobus_register when the mdio subnode is not found. The driver is now compatible with both modes. Fixes: e34d65696d2e ("stmmac: create of compatible mdio bus for stmmac driver") Signed-off-by: Romain Perier <romain.perier@gmail.com> Tested-by: Phil Reid <preid@electromag.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  Merge branch 'bpf-next'  David S. Miller  4  -9/+40
Daniel Borkmann says:
====================
BPF update

Fixes a csum issue on ingress. As mentioned previously, net-next seems just fine imho. Later on, will follow up with couple of replacements like ovs_skb_postpush_rcsum() etc.

Thanks!

v1 -> v2:
- Added patch 1 with helper
- Implemented Hannes' idea to just use csum_partial, thanks!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  bpf: add skb_postpush_rcsum and fix dev_forward_skb occasions  Daniel Borkmann  2  -4/+30
Add a small helper skb_postpush_rcsum() and fix up redirect locations that need CHECKSUM_COMPLETE fixups on ingress. dev_forward_skb() expects a proper csum that covers also Ethernet header, f.e. since 2c26d34bbcc0 ("net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding"), we also do skb_postpull_rcsum() after pulling Ethernet header off via eth_type_trans().

When using eBPF in a netns setup f.e. with vxlan in collect metadata mode, I can trigger the following csum issue with an IPv6 setup:

   [ 505.144065] dummy1: hw csum failure
   [...]
   [ 505.144108] Call Trace:
   [ 505.144112] <IRQ> [<ffffffff81372f08>] dump_stack+0x44/0x5c
   [ 505.144134] [<ffffffff81607cea>] netdev_rx_csum_fault+0x3a/0x40
   [ 505.144142] [<ffffffff815fee3f>] __skb_checksum_complete+0xcf/0xe0
   [ 505.144149] [<ffffffff816f0902>] nf_ip6_checksum+0xb2/0x120
   [ 505.144161] [<ffffffffa08c0e0e>] icmpv6_error+0x17e/0x328 [nf_conntrack_ipv6]
   [ 505.144170] [<ffffffffa0898eca>] ? ip6t_do_table+0x2fa/0x645 [ip6_tables]
   [ 505.144177] [<ffffffffa08c0725>] ? ipv6_get_l4proto+0x65/0xd0 [nf_conntrack_ipv6]
   [ 505.144189] [<ffffffffa06c9a12>] nf_conntrack_in+0xc2/0x5a0 [nf_conntrack]
   [ 505.144196] [<ffffffffa08c039c>] ipv6_conntrack_in+0x1c/0x20 [nf_conntrack_ipv6]
   [ 505.144204] [<ffffffff8164385d>] nf_iterate+0x5d/0x70
   [ 505.144210] [<ffffffff816438d6>] nf_hook_slow+0x66/0xc0
   [ 505.144218] [<ffffffff816bd302>] ipv6_rcv+0x3f2/0x4f0
   [ 505.144225] [<ffffffff816bca40>] ? ip6_make_skb+0x1b0/0x1b0
   [ 505.144232] [<ffffffff8160b77b>] __netif_receive_skb_core+0x36b/0x9a0
   [ 505.144239] [<ffffffff8160bdc8>] ? __netif_receive_skb+0x18/0x60
   [ 505.144245] [<ffffffff8160bdc8>] __netif_receive_skb+0x18/0x60
   [ 505.144252] [<ffffffff8160ccff>] process_backlog+0x9f/0x140
   [ 505.144259] [<ffffffff8160c4a5>] net_rx_action+0x145/0x320
   [...]

What happens is that on ingress, we push Ethernet header back in, either from cls_bpf or right before skb_do_redirect(), but without updating csum.
The "hw csum failure" can be fixed by using the new skb_postpush_rcsum() helper for the dev_forward_skb() case to correct the csum diff again. Thanks to Hannes Frederic Sowa for the csum_partial() idea! Fixes: 3896d655f4d4 ("bpf: introduce bpf_clone_redirect() helper") Fixes: 27b29f63058d ("bpf: add bpf_redirect() helper") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
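Why the checksum has to be corrected after a push can be seen in a small userspace model of the 16-bit ones' complement sum. This is not the kernel's skb_postpush_rcsum() or csum_partial(): ex_csum() and ex_postpush_csum() are hypothetical names, and the sketch assumes even lengths to keep the folding simple.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit ones' complement sum over len bytes, starting from seed. */
static uint16_t ex_csum(const uint8_t *data, size_t len, uint16_t seed)
{
    uint32_t sum = seed;
    size_t i;

    assert(len % 2 == 0);   /* the sketch only handles even lengths */
    for (i = 0; i < len; i += 2)
        sum += (uint32_t)(data[i] << 8 | data[i + 1]);
    while (sum >> 16)       /* fold the carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* After pushing hdr_len bytes in front, fold them into the running sum. */
static uint16_t ex_postpush_csum(uint16_t payload_csum,
                                 const uint8_t *hdr, size_t hdr_len)
{
    return ex_csum(hdr, hdr_len, payload_csum);
}
```

Folding the pushed header bytes into the payload sum gives the same value as summing the whole frame, which is the property a checksum covering the Ethernet header relies on.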
2016-01-11  net, sched: add skb_at_tc_ingress helper  Daniel Borkmann  2  -5/+10
Add a skb_at_tc_ingress() as this will be needed elsewhere as well and can hide the ugly ifdef. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  Merge branch 'tcp-keepalive-namespaceify'  David S. Miller  5  -30/+38
Nikolay Borisov says: ==================== Namespaceify tcp keepalive machinery The following patch series enables the tcp keepalive mechanism to be configured per net namespace. This is especially useful if you have multiple containers hosted on one node and one of them is under DoS. In such situations, one thing which could be done is to configure the tcp keepalive settings such that connections for that particular container are reset faster. Another scenario where not being able to control those knobs per container is problematic occurs when the value of net.netfilter.nf_conntrack_tcp_timeout_established is set below the keepalive interval. In such situations the server won't send an RST packet, resulting in applications not trying to reconnect and stale connections waiting. Changing the global keepalive value is a possible solution but it might interfere with other containers. The three patches gradually convert each of the affected knobs to be per netns. I thought it would be easier for review than putting everything in one patch. If people deem it more appropriate to squash everything into one patch (maybe after review) I'd be more than happy to do it. The patches have been compile-tested on 4.4 and functionally tested on 3.12 and they work as expected. These are based off 4.4-rc8 ==================== Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  ipv4: Namespecify the tcp_keepalive_intvl sysctl knob  Nikolay Borisov  5  -10/+12
This is the final part required to namespaceify the tcp keep alive mechanism. Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  ipv4: Namespecify tcp_keepalive_probes sysctl knob  Nikolay Borisov  5  -10/+12
This is required to have full tcp keepalive mechanism namespace support. Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-11  ipv4: Namespaceify tcp_keepalive_time sysctl knob  Nikolay Borisov  5  -10/+14
Different net namespaces might have different requirements as to the keepalive time of tcp sockets. This might be required in cases where different firewall rules are in place which require tcp timeout sockets to be increased/decreased independently of the host. Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: David S. Miller <davem@davemloft.net>
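Moving a knob from a global into per-namespace state can be pictured with the sketch below. The types and names (struct ex_net, struct ex_sock, ex_keepalive_time()) are hypothetical; the only assumption taken from the series is that each net namespace carries its own keepalive time, with a per-socket value taking precedence when one is set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-namespace state replacing a single global sysctl value. */
struct ex_net {
    int tcp_keepalive_time;    /* seconds */
};

struct ex_sock {
    const struct ex_net *net;  /* namespace the socket lives in */
    int keepalive_time;        /* per-socket override, 0 when unset */
};

/* Per-socket value if set, otherwise the default of the socket's namespace. */
static int ex_keepalive_time(const struct ex_sock *sk)
{
    assert(sk != NULL && sk->net != NULL);
    return sk->keepalive_time ? sk->keepalive_time
                              : sk->net->tcp_keepalive_time;
}
```

Two containers can then run with different defaults without either one changing the value seen by the host.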