summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2014-08-28net: dsa: retain a per-port device_node pointerFlorian Fainelli3-0/+4
We will later use the per-port device_node pointer to fetch a bunch of port-specific properties. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-28net: dsa: provide a switch device device tree node pointerFlorian Fainelli2-0/+8
We might need to fetch additional resources from the device tree node pointer, such as register ranges or other properties. Keep a device_node pointer around for this. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-28net: phy: provide stub for fixed_phy_set_link_updateFlorian Fainelli1-8/+9
In preparation for updating the DSA code and avoid using ifdefs there, provide an empty stub for fixed_phy_set_link_update when CONFIG_FIXED_PHY is not selected. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-28net: phy: add generic UniMAC MDIO bus driverFlorian Fainelli4-0/+260
Add a generic UniMAC MDIO bus driver and its Device Tree binding, which can be used by the BCMGENET driver as-is, and the upcoming Starfighter 2 Ethernet switch MDIO bus controller. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-28net: dsa: reduce number of protocol hooksFlorian Fainelli10-105/+68
DSA is currently registering one packet_type function per EtherType it needs to intercept in the receive path of a DSA-enabled Ethernet device. Right now we have three of them: trailer, DSA and eDSA, and there might be more in the future, this will not scale to the addition of new protocols. This patch proceeds with adding a new layer of abstraction and two new functions: dsa_switch_rcv() which will dispatch into the tag-protocol specific receive function implemented by net/dsa/tag_*.c dsa_slave_xmit() which will dispatch into the tag-protocol specific transmit function implemented by net/dsa/tag_*.c When we do create the per-port slave network devices, we iterate over the switch protocol to assign the DSA-specific receive and transmit operations. A new fake ethertype value is used: ETH_P_XDSA to illustrate the fact that this is no longer going to look like ETH_P_DSA or ETH_P_TRAILER like it used to be. This allows us to greatly simplify the check in eth_type_trans() and always override the skb->protocol with ETH_P_XDSA for Ethernet switches tagged protocol, while also reducing the number repetitive slave netdevice_ops assignments. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-28sungem: Fix global namespace pollution of phy accessors.David S. Miller1-17/+17
The sungem driver has "phy_read()" and "phy_write()" functions, which we need to rename because the generic phy layer is about to export generic interfaces with the same name. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-28tulip: dmfe: Fix global namespace pollution of phy accessors.David S. Miller1-76/+76
The dmfe driver has "phy_read()" and "phy_write()" functions, which we need to rename because the generic phy layer is about to export generic interfaces with the same name. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-28f_ncm: Don't use netdev_start_xmit().David S. Miller1-1/+9
Unfortunately, the USB gadget layer has this weird things where NULL skbs are passed into ops->ndo_start_xmit() in order to trigger the dev->wrap() calls to build packets. This is completely outside of the allowable range of sane arguments for the ndo_start_xmit method. All invocations of ndo_start_xmit() should be with non-NULL SKB arguments. Put back the direct call, but with a comment explaining how this is not acceptable in the long term. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-28ethernet: arc: Add support for specific SoC layer device tree bindingsRomain Perier5-61/+129
Some platforms have special bank registers which might be used to select the correct clock or the right mode for Media Indepent Interface controllers. Sometimes, it is also required to activate vcc regulators in the right order to supply the ethernet controller at the right time. This patch is an architecture refactoring of the arc-emac device driver. It adds a new software design which allows to add specific platform glue layer. Each platform has now its own module which performs custom initialization and remove for the target and then calls to the core driver. Signed-off-by: Romain Perier <romain.perier@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-28ethernet: arc: mdio changes for future SoC glue layer devtree supportRomain Perier3-6/+5
This is an api changes for the emac_mdio.c module. It will be required later when arc_emac_probe/arc_emac_remove will no longer use 'struct platform_device'. Signed-off-by: Romain Perier <romain.perier@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-28ethernet: arc: remove use of 'struct platform_device'Romain Perier1-31/+33
This is a preparation of an api changes for the emac_main.c module. The involved functions are arc_emac_probe and arc_emac_remove. Signed-off-by: Romain Perier <romain.perier@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-28tcp: syncookies: mark cookie_secret read_mostlyFlorian Westphal2-2/+2
only written once. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-28bnx2x: Fix static checker warning regarding `txdata_ptr'Yuval Mintz1-2/+2
Incorrect checking of array instead of array contents in panic_dump flow - results of commit e261199872a2 ("bnx2x: Safe bnx2x_panic_dump()"). Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-28r8152: replace strncpy with strlcpyhayeswang1-2/+2
Replace the strncpy with strlcpy, and use sizeof to determine the length. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-28net: Update sk_buff flag bit availability comment.David S. Miller1-1/+1
We lost one when xmit_more was added. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26neigh: document gc_thresh2stephen hemminger1-0/+6
Missing documentation for gc_thresh2 sysctl. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26net: bnx2x: fix build error with ptpRandy Dunlap1-0/+1
bnx2x uses ptp functions, so it should select the provider of those functions (PTP_1588_CLOCK). Fixes these build errors: drivers/built-in.o: In function `__bnx2x_remove': /home/jim/linux/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c:13409: undefined reference to `ptp_clock_unregister' drivers/built-in.o: In function `bnx2x_register_phc': /home/jim/linux/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c:13202: undefined reference to `ptp_clock_register' drivers/built-in.o: In function `bnx2x_get_ts_info': /home/jim/linux/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c:3498: undefined reference to `ptp_clock_index' Reported-by: Jim Davis <jim.epost@gmail.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26team: set IFF_TEAM_PORT priv_flag after rx_handler is registeredJiri Pirko1-14/+30
When one tries to add eth as a port into team and that eth is already in use by other rx_handler device (macvlan, bond, bridge, ...) a bug in team_port_add() causes that IFF_TEAM_PORT flag is set before rx_handler is registered. In between, netdev nofifier is called and team_device_event() sees IFF_TEAM_PORT and thinks that rx_handler_data pointer is set to team_port. But it isn't. Fix this by reordering rx_handler register and IFF_TEAM_PORT priv flag set so it is very similar to how bonding does this. Reported-by: Erik Hugne <erik.hugne@ericsson.com> Fixes: 3d249d4ca7 "net: introduce ethernet teaming device" Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26bpf: x86: add missing 'shift by register' instructions to x64 eBPF JITAlexei Starovoitov2-0/+80
'shift by register' operations are supported by eBPF interpreter, but were accidently left out of x64 JIT compiler. Fix it and add a testcase. Reported-by: Brendan Gregg <brendan.d.gregg@gmail.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Fixes: 622582786c9e ("net: filter: x86: internal BPF JIT") Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26Merge branch 'bnx2x-next'David S. Miller5-18/+44
Yuval Mintz says: ==================== bnx2x: `fixes' patch-series This series contains mostly bug fixes, but never the less is intended for `net-next' and not `net', as: - Some of the fixes are quite insignificant [`VF clean statistics', `ethtool -d might cause timeout in log']. - Some only recently were submitted to `net-next' [`Fix timesync endianity']. - Some are not usually compiled as part of the kernel [`Fix stop-on-error']. Dave - please consider applying this series to `net-next'; If you prefer, I can break this series into 2 parts [one for `net' and the other for `net-next'] - but personally I don't see much benefit in it. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26bnx2x: Fix timesync endianityMichal Kalderon1-2/+4
Commit eeed018cbfa30 ("bnx2x: Add timestamping and PTP hardware clock support") has a missing conversion to LE32, which will prevent the feature from working on big endian machines. Signed-off-by: Michal Kalderon <Michal.Kalderon@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26bnx2x: Be more forgiving toward SW GRODmitry Kravkov1-1/+9
This introduces 2 new relaxations in the bnx2x driver regarding GRO: 1. Don't prevent SW GRO if HW GRO is disabled. 2. If all aggregations are disabled, when GRO configuration changes there's no need to perform an inner-reload [since it will have no actual effect]. Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26bnx2x: VF clean statisticsYuval Mintz1-0/+5
During statistics initialization of a VF we need to clean its statistics. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26bnx2x: Fix stop-on-errorYuval Mintz2-6/+21
When STOP_ON_ERROR is set driver will not compile. Even if it did, traffic will not pass without this patch as several fields which are verified by FW/HW on the Tx path are not properly set. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26bnx2x: ethtool -d might cause timeout in logDmitry Kravkov1-9/+5
This changes slightly the set of registers read during `ethtool -d'. Without this change, it's possible the HW will generate a grc Attention which will be logged into system logs as `grc timeout'. Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26net: make skb an optional parameter for__skb_flow_dissect()WANG Cong2-5/+17
Fixes: commit 690e36e726d00d2 (net: Allow raw buffers to be passed into the flow dissector) Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26net: fix comments for __skb_flow_get_ports()WANG Cong1-2/+4
Fixes: commit 690e36e726d00d2 (net: Allow raw buffers to be passed into the flow dissector) Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26ixgbe: support skb->xmit_more in netdev_ops->ndo_start_xmit()Daniel Borkmann1-3/+4
This implements the deferred tail pointer flush API for the ixgbe driver. Similar version also proposed longer time ago by Alexander Duyck. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26net: Remove ndo_xmit_flush netdev operation, use signalling instead.David S. Miller4-56/+19
As reported by Jesper Dangaard Brouer, for high packet rates the overhead of having another indirect call in the TX path is non-trivial. There is the indirect call itself, and then there is all of the reloading of the state to refetch the tail pointer value and then write the device register. Move to a more passive scheme, which requires very light modifications to the device drivers. The signal is a new skb->xmit_more value, if it is non-zero it means that more SKBs are pending to be transmitted on the same queue as the current SKB. And therefore, the driver may elide the tail pointer update. Right now skb->xmit_more is always zero. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26Merge branch 'is_kdump_kernel'David S. Miller4-3/+7
Amir Vadai says: ==================== Make is_kdump_kernel() accessible from modules I'm re-spinning this patchset. At the begining it was suggested to use a different name for the parameter, but at the end [3] the resolution was to leave it as it is in this patch. Drivers need to know if running from kdump kernel in order to change their memory profile - since kdump environment is limited by available memory. Currently there are drivers that are using reset_devices as suggested in [2]. In [2] it was suggested to use reset_devices, but the context was, to enable driver to know when the hardware device is needed to be reset, and not if this is a kdump environment. We think that is_kdump_kernel() is better suited to select between different memory profiles. The first patch in this patchset exports a needed symbol in order to make is_kdump_kernel() accessible from the drivers. The rest of the patches change from reset_devices to is_kdump_kernel() in 2 networking drivers. The idea of this patchset was suggested by Vivek Goyal. Tested (only build) and applied on top of commit 8fc54f6: ("net: use reciprocal_scale() helper") [1] - ea1c1af: ("net/mlx4_en: Reduce memory consumption on kdump kernel") [2] - https://lkml.org/lkml/2011/1/27/341 [3] - http://www.spinics.net/lists/netdev/msg291492.html ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26net/bnx2x: Use is_kdump_kernel() to detect kdump kernelAmir Vadai2-2/+4
Use is_kdump_kernel() to detect kdump kernel, instead of reset_devices. CC: Ariel Elior <ariel.elior@qlogic.com> CC: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26net/mlx4: Use is_kdump_kernel() to detect kdump kernelAmir Vadai1-1/+2
Use is_kdump_kernel() to detect kdump kernel, instead of reset_devices. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26crash_dump: Make is_kdump_kernel() accessible from modulesAmir Vadai1-0/+1
In order to make is_kdump_kernel() accessible from modules, need to make elfcorehdr_addr exported. This was rejected in the past [1] because reset_devices was prefered in that context (reseting the device in kdump kernel), but now there are some network drivers that need to reduce memory usage when loaded from a kdump kernel. And in that context, is_kdump_kernel() suits better. [1] - https://lkml.org/lkml/2011/1/27/341 CC: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-26stmmac: simple cleanupsPavel Machek3-12/+10
This adds simple cleanups for stmmac, removing test we know is always true, fixing whitespace, and moving code out of if(). Signed-off-by: Pavel Machek <pavel@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25r8152: check code with checkpatch.plhayeswang1-31/+35
626: CHECK: Alignment should match open parenthesis 646: CHECK: Alignment should match open parenthesis 655: CHECK: Alignment should match open parenthesis 695: CHECK: Alignment should match open parenthesis 729: CHECK: Alignment should match open parenthesis 739: CHECK: Alignment should match open parenthesis 976: WARNING: externs should be avoided in .c files 1314: CHECK: Alignment should match open parenthesis 1358: WARNING: networking block comments don't use an empty /* line, use /* Comment... 1402: WARNING: networking block comments don't use an empty /* line, use /* Comment... 1521: CHECK: multiple assignments should be avoided 1775: CHECK: Alignment should match open parenthesis 1838: CHECK: multiple assignments should be avoided 1843: CHECK: multiple assignments should be avoided 1847: CHECK: multiple assignments should be avoided 1850: WARNING: Missing a blank line after declarations 1864: CHECK: Alignment should match open parenthesis 1872: CHECK: braces {} should be used on all arms of this statement 1906: CHECK: usleep_range is preferred over udelay 2865: WARNING: networking block comments don't use an empty /* line, use /* Comment... 3088: CHECK: Alignment should match open parenthesis total: 0 errors, 5 warnings, 16 checks, 3567 lines checked Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25Merge branch 'ndo_xmit_flush'David S. Miller11-27/+77
Basic deferred TX queue flushing infrastructure. Over time, and specifically and more recently at the Networking Workshop during Kernel SUmmit in Chicago, we have discussed the idea of having some way to optimize transmits of multiple TX packets at a time. There are several areas of overhead that could be amortized with such schemes. One has to do with locking and transactional overhead, the other has to do with device specific costs. This patch set here is more aimed at device specific costs. Typically a device queues up a packet in the TX queue and then has to do something to have the device start processing that new entry. Sometimes this is composed of doing an MMIO write to a "tail" register, and in other cases it can involve something as expensive as a hypervisor call. The basic setup defined here is that when the driver supports deferred TX queue flushing, ndo_start_xmit should no longer perform that operation. Instead a new operation, ndo_xmit_flush, should do it. I have converted IGB and virtio_net as example initial users. The IGB conversion is tested, virtio_net is not but it does compile :-) All ndo_start_xmit call sites have been abstracted behind a new helper called netdev_start_xmit(). This just adds the infrastructure, it does not actually add any instances of actually doing multiple ndo_start_xmit calls per ndo_xmit_flush invocation. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25virtio_net: Support netdev_ops->ndo_xmit_flush()David S. Miller1-1/+9
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25igb: Support netdev_ops->ndo_xmit_flush()David S. Miller1-11/+24
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25net: Add ops->ndo_xmit_flush()David S. Miller9-15/+44
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25ipv6: White-space cleansing : gaps between function and symbol exportIan Morris11-23/+0
This patch makes no changes to the logic of the code but simply addresses coding style issues as detected by checkpatch. Both objdump and diff -w show no differences. This patch removes some blank lines between the end of a function definition and the EXPORT_SYMBOL_GPL macro in order to prevent checkpatch warning that EXPORT_SYMBOL must immediately follow a function. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25ipv6: White-space cleansing : Structure layoutsIan Morris5-21/+13
This patch makes no changes to the logic of the code but simply addresses coding style issues as detected by checkpatch. Both objdump and diff -w show no differences. This patch addresses structure definitions, specifically it cleanses the brace placement and replaces spaces with tabs in a few places. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25ipv6: White-space cleansing : Line LayoutsIan Morris31-203/+203
This patch makes no changes to the logic of the code but simply addresses coding style issues as detected by checkpatch. Both objdump and diff -w show no differences. A number of items are addressed in this patch: * Multiple spaces converted to tabs * Spaces before tabs removed. * Spaces in pointer typing cleansed (char *)foo etc. * Remove space after sizeof * Ensure spacing around comparators such as if statements. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25net: ec_bhf: remove excessive debug messagesDarek Marcinkiewicz1-91/+10
This cuts down the number of debug information spit out by the driver. Signed-off-by: Dariusz Marcinkiewicz <reksio@newterm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25random32: improvements to prandom_bytesDaniel Borkmann2-23/+20
This patch addresses a couple of minor items, mostly addesssing prandom_bytes(): 1) prandom_bytes{,_state}() should use size_t for length arguments, 2) We can use put_unaligned() when filling the array instead of open coding it [ perhaps some archs will further benefit from their own arch specific implementation when GCC cannot make up for it ], 3) Fix a typo, 4) Better use unsigned int as type for getting the arch seed, 5) Make use of prandom_u32_max() for timer slack. Regarding the change to put_unaligned(), callers of prandom_bytes() which internally invoke prandom_bytes_state(), don't bother as they expect the array to be filled randomly and don't have any control of the internal state what-so-ever (that's also why we have periodic reseeding there, etc), so they really don't care. Now for the direct callers of prandom_bytes_state(), which are solely located in test cases for MTD devices, that is, drivers/mtd/tests/{oobtest.c,pagetest.c,subpagetest.c}: These tests basically fill a test write-vector through prandom_bytes_state() with an a-priori defined seed each time and write that to a MTD device. Later on, they set up a read-vector and read back that blocks from the device. So in the verification phase, the write-vector is being re-setup [ so same seed and prandom_bytes_state() called ], and then memcmp()'ed against the read-vector to check if the data is the same. Akinobu, Lothar and I also tested this patch and it runs through the 3 relevant MTD test cases w/o any errors on the nandsim device (simulator for MTD devs) for x86_64, ppc64, ARM (i.MX28, i.MX53 and i.MX6): # modprobe nandsim first_id_byte=0x20 second_id_byte=0xac \ third_id_byte=0x00 fourth_id_byte=0x15 # modprobe mtd_oobtest dev=0 # modprobe mtd_pagetest dev=0 # modprobe mtd_subpagetest dev=0 We also don't have any users depending directly on a particular result of the PRNG (except the PRNG self-test itself), and that's just fine as it e.g. allowed us easily to do things like upgrading from taus88 to taus113. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Tested-by: Akinobu Mita <akinobu.mita@gmail.com> Tested-by: Lothar Waßmann <LW@KARO-electronics.de> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25Merge branch 'csums-next'David S. Miller12-103/+233
Tom Herbert says: ==================== net: Checksum offload changes - Part V I am working on overhauling RX checksum offload. Goals of this effort are: - Specify what exactly it means when driver returns CHECKSUM_UNNECESSARY - Preserve CHECKSUM_COMPLETE through encapsulation layers - Don't do skb_checksum more than once per packet - Unify GRO and non-GRO csum verification as much as possible - Unify the checksum functions (checksum_init) - Simplify code What is in this fifth patch set: - Added GRO checksum validation functions - Call the GRO validations functions from TCP and GRE gro_receive - Perform checksum verification in the UDP gro_receive path using GRO functions and add support for gro_receive in UDP6 Changes in V2: - Change ip_summed to CHECKSUM_UNNECESSARY instead of moving it to CHECKSUM_COMPLETE from GRO checksum validation. This avoids performance penalty in checksumming bytes which are before the header GRO is at. Please review carefully and test if possible, mucking with basic checksum functions is always a little precarious :-) ---- Test results with this patch set are below. I did not notice any performace regression. Tests run: TCP_STREAM: super_netperf with 200 streams TCP_RR: super_netperf with 200 streams and -r 1,1 Device bnx2x (10Gbps): No GRE RSS hash (RX interrupts occur on one core) UDP RSS port hashing enabled. * GRE with checksum with IPv4 encapsulated packets With fix: TCP_STREAM 9.91% CPU utilization 5163.78 Mbps TCP_RR 50.64% CPU utilization 219/347/502 90/95/99% latencies 834103 tps Without fix: TCP_STREAM 10.05% CPU utilization 5186.22 tps TCP_RR 49.70% CPU utilization 227/338/486 90/95/99% latencies 813450 tps * GRE without checksum with IPv4 encapsulated packets With fix: TCP_STREAM 10.18% CPU utilization 5159 Mbps TCP_RR 51.86% CPU utilization 214/325/471 90/95/99% latencies 865943 tps Without fix: TCP_STREAM 10.26% CPU utilization 5307.87 Mbps TCP_RR 50.59% CPU utilization 224/325/476 90/95/99% latencies 846429 tps *** Simulate device returns CHECKSUM_COMPLETE * VXLAN with checksum With fix: TCP_STREAM 13.03% CPU utilization 9093.9 Mbps TCP_RR 95.96% CPU utilization 161/259/474 90/95/99% latencies 1.14806e+06 tps Without fix: TCP_STREAM 13.59% CPU utilization 9093.97 Mbps TCP_RR 93.95% CPU utilization 160/259/484 90/95/99% latencies 1.10262e+06 tps * VXLAN without checksum With fix: TCP_STREAM 13.28% CPU utilization 9093.87 Mbps TCP_RR 95.04% CPU utilization 155/246/439 90/95/99% latencies 1.15e+06 tps Without fix: TCP_STREAM 13.37% CPU utilization 9178.45 Mbps TCP_RR 93.74% CPU utilization 161/257/469 90/95/99% latencies 1.1068e+06 Mbps ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25gre: When GRE csum is present count as encap layer wrt csumTom Herbert1-0/+1
In GRE demux if the GRE checksum pop rcv encapsulation so that any encapsulated checksums are treated as tunnel checksums. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25udp: additional GRO supportTom Herbert4-17/+96
Implement GRO for UDPv6. Add UDP checksum verification in gro_receive for both UDP4 and UDP6 calling skb_gro_checksum_validate_zero_check. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25tcp: Call skb_gro_checksum_validateTom Herbert2-47/+6
In tcp[64]_gro_receive call skb_gro_checksum_validate to validate TCP checksum in the gro context. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25gre: call skb_gro_checksum_simple_validateTom Herbert1-36/+7
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25net: add gro_compute_pseudo functionsTom Herbert2-0/+16
Add inet_gro_compute_pseudo and ip6_gro_compute_pseudo. These are the logical equivalents of inet_compute_pseudo and ip6_compute_pseudo for GRO path. The IP header is taken from skb_gro_network_header. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>