summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2013-12-12net-gro: Prepare GRO stack for the upcoming tunneling supportJerry Chu1-1/+1
This patch modifies the GRO stack to avoid the use of "network_header" and associated macros like ip_hdr() and ipv6_hdr() in order to allow an arbitary number of IP hdrs (v4 or v6) to be used in the encapsulation chain. This lays the foundation for various IP tunneling support (IP-in-IP, GRE, VXLAN, SIT,...) to be added later. With this patch, the GRO stack traversing now is mostly based on skb_gro_offset rather than special hdr offsets saved in skb (e.g., skb->network_header). As a result all but the top layer (i.e., the the transport layer) must have hdrs of the same length in order for a pkt to be considered for aggregation. Therefore when adding a new encap layer (e.g., for tunneling), one must check and skip flows (e.g., by setting NAPI_GRO_CB(p)->same_flow to 0) that have a different hdr length. Note that unlike the network header, the transport header can and will continue to be set by the GRO code since there will be at most one "transport layer" in the encap chain. Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Suggested-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-12macvlan: Remove custom recieve and forward handlersVlad Yasevich1-6/+1
Since now macvlan and macvtap use the same receive and forward handlers, we can remove them completely and use netif_rx and dev_forward_skb() directly. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-12ipv6: router reachability probingJiri Benc1-0/+1
RFC 4191 states in 3.5: When a host avoids using any non-reachable router X and instead sends a data packet to another router Y, and the host would have used router X if router X were reachable, then the host SHOULD probe each such router X's reachability by sending a single Neighbor Solicitation to that router's address. A host MUST NOT probe a router's reachability in the absence of useful traffic that the host would have sent to the router if it were reachable. In any case, these probes MUST be rate-limited to no more than one per minute per router. Currently, when the neighbour corresponding to a router falls into NUD_FAILED, it's never considered again. Introduce a new rt6_nud_state value, RT6_NUD_FAIL_PROBE, which suggests the route should not be used but should be probed with a single NS. The probe is ratelimited by the existing code. To better distinguish meanings of the failure values, rename RT6_NUD_FAIL_SOFT to RT6_NUD_FAIL_DO_RR. Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-11ipv4: fix wildcard search with inet_confirm_addr()Nicolas Dichtel1-3/+2
Help of this function says: "in_dev: only on this interface, 0=any interface", but since commit 39a6d0630012 ("[NETNS]: Process inet_confirm_addr in the correct namespace."), the code supposes that it will never be NULL. This function is never called with in_dev == NULL, but it's exported and may be used by an external module. Because this patch restore the ability to call inet_confirm_addr() with in_dev == NULL, I partially revert the above commit, as suggested by Julian. CC: Julian Anastasov <ja@ssi.bg> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-11tipc: remove TIPC usage of field af_packet_priv in struct net_deviceYing Xue1-0/+3
TIPC is currently using the field 'af_packet_priv' in struct net_device as a handle to find the bearer instance associated to the given network device. But, by doing so it is blocking other networking cleanups, such as the one discussed here: http://patchwork.ozlabs.org/patch/178044/ This commit removes this usage from TIPC. Instead, we introduce a new field, 'tipc_ptr', to the net_device structure, to serve this purpose. When TIPC bearer is enabled, the bearer object is associated to 'tipc_ptr'. When a TIPC packet arrives in the recv_msg() upcall from a networking device, the bearer object can now be obtained from 'tipc_ptr'. When a bearer is disabled, the bearer object is detached from its underlying network device by setting 'tipc_ptr' to NULL. Additionally, an RCU lock is used to protect the new pointer. Henceforth, the existing tipc_net_lock is used in write mode to serialize write accesses to this pointer, while the new RCU lock is applied on the read side to ensure that the pointer is 100% valid within its wrapped area for all readers. Signed-off-by: Ying Xue <ying.xue@windriver.com> Cc: Patrick McHardy <kaber@trash.net> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-11ipv4: add support for IFA_FLAGS nl attributeJiri Pirko1-1/+1
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-11dn_dev: add support for IFA_FLAGS nl attributeJiri Pirko1-1/+1
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-11cipso: cleanup cipso_v4_translate() when !CONFIG_NETLABELPaul Moore1-1/+1
Don't needlessly recompute 'opt[opt_iter + 1]' as we already have it stored in 'tag_len'. Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10ipv6: add ip6_flowlabel helperFlorent Fourcot1-0/+5
And use it if possible. Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10ipv6: remove rcv_tclass of ipv6_pinfoFlorent Fourcot1-1/+0
tclass information in now already stored in rcv_flowinfo We do not need to store the same information twice. Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10ipv6: move IPV6_TCLASS_MASK definition in ipv6.hFlorent Fourcot1-0/+1
Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10ipv6: add flowinfo for tcp6 pkt_options for all casesFlorent Fourcot1-0/+1
The current implementation of IPV6_FLOWINFO only gives a result if pktoptions is available (thanks to the ip6_datagram_recv_ctl function). It gives inconsistent results to user space, sometimes there is a result for getsockopt(IPV6_FLOWINFO), sometimes not. This patch add rcv_flowinfo to store it, and return it to the userspace in the same way than other pkt_options. Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10etherdevice: Optimize a few is_<foo>_ether_addr functionsJoe Perches1-2/+23
If CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is set, several is_<foo>_ether_addr functions can be slightly improved by using u32 dereferences. I believe all current uses of is_zero_ether_addr and is_broadcast_ether_addr are u16 aligned, so always use u16 references to improve those functions performance. Document the u16 alignment requirements. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10etherdevice: Add ether_addr_equal_unalignedJoe Perches1-0/+18
Add a generic routine to test if possibly unaligned to u16 Ethernet addresses are equal. If CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is set, this uses the slightly faster generic routine ether_addr_equal, otherwise this uses memcmp. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10neigh: ipv6: respect default values set before an address is assigned to deviceJiri Pirko1-0/+7
Make the behaviour similar to ipv4. This will allow user to set sysctl default neigh param values and these values will be respected even by devices registered before (that ones what do not have address set yet). Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10neigh: restore old behaviour of default parms valuesJiri Pirko2-0/+20
Previously inet devices were only constructed when addresses are added. Therefore the default neigh parms values they get are the ones at the time of these operations. Now that we're creating inet devices earlier, this changes the behaviour of default neigh parms values in an incompatible way (see bug #8519). This patch creates a compromise by setting the default values at the same point as before but only for those that have not been explicitly set by the user since the inet device's creation. Introduced by: commit 8030f54499925d073a88c09f30d5d844fb1b3190 Author: Herbert Xu <herbert@gondor.apana.org.au> Date: Thu Feb 22 01:53:47 2007 +0900 [IPV4] devinet: Register inetdev earlier. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10neigh: use tbl->family to distinguish ipv4 from ipv6Jiri Pirko1-1/+6
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10neigh: wrap proc dointvec functionsJiri Pirko1-0/+9
This will be needed later on to provide better management of default values. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10neigh: convert parms to an arrayJiri Pirko1-13/+35
This patch converts the neigh param members to an array. This allows easier manipulation which will be needed later on to provide better management of default values. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10net: phy: report link partner features through ethtoolFlorian Fainelli1-2/+3
The PHY library already reads the MII_STAT1000 and MII_LPA registers in genphy_read_status(), so extend it to also populate the PHY device link partner advertised features such that we can feed this back into ethtool when asked for it in phy_ethtool_gset(). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10packet: introduce PACKET_QDISC_BYPASS socket optionDaniel Borkmann1-0/+1
This patch introduces a PACKET_QDISC_BYPASS socket option, that allows for using a similar xmit() function as in pktgen instead of taking the dev_queue_xmit() path. This can be very useful when PF_PACKET applications are required to be used in a similar scenario as pktgen, but with full, flexible packet payload that needs to be provided, for example. On default, nothing changes in behaviour for normal PF_PACKET TX users, so everything stays as is for applications. New users, however, can now set PACKET_QDISC_BYPASS if needed to prevent own packets from i) reentering packet_rcv() and ii) to directly push the frame to the driver. In doing so we can increase pps (here 64 byte packets) for PF_PACKET a bit: # CPUs -- QDISC_BYPASS -- qdisc path -- qdisc path[**] 1 CPU == 1,509,628 pps -- 1,208,708 -- 1,247,436 2 CPUs == 3,198,659 pps -- 2,536,012 -- 1,605,779 3 CPUs == 4,787,992 pps -- 3,788,740 -- 1,735,610 4 CPUs == 6,173,956 pps -- 4,907,799 -- 1,909,114 5 CPUs == 7,495,676 pps -- 5,956,499 -- 2,014,422 6 CPUs == 9,001,496 pps -- 7,145,064 -- 2,155,261 7 CPUs == 10,229,776 pps -- 8,190,596 -- 2,220,619 8 CPUs == 11,040,732 pps -- 9,188,544 -- 2,241,879 9 CPUs == 12,009,076 pps -- 10,275,936 -- 2,068,447 10 CPUs == 11,380,052 pps -- 11,265,337 -- 1,578,689 11 CPUs == 11,672,676 pps -- 11,845,344 -- 1,297,412 [...] 20 CPUs == 11,363,192 pps -- 11,014,933 -- 1,245,081 [**]: qdisc path with packet_rcv(), how probably most people seem to use it (hopefully not anymore if not needed) The test was done using a modified trafgen, sending a simple static 64 bytes packet, on all CPUs. The trick in the fast "qdisc path" case, is to avoid reentering packet_rcv() by setting the RAW socket protocol to zero, like: socket(PF_PACKET, SOCK_RAW, 0); Tradeoffs are documented as well in this patch, clearly, if queues are busy, we will drop more packets, tc disciplines are ignored, and these packets are not visible to taps anymore. For a pktgen like scenario, we argue that this is acceptable. The pointer to the xmit function has been placed in packet socket structure hole between cached_dev and prot_hook that is hot anyway as we're working on cached_dev in each send path. Done in joint work together with Jesper Dangaard Brouer. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10net: dev: move inline skb_needs_linearize helper to headerDaniel Borkmann1-0/+18
As we need it elsewhere, move the inline helper function of skb_needs_linearize() over to skbuff.h include file. While at it, also convert the return to 'bool' instead of 'int' and add a proper kernel doc. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller3-5/+5
Merge 'net' into 'net-next' to get the AF_PACKET bug fix that Daniel's direct transmit changes depend upon. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10pkt_sched: give visibility to mq slave qdiscsEric Dumazet1-0/+1
Commit 6da7c8fcbcbd ("qdisc: allow setting default queuing discipline") added the ability to change default qdisc from pfifo_fast to say fq But as most modern ethernet devices are multiqueue, we cant really see all the statistics from "tc -s qdisc show", as the default root qdisc is mq. This patch adds the calls to qdisc_list_add() to mq and mqprio Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-07ether_addr_equal: Optimize implementation, remove unused compare_ether_addrJoe Perches1-33/+18
Add a new check for CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS to reduce the number of or's used in the ether_addr_equal comparison to very slightly improve function performance. Simplify the ether_addr_equal_64bits implementation. Integrate and remove the zap_last_2bytes helper as it's now used only once. Remove the now unused compare_ether_addr function. Update the unaligned-memory-access documentation to remove the compare_ether_addr description and show how unaligned accesses could occur with ether_addr_equal. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-07ipv6 addrconf: introduce IFA_F_MANAGETEMPADDR to tell kernel to manage ↵Jiri Pirko1-0/+1
temporary addresses Creating an address with this flag set will result in kernel taking care of temporary addresses in the same way as if the address was created by kernel itself (after RA receive). This allows userspace applications implementing the autoconfiguration (NetworkManager for example) to implement ipv6 addresses privacy. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-07ipv6 addrconf: extend ifa_flags to u32Jiri Pirko3-3/+7
There is no more space in u8 ifa_flags. So do what davem suffested and add another netlink attr called IFA_FLAGS for carry more flags. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-07net: introduce dev_consume_skb_any()Eric Dumazet1-9/+44
Some network drivers use dev_kfree_skb_any() and dev_kfree_skb_irq() helpers to free skbs, both for dropped packets and TX completed ones. We need to separate the two causes to get better diagnostics given by dropwatch or "perf record -e skb:kfree_skb" This patch provides two new helpers, dev_consume_skb_any() and dev_consume_skb_irq() to be used for consumed skbs. __dev_kfree_skb_irq() is slightly optimized to remove one atomic_dec_and_test() in fast path, and use this_cpu_{r|w} accessors. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06net: phy: breakdown PHY_*_FEATURES definesFlorian Fainelli1-7/+16
Breakdown the PHY_*_FEATURES into per speed defines such that we can easily re-use them individually. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06Merge branch 'for-davem' of ↵David S. Miller5-62/+189
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== Please pull this batch of updates intended for the 3.14 stream... For the mac80211 bits, Johannes says: "I have various improvements/cleanups/fixes all over, but the shortlog shows that Luis's regulatory work and mesh work from the cozybit folks are the biggest ones, along with the CSA fixes." Along with that, we have big batches of updates to brcmfmac, rtlwifi, and ath9k. There are updates to wcn36xx, rt2x00, and a handful of others as well. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06tcp: auto corkingEric Dumazet2-0/+2
With the introduction of TCP Small Queues, TSO auto sizing, and TCP pacing, we can implement Automatic Corking in the kernel, to help applications doing small write()/sendmsg() to TCP sockets. Idea is to change tcp_push() to check if the current skb payload is under skb optimal size (a multiple of MSS bytes) If under 'size_goal', and at least one packet is still in Qdisc or NIC TX queues, set the TCP Small Queue Throttled bit, so that the push will be delayed up to TX completion time. This delay might allow the application to coalesce more bytes in the skb in following write()/sendmsg()/sendfile() system calls. The exact duration of the delay is depending on the dynamics of the system, and might be zero if no packet for this flow is actually held in Qdisc or NIC TX ring. Using FQ/pacing is a way to increase the probability of autocorking being triggered. Add a new sysctl (/proc/sys/net/ipv4/tcp_autocorking) to control this feature and default it to 1 (enabled) Add a new SNMP counter : nstat -a | grep TcpExtTCPAutoCorking This counter is incremented every time we detected skb was under used and its flush was deferred. Tested: Interesting effects when using line buffered commands under ssh. Excellent performance results in term of cpu usage and total throughput. lpq83:~# echo 1 >/proc/sys/net/ipv4/tcp_autocorking lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128 9410.39 Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128': 35209.439626 task-clock # 2.901 CPUs utilized 2,294 context-switches # 0.065 K/sec 101 CPU-migrations # 0.003 K/sec 4,079 page-faults # 0.116 K/sec 97,923,241,298 cycles # 2.781 GHz [83.31%] 51,832,908,236 stalled-cycles-frontend # 52.93% frontend cycles idle [83.30%] 25,697,986,603 stalled-cycles-backend # 26.24% backend cycles idle [66.70%] 102,225,978,536 instructions # 1.04 insns per cycle # 0.51 stalled cycles per insn [83.38%] 18,657,696,819 branches # 529.906 M/sec [83.29%] 91,679,646 branch-misses # 0.49% of all branches [83.40%] 12.136204899 seconds time elapsed lpq83:~# echo 0 >/proc/sys/net/ipv4/tcp_autocorking lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128 6624.89 Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128': 40045.864494 task-clock # 3.301 CPUs utilized 171 context-switches # 0.004 K/sec 53 CPU-migrations # 0.001 K/sec 4,080 page-faults # 0.102 K/sec 111,340,458,645 cycles # 2.780 GHz [83.34%] 61,778,039,277 stalled-cycles-frontend # 55.49% frontend cycles idle [83.31%] 29,295,522,759 stalled-cycles-backend # 26.31% backend cycles idle [66.67%] 108,654,349,355 instructions # 0.98 insns per cycle # 0.57 stalled cycles per insn [83.34%] 19,552,170,748 branches # 488.244 M/sec [83.34%] 157,875,417 branch-misses # 0.81% of all branches [83.34%] 12.130267788 seconds time elapsed Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06netfilter: Fix FSF address in file headersJeff Kirsher1-2/+1
Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. CC: netfilter@vger.kernel.org CC: Pablo Neira Ayuso <pablo@netfilter.org> CC: Patrick McHardy <kaber@trash.net> CC: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06include/net/: Fix FSF address in file headersJeff Kirsher28-76/+32
Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06sctp: Fix FSF address in file headersJeff Kirsher8-24/+16
Several files refer to an old address for the Free Software Foundation in the file header comment. Resolve by replacing the address with the URL <http://www.gnu.org/licenses/> so that we do not have to keep updating the header comments anytime the address changes. CC: Vlad Yasevich <vyasevich@gmail.com> CC: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06Merge branch 'master' of ↵John W. Linville5-62/+189
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
2013-12-06tcp_memcontrol: Cleanup/fix cg_proto->memory_pressure handling.Eric W. Biederman1-4/+2
kill memcg_tcp_enter_memory_pressure. The only function of memcg_tcp_enter_memory_pressure was to reduce deal with the unnecessary abstraction that was tcp_memcontrol. Now that struct tcp_memcontrol is gone remove this unnecessary function, the unnecessary function pointer, and modify sk_enter_memory_pressure to set this field directly, just as sk_leave_memory_pressure cleas this field directly. This fixes a small bug I intruduced when killing struct tcp_memcontrol that caused memcg_tcp_enter_memory_pressure to never be called and thus failed to ever set cg_proto->memory_pressure. Remove the cg_proto enter_memory_pressure function as it now serves no useful purpose. Don't test cg_proto->memory_presser in sk_leave_memory_pressure before clearing it. The test was originally there to ensure that the pointer was non-NULL. Now that cg_proto is not a pointer the pointer does not matter. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06xen-netback: fix fragment detection in checksum setupPaul Durrant2-1/+3
The code to detect fragments in checksum_setup() was missing for IPv4 and too eager for IPv6. (It transpires that Windows seems to send IPv6 packets with a fragment header even if they are not a fragment - i.e. offset is zero, and M bit is not set). This patch also incorporates a fix to callers of maybe_pull_tail() where skb->network_header was being erroneously added to the length argument. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> cc: David Miller <davem@davemloft.net> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06Merge branch 'siocghwtstamp' of ↵David S. Miller2-9/+10
git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next Ben Hutchings says: ==================== SIOCGHWTSTAMP ioctl 1. Add the SIOCGHWTSTAMP ioctl and update the timestamping documentation. 2. Implement SIOCGHWTSTAMP in most drivers that support SIOCSHWTSTAMP. 3. Add a test program to exercise SIOC{G,S}HWTSTAMP. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-03Revert "net: Handle CHECKSUM_COMPLETE more adequately in pskb_trim_rcsum()."David S. Miller1-21/+18
This reverts commit 018c5bba052b3a383d83cf0c756da0e7bc748397. It causes regressions for people using chips driven by the sungem driver. Suspicion is that the skb->csum value isn't being adjusted properly. The change also has a bug in that if __pskb_trim() fails, we'll leave a corruped skb->csum value in there. We would really need to revert it to it's original value in that case. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-03PCI / tg3: Give up chip reset and carrier loss handling if PCI device is not ↵Rafael J. Wysocki1-0/+1
present Modify tg3_chip_reset() and tg3_close() to check if the PCI network adapter device is accessible at all in order to skip poking it or trying to handle a carrier loss in vain when that's not the case. Introduce a special PCI helper function pci_device_is_present() for this purpose. Of course, this uncovers the lack of the appropriate RTNL locking in tg3_suspend() and tg3_resume(), so add that locking in there too. These changes prevent tg3 from burning a CPU at 100% load level for solid several seconds after the Thunderbolt link is disconnected from a Matrox DS1 docking station. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-02Merge branch 'for-john' of ↵John W. Linville5-62/+189
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
2013-12-02Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: - Correction of fuzzy and fragile IRQ_RETVAL macro - IRQ related resume fix affecting only XEN - ARM/GIC fix for chained GIC controllers * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip: Gic: fix boot for chained gics irq: Enable all irqs unconditionally in irq_resume genirq: Correct fuzzy and fragile IRQ_RETVAL() definition
2013-12-02Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds1-2/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Various smaller fixlets, all over the place" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/doc: Fix generation of device-drivers sched: Expose preempt_schedule_irq() sched: Fix a trivial typo in comments sched: Remove unused variable in 'struct sched_domain' sched: Avoid NULL dereference on sd_busy sched: Check sched_domain before computing group power MAINTAINERS: Update file patterns in the lockdep and scheduler entries
2013-12-02Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds3-0/+27
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc kernel and tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools lib traceevent: Fix conversion of pointer to integer of different size perf/trace: Properly use u64 to hold event_id perf: Remove fragile swevent hlist optimization ftrace, perf: Avoid infinite event generation loop tools lib traceevent: Fix use of multiple options in processing field perf header: Fix possible memory leaks in process_group_desc() perf header: Fix bogus group name perf tools: Tag thread comm as overriden
2013-12-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds9-5/+15
Pull networking updates from David Miller: "Here is a pile of bug fixes that accumulated while I was in Europe" 1) In fixing kernel leaks to userspace during copying of socket addresses, we broke a case that used to work, namely the user providing a buffer larger than the in-kernel generic socket address structure. This broke Ruby amongst other things. Fix from Dan Carpenter. 2) Fix regression added by byte queue limit support in 8139cp driver, from Yang Yingliang. 3) The addition of MSG_SENDPAGE_NOTLAST buggered up a few sendpage implementations, they should just treat it the same as MSG_MORE. Fix from Richard Weinberger and Shawn Landden. 4) Handle icmpv4 errors received on ipv6 SIT tunnels correctly, from Oussama Ghorbel. In particular we should send an ICMPv6 unreachable in such situations. 5) Fix some regressions in the recent genetlink fixes, in particular get the pmcraid driver to use the new safer interfaces correctly. From Johannes Berg. 6) macvtap was converted to use a per-cpu set of statistics, but some code was still bumping tx_dropped elsewhere. From Jason Wang. 7) Fix build failure of xen-netback due to missing include on some architectures, from Andy Whitecroft. 8) macvtap double counts received packets in statistics, fix from Vlad Yasevich. 9) Fix various cases of using *_STATS_BH() when *_STATS() is more appropriate. From Eric Dumazet and Hannes Frederic Sowa. 10) Pktgen ipsec mode doesn't update the ipv4 header length and checksum properly after encapsulation. Fix from Fan Du. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits) net/mlx4_en: Remove selftest TX queues empty condition {pktgen, xfrm} Update IPv4 header total len and checksum after tranformation virtio_net: make all RX paths handle erors consistently virtio_net: fix error handling for mergeable buffers virtio_net: Fixed a trivial typo (fitler --> filter) netem: fix gemodel loss generator netem: fix loss 4 state model netem: missing break in ge loss generator net/hsr: Support iproute print_opt ('ip -details ...') net/hsr: Very small fix of comment style. MAINTAINERS: Added net/hsr/ maintainer ipv6: fix possible seqlock deadlock in ip6_finish_output2 ixgbe: Make ixgbe_identify_qsfp_module_generic static ixgbe: turn NETIF_F_HW_L2FW_DOFFLOAD off by default ixgbe: ixgbe_fwd_ring_down needs to be static e1000: fix possible reset_task running after adapter down e1000: fix lockdep warning in e1000_reset_task e1000: prevent oops when adapter is being closed and reset simultaneously igb: Fixed Wake On LAN support inet: fix possible seqlock deadlocks ...
2013-12-02cfg80211/mac80211/ath6kl: acquire wdev lock outside ch_switch_notifySimon Wunderlich1-1/+2
The channel switch notification should be sent under the wdev/sdata-lock, preferably in the same moment as the channel change happens, to avoid races by other callers (e.g. start/stop_ap). This also adds the previously missing sdata_lock protection in csa_finalize_work. Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-12-02cfg80211: aggregate mgmt_tx parameters into a structAndrei Otcheretianski1-3/+25
Change cfg80211 and mac80211 to use cfg80211_mgmt_tx_params struct to aggregate parameters for mgmt_tx functions. This makes the functions' signatures less clumsy and allows less painful parameters extension. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> [fix all other drivers] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-12-02cfg80211: fix reporting 5/10 MHz support to user spaceFelix Fietkau1-0/+8
nla_put_flag needs a real nl80211 attribute id, not a wiphy flag bit. While at it, split 5 and 10 MHz capability flags in case we ever need to support hardware that can only do one of the two. Also move the flag settings to the split-only information so we don't increase the space needed for old userspace. Signed-off-by: Felix Fietkau <nbd@openwrt.org> [change location of flag setting] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-11-30net/hsr: Support iproute print_opt ('ip -details ...')Arvid Brodin1-1/+3
This implements the rtnl_link_ops fill_info routine for HSR. Signed-off-by: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-29sctp: Restore 'resent' bit to avoid retransmitted chunks for RTT measurementsXufeng Zhang1-0/+1
Currently retransmitted DATA chunks could also be used for RTT measurements since there are no flag to identify whether the transmitted DATA chunk is a new one or a retransmitted one. This problem is introduced by commit ae19c5486 ("sctp: remove 'resent' bit from the chunk") which inappropriately removed the 'resent' bit completely, instead of doing this, we should set the resent bit only for the retransmitted DATA chunks. Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>