kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2014-07-10	ipconfig: move ic_dev_xid under IPCONFIG_BOOTP	Fabian Frederick	1	-2/+1
	ic_dev_xid is only used in __init ic_bootp_recv under IPCONFIG_BOOTP and __init ic_dynamic under IPCONFIG_DYNAMIC(which is itself defined with the same IPCONFIG_BOOTP) This patch fixes the following warning when IPCONFIG_BOOTP is not set: >> net/ipv4/ipconfig.c:146:15: warning: 'ic_dev_xid' defined but not used [-Wunused-variable] static __be32 ic_dev_xid; /* Device under configuration */ Reported-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: netdev@vger.kernel.org Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-09	netpoll: fix use after free	david decotigny	1	-1/+2
	After a bonding master reclaims the netpoll info struct, slaves could still hold a pointer to the reclaimed data. This patch fixes it: as soon as netpoll_async_cleanup is called for a slave (eg. when un-enslaved), we make sure that this slave doesn't point to the data. Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-09	bridge: export knowledge about the presence of IGMP/MLD queriers	Linus Lüssing	1	-0/+37
	With this patch other modules are able to ask the bridge whether an IGMP or MLD querier exists on the according, bridged link layer. Multicast snooping can only be performed if a valid, selected querier exists on a link. Just like the bridge only enables its multicast snooping if a querier exists, e.g. batman-adv too can only activate its multicast snooping in bridged scenarios if a querier is present. For instance this export avoids having to reimplement IGMP/MLD querier message snooping and parsing in e.g. batman-adv, when multicast optimizations for bridged scenarios are added in the future. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-09	tipc: fix a memleak when sending data	Erik Hugne	1	-1/+4
	This fixes a regression bug caused by: 067608e9d019d6477fd45dd948e81af0e5bf599f ("tipc: introduce direct iovec to buffer chain fragmentation function") If data is sent on a nonblocking socket and the destination link is congested, the buffer chain is leaked. We fix this by freeing the chain in this case. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-09	net: sctp: Inline the functions from command.c	David Laight	2	-69/+1
	sctp_init_cmd_seq() and sctp_next_cmd() are only called from one place. The call sequence for sctp_add_cmd_sf() is likely to be longer than the inlined code. With sctp_add_cmd_sf() inlined the compiler can optimise repeated calls. Signed-off-by: David Laight <david.laight@aculab.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-09	appletalk: fix a coccinella warning in net/appletalk/ddp.c	wangweidong	1	-1/+1
	This warning is introduced by commit 7b30600cc6 ("appletalk: fix checkpatch error with indent"), So fix it. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-09	Merge git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next	David S. Miller	37	-2180/+2148
	John W. Linville says: ==================== pull request: wireless-next 2014-07-03 Please pull this first batch of wireless updates intended for the 3.17 stream... For the mac80211 bits, Johannes says: "The biggest thing here is probably Arik's TDLS rework, beyond that we have smaller improvements and features like David's scanning IE thing, Luca's queue work, some CSA work, etc. Also your PID rate control removal, of course." For the iwlwifi bits, Emmanuel says: "I have here a whole bunch of various things. Andy contributes better debug prints for dvm specific flows and a module parameter to completely disable power save for dvm. Andrei is sharing the premises of his work on CSA - more to come. Eran and Liad keep on working on the new devices. I have the regular amount of BT Coex stuff and I continue to work on the firmware error report system adding more debug capabilities. More to come on that subject too." On top of that, there are some cleanups to the new rsi driver, some continuing improvements to the rtl818x drivers, and the usual bundles of updates to ath9k, b43, mwifiex, wil6210, and a few other bits here and there. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-09	net: filter: move load_pointer() into filter.h	Zi Shen Lim	1	-12/+3
	load_pointer() is already a static inline function. Let's move it into filter.h so BPF JIT implementations can reuse this function. Since we're exporting this function, let's also rename it to bpf_load_pointer() for clarity. Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com> Reviewed-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	net/hsr: Fix NULL pointer dereference on incomplete hsr_newlink() params.	Arvid Brodin	1	-2/+6
	If none of the slave interfaces are specified, struct nlattr *data[] may be NULL. Make sure to check for that. While I'm at it, fix the horrible error messages displayed when only one of the slave interfaces isn't specified. Signed-off-by: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	net/hsr: Better frame dispatch	Arvid Brodin	11	-586/+656
	This patch removes the separate paths for frames coming from the outside, and frames sent from the HSR device, and instead makes all frames go through hsr_forward_skb() in hsr_forward.c. This greatly improves code readability and also opens up the possibility for future support of the HSR Interlink device that is the basis for HSR RedBoxes and HSR QuadBoxes, as well as VLAN compatibility. Other improvements: * A reduction in the number of times an skb is copied on machines without HAVE_EFFICIENT_UNALIGNED_ACCESS, which improves throughput somewhat. * Headers are now created using the standard eth_header(), and using the standard hard_header_len. * Each HSR slave now gets its own private skb, so slave-specific fields can be correctly set. Signed-off-by: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	net/hsr: Added SET_NETDEV_DEVTYPE and features \|= NETIF_F_NETNS_LOCAL to ↵	Arvid Brodin	1	-3/+11
	dev_setup. Signed-off-by: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	net/hsr: Implemented .ndo_fix_features (better device features handling).	Arvid Brodin	2	-8/+49
	Signed-off-by: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	net/hsr: Use list_head (and rcu) instead of array for slave devices.	Arvid Brodin	9	-405/+438
	Signed-off-by: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	net/hsr: Move slave init to hsr_slave.c.	Arvid Brodin	6	-164/+205
	Also try to prevent some possible slave dereference race conditions. This is finalized in the next patch, which abandons the slave array in favour of a list_head list and list RCU. Signed-off-by: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	net/hsr: Operstate handling cleanup.	Arvid Brodin	3	-22/+30
	Signed-off-by: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	net/hsr: Move to per-hsr device prune timer.	Arvid Brodin	5	-27/+15
	Signed-off-by: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	net/hsr: Switch from dev_add_pack() to netdev_rx_handler_register()	Arvid Brodin	6	-247/+283
	Also move the frame receive handler to hsr_slave.c. Signed-off-by: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	net/hsr: Better variable names and update of contact info.	Arvid Brodin	8	-305/+304
	Signed-off-by: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	ipconfig: add static to local variable	Fabian Frederick	1	-1/+1
	ic_dev_xid is only used in ipconfig.c Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: netdev@vger.kernel.org Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	mac802154: at86rf230: add hw flags and merge ops	Alexander Aring	1	-14/+46
	This patch adds new mac802154 hw flags for transmit power, csma and listen before transmit (lbt). These flags indicates that the transceiver supports these features. If the flags are set and the driver doesn't implement the necessary functions, then ieee802154_register_device returns -ENOSYS "Function not implemented". This patch merges also all at86rf230 operations into one operations structure and set the right hw flags for the at86rf230 transceivers. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	net: Only do flow_dissector hash computation once per packet	Tom Herbert	1	-0/+2
	Add sw_hash flag to skbuff to indicate that skb->hash was computed from flow_dissector. This flag is checked in skb_get_hash to avoid repeatedly trying to compute the hash (ie. in the case that no L4 hash can be computed). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	ipv6: Implement automatic flow label generation on transmit	Tom Herbert	6	-5/+29
	Automatically generate flow labels for IPv6 packets on transmit. The flow label is computed based on skb_get_hash. The flow label will only automatically be set when it is zero otherwise (i.e. flow label manager hasn't set one). This supports the transmit side functionality of RFC 6438. Added an IPv6 sysctl auto_flowlabels to enable/disable this behavior system wide, and added IPV6_AUTOFLOWLABEL socket option to enable this functionality per socket. By default, auto flowlabels are disabled to avoid possible conflicts with flow label manager, however if this feature proves useful we may want to enable it by default. It should also be noted that FreeBSD has already implemented automatic flow labels (including the sysctl and socket option). In FreeBSD, automatic flow labels default to enabled. Performance impact: Running super_netperf with 200 flows for TCP_RR and UDP_RR for IPv6. Note that in UDP case, __skb_get_hash will be called for every packet with explains slight regression. In the TCP case the hash is saved in the socket so there is no regression. Automatic flow labels disabled: TCP_RR: 86.53% CPU utilization 127/195/322 90/95/99% latencies 1.40498e+06 tps UDP_RR: 90.70% CPU utilization 118/168/243 90/95/99% latencies 1.50309e+06 tps Automatic flow labels enabled: TCP_RR: 85.90% CPU utilization 128/199/337 90/95/99% latencies 1.40051e+06 UDP_RR 92.61% CPU utilization 115/164/236 90/95/99% latencies 1.4687e+06 Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	flow_dissector: Use IPv6 flow label in flow_dissector	Tom Herbert	1	-0/+17
	This patch implements the receive side to support RFC 6438 which is to use the flow label as an ECMP hash. If an IPv6 flow label is set in a packet we can use this as input for computing an L4-hash. There should be no need to parse any transport headers in this case. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	vxlan: Call udp_flow_src_port	Tom Herbert	1	-4/+1
	In vxlan and OVS vport-vxlan call common function to get source port for a UDP tunnel. Removed vxlan_src_port since the functionality is now in udp_flow_src_port. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	net: Call skb_get_hash in get_xps_queue and __skb_tx_hash	Tom Herbert	1	-24/+5
	Call standard function to get a packet hash instead of taking this from skb->sk->sk_hash or only using skb->protocol. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	net: Save TX flow hash in sock and set in skbuf on xmit	Tom Herbert	5	-0/+10
	For a connected socket we can precompute the flow hash for setting in skb->hash on output. This is a performance advantage over calculating the skb->hash for every packet on the connection. The computation is done using the common hash algorithm to be consistent with computations done for packets of the connection in other states where thers is no socket (e.g. time-wait, syn-recv, syn-cookies). This patch adds sk_txhash to the sock structure. inet_set_txhash and ip6_set_txhash functions are added which are called from points in TCP and UDP where socket moves to established state. skb_set_hash_from_sk is a function which sets skb->hash from the sock txhash value. This is called in UDP and TCP transmit path when transmitting within the context of a socket. Tested: ran super_netperf with 200 TCP_RR streams over a vxlan interface (in this case skb_get_hash called on every TX packet to create a UDP source port). Before fix: 95.02% CPU utilization 154/256/505 90/95/99% latencies 1.13042e+06 tps Time in functions: 0.28% skb_flow_dissect 0.21% __skb_get_hash After fix: 94.95% CPU utilization 156/254/485 90/95/99% latencies 1.15447e+06 Neither __skb_get_hash nor skb_flow_dissect appear in perf Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	flow_dissector: Abstract out hash computation	Tom Herbert	1	-16/+28
	Move the hash computation located in __skb_get_hash to be a separate function which takes flow_keys as input. This will allow flow hash computation in other contexts where we only have addresses and ports. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	6lowpan: mac802154: fix coding style issues	Varka Bhadram	15	-145/+153
	This patch fixed the coding style issues reported by checkpatch.pl following issues fixed: CHECK: Alignment should match open parenthesis WARNING: line over 80 characters CHECK: Blank lines aren't necessary before a close brace '}' WARNING: networking block comments don't use an empty /* line, use /* Comment... WARNING: Missing a blank line after declarations WARNING: networking block comments start with * on subsequent lines CHECK: braces {} should be used on all arms of this statement Signed-off-by: Varka Bhadram <varkab@cdac.in> Tested-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	netlink: Fix do_one_broadcast() prototype.	Rami Rosen	1	-9/+6
	This patch changes the prototype of the do_one_broadcast() method so that it will return void. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	tipc: fix link acknowledge logic in receive path	Erik Hugne	1	-5/+5
	Link state acks triggered from the receive path is done before the last received packet have been processed by the link layer. The effect of this is that the last received packet will not be included in the ack. This causes problems if the link window is set to TIPC_MIN_LINK_WIN, where the ack interval will be equal to the link tolerance, and the link enters a stop-and-go behavior. We move the ack logic to after link state processing, just before the packet is delivered to higher layers. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Carl Sigurjonsson <carl.sigurjonsson@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	tipc: refactor message delivery out of tipc_rcv	Erik Hugne	1	-44/+80
	This is a cosmetic change, separating message delivery from the link state processing. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	tcp: switch snt_synack back to measuring transmit time of first SYNACK	Neal Cardwell	2	-4/+0
	Always store in snt_synack the time at which the server received the first client SYN and attempted to send the first SYNACK. Recent commit aa27fc501 ("tcp: tcp_v[46]_conn_request: fix snt_synack initialization") resolved an inconsistency between IPv4 and IPv6 in the initialization of snt_synack. This commit brings back the idea from 843f4a55e (tcp: use tcp_v4_send_synack on first SYN-ACK), which was going for the original behavior of snt_synack from the commit where it was added in 9ad7c049f0f79 ("tcp: RFC2988bis + taking RTT sample from 3WHS for the passive open side") in v3.1. In addition to being simpler (and probably a tiny bit faster), unconditionally storing the time of the first SYNACK attempt has been useful because it allows calculating a performance metric quantifying how long it took to establish a passive TCP connection. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Cc: Octavian Purdila <octavian.purdila@intel.com> Cc: Jerry Chu <hkchu@google.com> Acked-by: Octavian Purdila <octavian.purdila@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	batman-adv: Use kasprintf	Himangi Saraogi	1	-16/+8
	kasprintf combines kmalloc and sprintf, and takes care of the size calculation itself. The semantic patch that makes this change is as follows: // <smpl> @@ expression a,flag; expression list args; statement S; @@ a = - \(kmalloc\\|kzalloc\)(...,flag) + kasprintf(flag,args) <... when != a if (a == NULL \|\| ...) S ...> - sprintf(a,args); // </smpl> Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	vlan: Pass SIOC[SG]HWTSTAMP ioctls to real device	Stefan Sørensen	1	-0/+2
	This allows applications to enable hardware timestamping without being aware of it being a vlan device and figuring out the real device. Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	ptp: Classify ptp over ip over vlan packets	Stefan Sørensen	1	-6/+58
	This extends the ptp bpf to also match ptp over ip over vlan packets. The ptp classes are changed to orthogonal bitfields representing version, transport and vlan values to simplify matching. Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08	net: Simplify ptp class checks	Stefan Sørensen	1	-37/+20
	Replace two switch statements enumerating all valid ptp classes with an if statement matching for not PTP_CLASS_NONE. Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-03	net: sctp: only warn in proc_sctp_do_alpha_beta if write	Daniel Borkmann	1	-2/+3
	Only warn if the value is written to alpha or beta. We don't care emitting a one-time warning when only reading it. Reported-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-03	net: sctp: improve timer slack calculation for transport HBs	Daniel Borkmann	1	-8/+9
	RFC4960, section 8.3 says: On an idle destination address that is allowed to heartbeat, it is recommended that a HEARTBEAT chunk is sent once per RTO of that destination address plus the protocol parameter 'HB.interval', with jittering of +/- 50% of the RTO value, and exponential backoff of the RTO if the previous HEARTBEAT is unanswered. Currently, we calculate jitter via sctp_jitter() function first, and then add its result to the current RTO for the new timeout: TMO = RTO + (RAND() % RTO) - (RTO / 2) `------------------------^-=> sctp_jitter() Instead, we can just simplify all this by directly calculating: TMO = (RTO / 2) + (RAND() % RTO) With the help of prandom_u32_max(), we don't need to open code our own global PRNG, but can instead just make use of the per CPU implementation of prandom with better quality numbers. Also, we can now spare us the conditional for divide by zero check since no div or mod operation needs to be used. Note that prandom_u32_max() won't emit the same result as a mod operation, but we really don't care here as we only want to have a random number scaled into RTO interval. Note, exponential RTO backoff is handeled elsewhere, namely in sctp_do_8_2_transport_strike(). Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-03	net/caif/caif_socket.c: remove unnecessary null test before ↵	Fabian Frederick	1	-2/+1
	debugfs_remove_recursive based on checkpatch: "debugfs_remove_recursive(NULL) is safe this check is probably not required" Cc: Dmitry Tarnyagin <dmitry.tarnyagin@lockless.no> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-02	inet: move ipv6only in sock_common	Eric Dumazet	5	-11/+8
	When an UDP application switches from AF_INET to AF_INET6 sockets, we have a small performance degradation for IPv4 communications because of extra cache line misses to access ipv6only information. This can also be noticed for TCP listeners, as ipv6_only_sock() is also used from __inet_lookup_listener()->compute_score() This is magnified when SO_REUSEPORT is used. Move ipv6only into struct sock_common so that it is available at no extra cost in lookups. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-02	pktgen: RCU-ify "if_list" to remove lock in next_to_run()	Jesper Dangaard Brouer	1	-49/+52
	The if_lock()/if_unlock() in next_to_run() adds a significant overhead, because its called for every packet in busy loop of pktgen_thread_worker(). (Thomas Graf originally pointed me at this lock problem). Removing these two "LOCK" operations should in theory save us approx 16ns (8ns x 2), as illustrated below we do save 16ns when removing the locks and introducing RCU protection. Performance data with CLONE_SKB==100000, TX-size=512, rx-usecs=30: (single CPU performance, ixgbe 10Gbit/s, E5-2630) * Prev : 5684009 pps --> 175.93ns (1/568400910^9) RCU-fix: 6272204 pps --> 159.43ns (1/627220410^9) Diff : +588195 pps --> -16.50ns To understand this RCU patch, I describe the pktgen thread model below. In pktgen there is several kernel threads, but there is only one CPU running each kernel thread. Communication with the kernel threads are done through some thread control flags. This allow the thread to change data structures at a know synchronization point, see main thread func pktgen_thread_worker(). Userspace changes are communicated through proc-file writes. There are three types of changes, general control changes "pgctrl" (func:pgctrl_write), thread changes "kpktgend_X" (func:pktgen_thread_write), and interface config changes "etcX@N" (func:pktgen_if_write). Userspace "pgctrl" and "thread" changes are synchronized via the mutex pktgen_thread_lock, thus only a single userspace instance can run. The mutex is taken while the packet generator is running, by pgctrl "start". Thus e.g. "add_device" cannot be invoked when pktgen is running/started. All "pgctrl" and all "thread" changes, except thread "add_device", communicate via the thread control flags. The main problem is the exception "add_device", that modifies threads "if_list" directly. Fortunately "add_device" cannot be invoked while pktgen is running. But there exists a race between "rem_device_all" and "add_device" (which normally don't occur, because "rem_device_all" waits 125ms before returning). Background'ing "rem_device_all" and running "add_device" immediately allow the race to occur. The race affects the threads (list of devices) "if_list". The if_lock is used for protecting this "if_list". Other readers are given lock-free access to the list under RCU read sections. Note, interface config changes (via proc) can occur while pktgen is running, which worries me a bit. I'm assuming proc_remove() takes appropriate locks, to assure no writers exists after proc_remove() finish. I've been running a script exercising the race condition (leading me to fix the proc_remove order), without any issues. The script also exercises concurrent proc writes, while the interface config is getting removed. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-02	pktgen: avoid expensive set_current_state() call in loop	Jesper Dangaard Brouer	1	-6/+3
	Avoid calling set_current_state() inside the busy-loop in pktgen_thread_worker(). In case of pkt_dev->delay, then it is still used/enabled in pktgen_xmit() via the spin() call. The set_current_state(TASK_INTERRUPTIBLE) uses a xchg, which implicit is LOCK prefixed. I've measured the asm LOCK operation to take approx 8ns on this E5-2630 CPU. Performance increase corrolate with this measurement. Performance data with CLONE_SKB==100000, rx-usecs=30: (single CPU performance, ixgbe 10Gbit/s, E5-2630) * Prev: 5454050 pps --> 183.35ns (1/545405010^9) Now: 5684009 pps --> 175.93ns (1/568400910^9) Diff: +229959 pps --> -7.42ns Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-02	openvswitch: introduce rtnl ops stub	Jiri Pirko	3	-1/+26
	This stub now allows userspace to see IFLA_INFO_KIND for ovs master and IFLA_INFO_SLAVE_KIND for slave. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-02	rtnetlink: allow to register ops without ops->setup set	Jiri Pirko	2	-3/+11
	So far, it is assumed that ops->setup is filled up. But there might be case that ops might make sense even without ->setup. In that case, forbid to newlink and dellink. This allows to register simple rtnl link ops containing only ->kind. That allows consistent way of passing device kind (either device-kind or slave-kind) to userspace. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-02	net: fix some typos in comment	Ying Xue	2	-4/+4
	In commit 371121057607e3127e19b3fa094330181b5b031e("net: QDISC_STATE_RUNNING dont need atomic bit ops") the __QDISC_STATE_RUNNING is renamed to __QDISC___STATE_RUNNING, but the old names existing in comment are not replaced with the new name completely. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-01	ipv6: Allow accepting RA from local IP addresses.	Ben Greear	2	-8/+23
	This can be used in virtual networking applications, and may have other uses as well. The option is disabled by default. A specific use case is setting up virtual routers, bridges, and hosts on a single OS without the use of network namespaces or virtual machines. With proper use of ip rules, routing tables, veth interface pairs and/or other virtual interfaces, and applications that can bind to interfaces and/or IP addresses, it is possibly to create one or more virtual routers with multiple hosts attached. The host interfaces can act as IPv6 systems, with radvd running on the ports in the virtual routers. With the option provided in this patch enabled, those hosts can now properly obtain IPv6 addresses from the radvd. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-01	ipv6: Add more debugging around accept-ra logic.	Ben Greear	1	-8/+43
	This is disabled by default, just like similar debug info already in this module. But, makes it easier to find out why RA is not being accepted when debugging strange behaviour. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-30	tcp: tcp_conn_request: fix build error when IPv6 is disabled	Octavian Purdila	1	-1/+3
	Fixes build error introduced by commit 1fb6f159fd21c64 (tcp: add tcp_conn_request): net/ipv4/tcp_input.c: In function 'pr_drop_req': net/ipv4/tcp_input.c:5889:130: error: 'struct sock_common' has no member named 'skc_v6_daddr' Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Octavian Purdila <octavian.purdila@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-28	tcp: add tcp_conn_request	Octavian Purdila	3	-244/+152
	Create tcp_conn_request and remove most of the code from tcp_v4_conn_request and tcp_v6_conn_request. Signed-off-by: Octavian Purdila <octavian.purdila@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-28	tcp: add queue_add_hash to tcp_request_sock_ops	Octavian Purdila	2	-2/+4
	Add queue_add_hash member to tcp_request_sock_ops so that we can later unify tcp_v4_conn_request and tcp_v6_conn_request. Signed-off-by: Octavian Purdila <octavian.purdila@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>