summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2015-11-03mac80211: Remove WARN_ON_ONCE in ieee80211_recalc_smpsAndrei Otcheretianski1-1/+7
The recalc_smps work can run after the station disassociates. At this stage we already released the channel, but the work will be cancelled only when the interface stops. In this scenario we can hit the warning in ieee80211_recalc_smps, so just remove it. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-11-03mac80211: use freezable workqueue for restart workEliad Peller2-1/+12
Requesting hw restart during suspend might result in the restart work being executed after mac80211 and the hw are suspended. Solve the race by simply scheduling the restart work on a freezable workqueue. Note that there can be some cases of reconfiguration on resume (besides the hardware restart): * wowlan is not configured - All the interfaces removed were removed on suspend, and drv_stop() was called. At this point the driver shouldn't expect for hw_restart anyway, so we can simply cancel it (on resume). * wowlan is configured, drv_resume() == 1 There is no definitive expected behavior in this case, as each driver might have different expectations (e.g. setting some flags on suspend/restart vs. not handling spurious recovery). For now, simply let the hw_restart work run again after resume, and hope the driver will handle it well (or at least initiate another hw restart). Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-11-03mac80211: Fix local deauth while associatingAndrei Otcheretianski1-0/+19
Local request to deauthenticate wasn't handled while associating, thus the association could continue even when the user space required to disconnect. Cc: stable@vger.kernel.org Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-11-03mac80211: allow null chandef in tracingArik Nemtsov1-5/+5
In TDLS channel-switch operations the chandef can sometimes be NULL. Avoid an oops in the trace code for these cases and just print a chandef full of zeros. Cc: stable@vger.kernel.org Fixes: a7a6bdd0670fe ("mac80211: introduce TDLS channel switch ops") Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-11-03nl80211: Fix potential memory leak from parse_acl_dataOla Olsson1-6/+6
If parse_acl_data succeeds but the subsequent parsing of smps attributes fails, there will be a memory leak due to early returns. Fix that by moving the ACL parsing later. Cc: stable@vger.kernel.org Fixes: 18998c381b19b ("cfg80211: allow requesting SMPS mode on ap start") Signed-off-by: Ola Olsson <ola.olsson@sonymobile.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-11-03mac80211: fix divide by zero when NOA updateJanusz.Dziedzic@tieto.com1-0/+7
In case of one shot NOA the interval can be 0, catch that instead of potentially (depending on the driver) crashing like this: divide error: 0000 [#1] SMP [...] Call Trace: <IRQ> [<ffffffffc08e891c>] ieee80211_extend_absent_time+0x6c/0xb0 [mac80211] [<ffffffffc08e8a17>] ieee80211_update_p2p_noa+0xb7/0xe0 [mac80211] [<ffffffffc069cc30>] ath9k_p2p_ps_timer+0x170/0x190 [ath9k] [<ffffffffc070adf8>] ath_gen_timer_isr+0xc8/0xf0 [ath9k_hw] [<ffffffffc0691156>] ath9k_tasklet+0x296/0x2f0 [ath9k] [<ffffffff8107ad65>] tasklet_action+0xe5/0xf0 [...] Cc: stable@vger.kernel.org [3.16+, due to d463af4a1c34 using it] Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-10-22Merge tag 'mac80211-next-for-davem-2015-10-21' of ↵David S. Miller41-659/+910
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== Here's another set of patches for the current cycle: * I merged net-next back to avoid a conflict with the * cfg80211 scheduled scan API extensions * preparations for better scan result timestamping * regulatory cleanups * mac80211 statistics cleanups * a few other small cleanups and fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22net: hisilicon: deals with the sub ctrl by sysconyankejian1-10/+11
the global Soc configuration is treated by syscon, and sub ctrl bus is Soc bus. it has to be treated by syscon. Signed-off-by: yankejian <yankejian@huawei.com> Signed-off-by: lisheng <lisheng011@huawei.com> Signed-off-by: lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22Merge branch 'cxgb4-trivial-fixes'David S. Miller3-92/+108
Hariprasad Shenai says: ==================== Trivial fixes for cxgb4 driver This patch series updates driver description for next gen. adapters, updates firmware info., returns error for setup_rss error case, restores L1 configuration in case of FW rejects new config, updates and aligns ethtool get stats settings, etc This patch series has been created against net-next tree and includes patches on cxgb4 and cxgb4vf driver. We have included all the maintainers of respective drivers. Kindly review the change and let us know in case of any review comments. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22cxgb4: Update ethtool get_drvinfo to get regdump lenHariprasad Shenai1-1/+4
Update ethtool get_drvinfo to display regdump len and also update firmware string version print to display N/A in case FW isn't present Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22cxgb4: Use vmalloc, if kmalloc failsHariprasad Shenai1-4/+4
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22cxgb4: Return error if setup_rss is called before probeHariprasad Shenai1-4/+8
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22cxgb4/cxgb4vf: Update driver desc. to include Chelsio T6 adapterHariprasad Shenai2-2/+3
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22cxgb4: Add info print to display number of MSI-X vectors allocatedHariprasad Shenai1-0/+4
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22cxgb4: Restore L1 cfg, if FW rejects new L1 cfg settingsHariprasad Shenai1-4/+11
In the ethtool set_settings() routine we need to remember our old L1 Configuration in case the firmware rejects the request and then restore that. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22cxgb4: Don't disallow turning off auto-negotiationHariprasad Shenai1-4/+1
For {1, 10, 40} Gb/s. Prohibiting turning off autonegotiation isn't anywhere in the standard. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22cxgb4: Align ethtool get stat settingsHariprasad Shenai1-73/+73
Align the ethtool get stats settings with the rest so it looks uniform Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22openvswitch: Use dev_queue_xmit for vport send.Pravin B Shelar8-45/+40
With use of lwtunnel, we can directly call dev_queue_xmit() rather than calling netdev vport send operation. Following change make tunnel vport code bit cleaner. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22openvswitch: Fix incorrect type use.Pravin B Shelar1-3/+3
Patch fixes following sparse warning. net/openvswitch/flow_netlink.c:583:30: warning: incorrect type in assignment (different base types) net/openvswitch/flow_netlink.c:583:30: expected restricted __be16 [usertype] ipv4 net/openvswitch/flow_netlink.c:583:30: got int Fixes: 6b26ba3a7d ("openvswitch: netlink attributes for IPv6 tunneling") Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22Merge branch 'bpf-perf'David S. Miller10-6/+308
Alexei Starovoitov says: ==================== bpf_perf_event_output helper Over the last year there were multiple attempts to let eBPF programs output data into perf events by He Kuang and Wangnan. The last one was: https://lkml.org/lkml/2015/7/20/736 It was almost perfect with exception that all bpf programs would sent data into one global perf_event. This patch set takes different approach by letting user space open independent PERF_COUNT_SW_BPF_OUTPUT events, so that program output won't collide. Wangnan is working on corresponding perf patches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22samples: bpf: add bpf_perf_event_output exampleAlexei Starovoitov4-0/+236
Performance test and example of bpf_perf_event_output(). kprobe is attached to sys_write() and trivial bpf program streams pid+cookie into userspace via PERF_COUNT_SW_BPF_OUTPUT event. Usage: $ sudo ./bld_x64/samples/bpf/trace_output recv 2968913 events per sec Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22bpf: introduce bpf_perf_event_output() helperAlexei Starovoitov5-1/+62
This helper is used to send raw data from eBPF program into special PERF_TYPE_SOFTWARE/PERF_COUNT_SW_BPF_OUTPUT perf_event. User space needs to perf_event_open() it (either for one or all cpus) and store FD into perf_event_array (similar to bpf_perf_event_read() helper) before eBPF program can send data into it. Today the programs triggered by kprobe collect the data and either store it into the maps or print it via bpf_trace_printk() where latter is the debug facility and not suitable to stream the data. This new helper replaces such bpf_trace_printk() usage and allows programs to have dedicated channel into user space for post-processing of the raw data collected. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22perf: pad raw data samples automaticallyAlexei Starovoitov1-5/+10
Instead of WARN_ON in perf_event_output() on unpaded raw samples, pad them automatically. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22ipvlan: read direct ifindex instead of iflinkBrenden Blanco1-2/+2
In the ipv4 outbound path of an ipvlan device in l3 mode, the ifindex is being grabbed from dev_get_iflink. This works for the physical device case, since as the documentation of that function notes: "Physical interfaces have the same 'ifindex' and 'iflink' values.". However, if the master device is a veth, and the pairs are in separate net namespaces, the route lookup will fail with -ENODEV due to outer veth pair being in a separate namespace from the ipvlan master/routing namespace. ns0 | ns1 | ns2 veth0a--|--veth0b--|--ipvl0 In ipvlan_process_v4_outbound(), a packet sent from ipvl0 in the above configuration will pass fl.flowi4_oif == veth0a to ip_route_output_flow(), but *net == ns1. Notice also that ipv6 processing is not using iflink. Since there is a discrepancy in usage, fixup both v4 and v6 case to use local dev variable. Tested this with l3 ipvlan on top of veth, as well as with single physical interface in the top namespace. Signed-off-by: Brenden Blanco <bblanco@plumgrid.com> Reviewed-by: Jiri Benc <jbenc@redhat.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22tcp: fastopen: limit max_qlenEric Dumazet1-1/+2
Allowing an application to set whatever limit for the list of recently RST fastopen sessions [1] is not wise, as it open ways to deplete kernel memory. Cap the user provided limit by somaxconn sysctl, like listen() backlog. [1] https://tools.ietf.org/html/rfc7413#section-5.1 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22net: mdio-gpio: move platform data headerVivien Didelot2-1/+1
This header file only contains the platform data structure definition, so move it to the include/linux/platform_data/ directory. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22ARM: gemini: remove unnecessary mdio-gpio includesVivien Didelot3-3/+0
Remove the inclusion of linux/mdio-gpio.h in nas4220b, wbd111 and wbd222 boards since mdio-gpio is not used. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22net: hisilicon: fix ptr_ret.cocci warningsWu Fengguang1-4/+1
drivers/net/ethernet/hisilicon/hns/hnae.c:442:1-3: WARNING: PTR_ERR_OR_ZERO can be used Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR Generated by: scripts/coccinelle/api/ptr_ret.cocci CC: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22ipv6: gro: support sit protocolEric Dumazet1-0/+12
Tom Herbert added SIT support to GRO with commit 19424e052fb4 ("sit: Add gro callbacks to sit_offload"), later reverted by Herbert Xu. The problem came because Tom patch was building GRO packets without proper meta data : If packets were locally delivered, we would not care. But if packets needed to be forwarded, GSO engine was not able to segment individual segments. With the following patch, we correctly set skb->encapsulation and inner network header. We also update gso_type. Tested: Server : netserver modprobe dummy ifconfig dummy0 8.0.0.1 netmask 255.255.255.0 up arp -s 8.0.0.100 4e:32:51:04:47:e5 iptables -I INPUT -s 10.246.7.151 -j TEE --gateway 8.0.0.100 ifconfig sixtofour0 sixtofour0 Link encap:IPv6-in-IPv4 inet6 addr: 2002:af6:798::1/128 Scope:Global inet6 addr: 2002:af6:798::/128 Scope:Global UP RUNNING NOARP MTU:1480 Metric:1 RX packets:411169 errors:0 dropped:0 overruns:0 frame:0 TX packets:409414 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:20319631739 (20.3 GB) TX bytes:29529556 (29.5 MB) Client : netperf -H 2002:af6:798::1 -l 1000 & Checked on server traffic copied on dummy0 and verify segments were properly rebuilt, with proper IP headers, TCP checksums... tcpdump on eth0 shows proper GRO aggregation takes place. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22net: dummy: add more featuresEric Dumazet1-1/+5
While testing my SIT/GRO patch using netfilter TEE module and a dummy device, I found some features were missing : TSO IPv6, UFO, and encapsulated traffic. ethtool -k dummy0 now gives : ... tcp-segmentation-offload: on tx-tcp-segmentation: on tx-tcp-ecn-segmentation: on tx-tcp6-segmentation: on udp-fragmentation-offload: on ... tx-gre-segmentation: on tx-ipip-segmentation: on tx-sit-segmentation: on tx-udp_tnl-segmentation: on Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-22netlink: Rightsize IFLA_AF_SPEC size calculationArad, Ronen5-28/+11
if_nlmsg_size() overestimates the minimum allocation size of netlink dump request (when called from rtnl_calcit()) or the size of the message (when called from rtnl_getlink()). This is because ext_filter_mask is not supported by rtnl_link_get_af_size() and rtnl_link_get_size(). The over-estimation is significant when at least one netdev has many VLANs configured (8 bytes for each configured VLAN). This patch-set "rightsizes" the protocol specific attribute size calculation by propagating ext_filter_mask to rtnl_link_get_af_size() and adding this a argument to get_link_af_size op in rtnl_af_ops. Bridge module already used filtering aware sizing for notifications. br_get_link_af_size_filtered() is consistent with the modified get_link_af_size op so it replaces br_get_link_af_size() in br_af_ops. br_get_link_af_size() becomes unused and thus removed. Signed-off-by: Ronen Arad <ronen.arad@intel.com> Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21Adding switchdev ageing notification on port bridgedElad Raz1-0/+12
Configure ageing time to the HW for newly bridged device CC: Scott Feldman <sfeldma@gmail.com> CC: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Elad Raz <eladr@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21Merge branch 'tcp-rack'David S. Miller11-86/+286
Yuchung Cheng says: ==================== RACK loss detection RACK (Recent ACK) loss recovery uses the notion of time instead of packet sequence (FACK) or counts (dupthresh). It's inspired by the FACK heuristic in tcp_mark_lost_retrans(): when a limited transmit (new data packet) is sacked in recovery, then any retransmission sent before that newly sacked packet was sent must have been lost, since at least one round trip time has elapsed. But that existing heuristic from tcp_mark_lost_retrans() has several limitations: 1) it can't detect tail drops since it depends on limited transmit 2) it's disabled upon reordering (assumes no reordering) 3) it's only enabled in fast recovery but not timeout recovery RACK addresses these limitations with a core idea: an unacknowledged packet P1 is deemed lost if a packet P2 that was sent later is is s/acked, since at least one round trip has passed. Since RACK cares about the time sequence instead of the data sequence of packets, it can detect tail drops when a later retransmission is s/acked, while FACK or dupthresh can't. For reordering RACK uses a dynamically adjusted reordering window ("reo_wnd") to reduce false positives on ever (small) degree of reordering, similar to the delayed Early Retransmit. In the current patch set RACK is only a supplemental loss detection and does not trigger fast recovery. However we are developing RACK to replace or consolidate FACK/dupthresh, early retransmit, and thin-dupack. These heuristics all implicitly bear the time notion. For example, the delayed Early Retransmit is simply applying RACK to trigger the fast recovery with small inflight. RACK requires measuring the minimum RTT. Tracking a global min is less robust due to traffic engineering pathing changes. Therefore it uses a windowed filter by Kathleen Nichols. The min RTT can also be useful for various other purposes like congestion control or stat monitoring. This patch has been used on Google servers for well over 1 year. RACK has also been implemented in the QUIC protocol. We are submitting an IETF draft as well. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21tcp: use RACK to detect lossesYuchung Cheng5-2/+109
This patch implements the second half of RACK that uses the the most recent transmit time among all delivered packets to detect losses. tcp_rack_mark_lost() is called upon receiving a dubious ACK. It then checks if an not-yet-sacked packet was sent at least "reo_wnd" prior to the sent time of the most recently delivered. If so the packet is deemed lost. The "reo_wnd" reordering window starts with 1msec for fast loss detection and changes to min-RTT/4 when reordering is observed. We found 1msec accommodates well on tiny degree of reordering (<3 pkts) on faster links. We use min-RTT instead of SRTT because reordering is more of a path property but SRTT can be inflated by self-inflicated congestion. The factor of 4 is borrowed from the delayed early retransmit and seems to work reasonably well. Since RACK is still experimental, it is now used as a supplemental loss detection on top of existing algorithms. It is only effective after the fast recovery starts or after the timeout occurs. The fast recovery is still triggered by FACK and/or dupack threshold instead of RACK. We introduce a new sysctl net.ipv4.tcp_recovery for future experiments of loss recoveries. For now RACK can be disabled by setting it to 0. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21tcp: track the packet timings in RACKYuchung Cheng6-0/+60
This patch is the first half of the RACK loss recovery. RACK loss recovery uses the notion of time instead of packet sequence (FACK) or counts (dupthresh). It's inspired by the previous FACK heuristic in tcp_mark_lost_retrans(): when a limited transmit (new data packet) is sacked, then current retransmitted sequence below the newly sacked sequence must been lost, since at least one round trip time has elapsed. But it has several limitations: 1) can't detect tail drops since it depends on limited transmit 2) is disabled upon reordering (assumes no reordering) 3) only enabled in fast recovery ut not timeout recovery RACK (Recently ACK) addresses these limitations with the notion of time instead: a packet P1 is lost if a later packet P2 is s/acked, as at least one round trip has passed. Since RACK cares about the time sequence instead of the data sequence of packets, it can detect tail drops when later retransmission is s/acked while FACK or dupthresh can't. For reordering RACK uses a dynamically adjusted reordering window ("reo_wnd") to reduce false positives on ever (small) degree of reordering. This patch implements tcp_advanced_rack() which tracks the most recent transmission time among the packets that have been delivered (ACKed or SACKed) in tp->rack.mstamp. This timestamp is the key to determine which packet has been lost. Consider an example that the sender sends six packets: T1: P1 (lost) T2: P2 T3: P3 T4: P4 T100: sack of P2. rack.mstamp = T2 T101: retransmit P1 T102: sack of P2,P3,P4. rack.mstamp = T4 T205: ACK of P4 since the hole is repaired. rack.mstamp = T101 We need to be careful about spurious retransmission because it may falsely advance tp->rack.mstamp by an RTT or an RTO, causing RACK to falsely mark all packets lost, just like a spurious timeout. We identify spurious retransmission by the ACK's TS echo value. If TS option is not applicable but the retransmission is acknowledged less than min-RTT ago, it is likely to be spurious. We refrain from using the transmission time of these spurious retransmissions. The second half is implemented in the next patch that marks packet lost using RACK timestamp. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21tcp: skb_mstamp_after helperYuchung Cheng1-0/+9
a helper to prepare the first main RACK patch. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21tcp: add tcp_tsopt_ecr_before helperYuchung Cheng1-2/+7
a helper to prepare the main RACK patch Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21tcp: remove tcp_mark_lost_retrans()Yuchung Cheng3-73/+0
Remove the existing lost retransmit detection because RACK subsumes it completely. This also stops the overloading the ack_seq field of the skb control block. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21tcp: track min RTT using windowed min-filterYuchung Cheng7-5/+100
Kathleen Nichols' algorithm for tracking the minimum RTT of a data stream over some measurement window. It uses constant space and constant time per update. Yet it almost always delivers the same minimum as an implementation that has to keep all the data in the window. The measurement window is tunable via sysctl.net.ipv4.tcp_min_rtt_wlen with a default value of 5 minutes. The algorithm keeps track of the best, 2nd best & 3rd best min values, maintaining an invariant that the measurement time of the n'th best >= n-1'th best. It also makes sure that the three values are widely separated in the time window since that bounds the worse case error when that data is monotonically increasing over the window. Upon getting a new min, we can forget everything earlier because it has no value - the new min is less than everything else in the window by definition and it's the most recent. So we restart fresh on every new min and overwrites the 2nd & 3rd choices. The same property holds for the 2nd & 3rd best. Therefore we have to maintain two invariants to maximize the information in the samples, one on values (1st.v <= 2nd.v <= 3rd.v) and the other on times (now-win <=1st.t <= 2nd.t <= 3rd.t <= now). These invariants determine the structure of the code The RTT input to the windowed filter is the minimum RTT measured from ACK or SACK, or as the last resort from TCP timestamps. The accessor tcp_min_rtt() returns the minimum RTT seen in the window. ~0U indicates it is not available. The minimum is 1usec even if the true RTT is below that. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21tcp: apply Kern's check on RTTs used for congestion controlYuchung Cheng1-4/+1
Currently ca_seq_rtt_us does not use Kern's check. Fix that by checking if any packet acked is a retransmit, for both RTT used for RTT estimation and congestion control. Fixes: 5b08e47ca ("tcp: prefer packet timing to TS-ECR for RTT") Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21Merge branch 'master' of ↵David S. Miller16-219/+611
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2015-10-19 This series contains updates to i40e and i40evf only. Kiran adds a spinlock around code accessing VSI MAC filter list to ensure that we are synchronizing access to the filter list, otherwise we can end up with multiple accesses at the same time which can cause the VSI MAC filter list to get in an unstable or corrupted state. Jesse fixes overlong BIT defines, where the RSS enabling call were mistakenly missed. Also fixes a bug where the enable function was enabling the interrupt twice while trying to update the two interrupt throttle rate thresholds for Rx and Tx, while refactoring the IRQ enable function to simplify reading the flow. Addressed the high CPU utilization of some small streaming workloads that the driver should reduce CPU in. Anjali fixes two X722 issues with respect to EEPROM checksum verify and reading NVM version info. Fixed where a mask value was accidentally replaced with a bit mask causing Flow Director sideband to be broken. Alex Duyck fixes areas of the drivers which run from hard interrupt context or with interrupts already disabled in netpoll, so use napi_schedule_irqoff() instead of napi_schedule(). Mitch fixes the VF drivers to not easily give up when it is not able to communicate with the PF driver. Carolyn fixes a problem where our tools MAC loopback test, after driver unbind would fail because the hardware was configured for multiqueue and unbind operation did not clear this configuration. Also fixed a issue where the NVMUpdate tool gets bad data from the PHY when using the PHY NVM feature because of contention on the MDIO interface from getting PHY capability calls from the driver during regular operations. Catherine fixed an issue where we were checking if autoneg was allowed to change before checking if autoneg was changing, these checks need to be in the reverse order. Jean Sacren fixes up an function header comment to align the kernel-docs with the actual code. v2: Cleaned up the use of spin_is_locked() in patch 1 based on feedback from David Miller, since it always evaluates to zero on uni-processor builds ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-21mac80211: move station statistics into sub-structsJohannes Berg11-185/+178
Group station statistics by where they're (mostly) updated (TX, RX and TX-status) and group them into sub-structs of the struct sta_info. Also rename the variables since the grouping now makes it obvious where they belong. This makes it easier to identify where the statistics are updated in the code, and thus easier to think about them. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-10-21mac80211: move beacon_loss_count into ifmgdJohannes Berg5-14/+12
There's little point in keeping (and even sending to userspace) the beacon_loss_count value per station, since it can only apply to the AP on a managed-mode connection. Move the value to ifmgd, advertise it only in managed mode, and remove it from ethtool as it's available through better interfaces. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-10-21mac80211: remove sta->last_ack_signalJohannes Berg3-7/+0
This file only feeds a debugfs file that isn't very useful, so remove it. If necessary, we can add other ways to get this information, for example in the NL80211_CMD_PROBE_CLIENT response. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-10-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller510-2206/+4030
Conflicts: drivers/net/usb/asix_common.c net/ipv4/inet_connection_sock.c net/switchdev/switchdev.c In the inet_connection_sock.c case the request socket hashing scheme is completely different in net-next. The other two conflicts were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-20i40e/i40evf: Bump i40e to 1.3.38 and i40evf to 1.3.25Catherine Sullivan2-2/+2
Bump. Change-ID: Id0a7ecaa491f88ce94c9eba4901e592a56044ee0 Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-20i40e: declare rather than initialize int objectJean Sacren1-1/+1
'err' would be overwritten immediately, so we should declare it only rather than initialize it to zero. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-20i40e: fix kernel-doc argument nameJean Sacren1-1/+1
The second argument name in the kernel-doc argument list for i40e_features_check() was slightly off. Fix it for the kernel doc. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-20i40e: Move error message to debug levelCatherine Sullivan1-3/+3
There is an error coming back from get_phy_capabilities that does not seem to have any functional implications. We will continue looking into why this error message is occurring, but in the meantime, we will move it to debug to avoid confusion. Change-ID: I9091754bf62c066ddedeb249923d85606e2d68ed Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-10-20i40e: Fix order of checks when enabling/disabling autoneg in ethtoolCatherine Sullivan1-13/+16
We were previously checking if autoneg was allowed to change before checking if autoneg was changing. We need to do this in the other order or else we will erroneously return EINVAL when autoneg is not changing. Change-ID: Iff9f7d1c9bddc1ad1e5d227d4f42754f90155410 Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>