kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2015-10-28	igb: do not re-init SR-IOV during probe	Stefan Assmann	1	-1/+1
	[ Upstream commit 6423fc34160939142d72ffeaa2db6408317f54df ] During driver probing the following code path is triggered. igb_probe ->igb_sw_init ->igb_probe_vfs ->igb_pci_enable_sriov ->igb_sriov_reinit Doing the SR-IOV re-init is not necessary during probing since we're starting from scratch. Here we can call igb_enable_sriov() right away. Running igb_sriov_reinit() during igb_probe() also seems to cause occasional packet loss on some onboard 82576 NICs. Reproduced on Dell and HP servers with onboard 82576 NICs. Example: Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01) Subsystem: Dell Device [1028:0481] Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-10-27	igb: Fix oops caused by missing queue pairing	Shota Suzuki	3	-2/+20
	[ Upstream commit 72ddef0506da852dc82f078f37ced8ef4d74a2bf ] When initializing igb driver (e.g. 82576, I350), IGB_FLAG_QUEUE_PAIRS is set if adapter->rss_queues exceeds half of max_rss_queues in igb_init_queue_configuration(). On the other hand, IGB_FLAG_QUEUE_PAIRS is not set even if the number of queues exceeds half of max_combined in igb_set_channels() when changing the number of queues by "ethtool -L". In this case, if numvecs is larger than MAX_MSIX_ENTRIES (10), the size of adapter->msix_entries[], an overflow can occur in igb_set_interrupt_capability(), which in turn leads to an oops. Fix this problem as follows: - When changing the number of queues by "ethtool -L", set IGB_FLAG_QUEUE_PAIRS in the same way as initializing igb driver. - When increasing the size of q_vector, reallocate it appropriately. (With IGB_FLAG_QUEUE_PAIRS set, the size of q_vector gets larger.) Another possible way to fix this problem is to cap the queues at its initial number, which is the number of the initial online cpus. But this is not the optimal way because we cannot increase queues when another cpu becomes online. Note that before commit cd14ef54d25b ("igb: Change to use statically allocated array for MSIx entries"), this problem did not cause oops but just made the number of queues become 1 because of entering msi_only mode in igb_set_interrupt_capability(). Fixes: 907b7835799f ("igb: Add ethtool support to configure number of channels") CC: stable <stable@vger.kernel.org> Signed-off-by: Shota Suzuki <suzuki_shota_t3@lab.ntt.co.jp> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-05-18	e1000: add dummy allocator to fix race condition between mtu change and netpoll	Sabrina Dubroca	1	-1/+9
	[ Upstream commit 08e8331654d1d7b2c58045e549005bc356aa7810 ] There is a race condition between e1000_change_mtu's cleanups and netpoll, when we change the MTU across jumbo size: Changing MTU frees all the rx buffers: e1000_change_mtu -> e1000_down -> e1000_clean_all_rx_rings -> e1000_clean_rx_ring Then, close to the end of e1000_change_mtu: pr_info -> ... -> netpoll_poll_dev -> e1000_clean -> e1000_clean_rx_irq -> e1000_alloc_rx_buffers -> e1000_alloc_frag And when we come back to do the rest of the MTU change: e1000_up -> e1000_configure -> e1000_configure_rx -> e1000_alloc_jumbo_rx_buffers alloc_jumbo finds the buffers already != NULL, since data (shared with page in e1000_rx_buffer->rxbuf) has been re-alloc'd, but it's garbage, or at least not what is expected when in jumbo state. This results in an unusable adapter (packets don't get through), and a NULL pointer dereference on the next call to e1000_clean_rx_ring (other mtu change, link down, shutdown): BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81194d6e>] put_compound_page+0x7e/0x330 [...] Call Trace: [<ffffffff81195445>] put_page+0x55/0x60 [<ffffffff815d9f44>] e1000_clean_rx_ring+0x134/0x200 [<ffffffff815da055>] e1000_clean_all_rx_rings+0x45/0x60 [<ffffffff815df5e0>] e1000_down+0x1c0/0x1d0 [<ffffffff811e2260>] ? deactivate_slab+0x7f0/0x840 [<ffffffff815e21bc>] e1000_change_mtu+0xdc/0x170 [<ffffffff81647050>] dev_set_mtu+0xa0/0x140 [<ffffffff81664218>] do_setlink+0x218/0xac0 [<ffffffff814459e9>] ? nla_parse+0xb9/0x120 [<ffffffff816652d0>] rtnl_newlink+0x6d0/0x890 [<ffffffff8104f000>] ? kvm_clock_read+0x20/0x40 [<ffffffff810a2068>] ? sched_clock_cpu+0xa8/0x100 [<ffffffff81663802>] rtnetlink_rcv_msg+0x92/0x260 By setting the allocator to a dummy version, netpoll can't mess up our rx buffers. The allocator is set back to a sane value in e1000_configure_rx. Fixes: edbbb3ca1077 ("e1000: implement jumbo receive with partial descriptors") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
2015-01-27	i40e: adds FCoE configure option	Vasu Dev	3	-3/+14
	commit 776d4e9f5c0229037707f692b386b1f2a5bac054 upstream. Adds FCoE config option I40E_FCOE, so that FCoE can be enabled as needed but otherwise have it disabled by default. This also eliminate multiple FCoE config checks, instead now just one config check for CONFIG_I40E_FCOE. The I40E FCoE was added with 3.17 kernel and therefore this patch shall be applied to stable 3.17 kernel also. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-26	net: Check for presence of IFLA_AF_SPEC	Thomas Graf	1	-0/+2
	ndo_bridge_setlink() is currently only called on the slave if IFLA_AF_SPEC is set but this is a very fragile assumption and may change in the future. Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26	net: Validate IFLA_BRIDGE_MODE attribute length	Thomas Graf	1	-0/+3
	Payload is currently accessed blindly and may exceed valid message boundaries. Fixes: a77dcb8c8 ("be2net: set and query VEB/VEPA mode of the PF interface") Fixes: 815cccbf1 ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf") Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-23	igb: Fixes needed for surprise removal support	Carolyn Wyborny	1	-7/+16
	This patch adds some checks in order to prevent panic's on surprise removal of devices during S0, S3, S4. Without this patch, Thunderbolt type device removal will panic the system. Signed-off-by: Yanir Lubetkin <yanirx.lubetkin@intel.com> Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-23	ixgbe: fix use after free adapter->state test in ixgbe_remove/ixgbe_probe	Daniel Borkmann	1	-2/+6
	While working on a different issue, I noticed an annoying use after free bug on my machine when unloading the ixgbe driver: [ 8642.318797] ixgbe 0000:02:00.1: removed PHC on p2p2 [ 8642.742716] ixgbe 0000:02:00.1: complete [ 8642.743784] BUG: unable to handle kernel paging request at ffff8807d3740a90 [ 8642.744828] IP: [<ffffffffa01c77dc>] ixgbe_remove+0xfc/0x1b0 [ixgbe] [ 8642.745886] PGD 20c6067 PUD 81c1f6067 PMD 81c15a067 PTE 80000007d3740060 [ 8642.746956] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC [ 8642.748039] Modules linked in: [...] [ 8642.752929] CPU: 1 PID: 1225 Comm: rmmod Not tainted 3.18.0-rc2+ #49 [ 8642.754203] Hardware name: Supermicro X10SLM-F/X10SLM-F, BIOS 1.1b 11/01/2013 [ 8642.755505] task: ffff8807e34d3fe0 ti: ffff8807b7204000 task.ti: ffff8807b7204000 [ 8642.756831] RIP: 0010:[<ffffffffa01c77dc>] [<ffffffffa01c77dc>] ixgbe_remove+0xfc/0x1b0 [ixgbe] [...] [ 8642.774335] Stack: [ 8642.775805] ffff8807ee824098 ffff8807ee824098 ffffffffa01f3000 ffff8807ee824000 [ 8642.777326] ffff8807b7207e18 ffffffff8137720f ffff8807ee824098 ffff8807ee824098 [ 8642.778848] ffffffffa01f3068 ffff8807ee8240f8 ffff8807b7207e38 ffffffff8144180f [ 8642.780365] Call Trace: [ 8642.781869] [<ffffffff8137720f>] pci_device_remove+0x3f/0xc0 [ 8642.783395] [<ffffffff8144180f>] __device_release_driver+0x7f/0xf0 [ 8642.784876] [<ffffffff814421f8>] driver_detach+0xb8/0xc0 [ 8642.786352] [<ffffffff814414a9>] bus_remove_driver+0x59/0xe0 [ 8642.787783] [<ffffffff814429d0>] driver_unregister+0x30/0x70 [ 8642.789202] [<ffffffff81375c65>] pci_unregister_driver+0x25/0xa0 [ 8642.790657] [<ffffffffa01eb38e>] ixgbe_exit_module+0x1c/0xc8e [ixgbe] [ 8642.792064] [<ffffffff810f93a2>] SyS_delete_module+0x132/0x1c0 [ 8642.793450] [<ffffffff81012c61>] ? do_notify_resume+0x61/0xa0 [ 8642.794837] [<ffffffff816d2029>] system_call_fastpath+0x12/0x17 The issue is that test_and_set_bit() done on adapter->state is being performed after the netdevice has been freed via free_netdev(). When netdev is being allocated on initialization time, it allocates a private area, here struct ixgbe_adapter, that resides after the net_device structure. In ixgbe_probe(), the device init routine, we set up the adapter after alloc_etherdev_mq() on the private area and add a reference for the pci_dev as well via pci_set_drvdata(). Both in the error path of ixgbe_probe(), but also on module unload when ixgbe_remove() is being called, commit 41c62843eb6a ("ixgbe: Fix rcu warnings induced by LER") accesses adapter after free_netdev(). The patch stores the result in a bool and thus fixes above oops on my side. Fixes: 41c62843eb6a ("ixgbe: Fix rcu warnings induced by LER") Cc: stable <stable@vger.kernel.org> Cc: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-23	ixgbe: Correctly disable VLAN filter in promiscuous mode	Vlad Yasevich	1	-2/+2
	IXGBE adapter seems to require that VLAN filtering be enabled if VMDQ or SRIOV are enabled. When those functions are disabled, VLAN filtering may be disabled in promiscuous mode. Prior to commit a9b8943ee129 ("ixgbe: remove vlan_filter_disable and enable functions") The logic was correct. However, after the commit the logic got reversed and VLAN filtered in now turned on when VMDQ/SRIOV is disabled. This patch changes the condition to enable hw vlan filtered when VMDQ or SRIOV is enabled. Fixes: a9b8943ee129 ("ixgbe: remove vlan_filter_disable and enable functions") Cc: stable <stable@vger.kernel.org> CC: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-12	ixgbe: phy: fix uninitialized status in ixgbe_setup_phy_link_tnx	Daniel Borkmann	1	-3/+1
	Status variable is never initialized, can carry an arbitrary value on the stack and thus may let the function fail. Fixes: e90dd2645664 ("ixgbe: Make return values more direct") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-30	ixgbe: fix race when setting advertised speed	Emil Tantilov	1	-0/+4
	Following commands: modprobe ixgbe ifconfig ethX up ethtool -s ethX advertise 0x020 can lead to "setup link failed with code -14" error due to the setup_link call racing with the SFP detection routine in the watchdog. This patch resolves this issue by protecting the setup_link call with check for __IXGBE_IN_SFP_INIT. Reported-by: Scott Harrison <scoharr2@cisco.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-30	ixgbe: need not repeat init skb with NULL	Junwei Zhang	1	-1/+1
	Signed-off-by: Martin Zhang <martinbj2008@gmail.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-30	igb: don't reuse pages with pfmemalloc flag	Roman Gushchin	1	-1/+5
	Incoming packet is dropped silently by sk_filter(), if the skb was allocated from pfmemalloc reserves and the corresponding socket is not marked with the SOCK_MEMALLOC flag. Igb driver allocates pages for DMA with __skb_alloc_page(), which calls alloc_pages_node() with the __GFP_MEMALLOC flag. So, in case of OOM condition, igb can get pages with pfmemalloc flag set. If an incoming packet hits the pfmemalloc page and is large enough (small packets are copying into the memory, allocated with netdev_alloc_skb_ip_align(), so they are not affected), it will be dropped. This behavior is ok under high memory pressure, but the problem is that the igb driver reuses these mapped pages. So, packets are still dropping even if all memory issues are gone and there is a plenty of free memory. In my case, some TCP sessions hang on a small percentage (< 0.1%) of machines days after OOMs. Fix this by avoiding reuse of such pages. Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> Tested-by: Aaron Brown "aaron.f.brown@intel.com" Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-30	e1000: unset IFF_UNICAST_FLT on WMware 82545EM	Francesco Ruggeri	1	-1/+4
	VMWare's e1000 implementation does not seem to support unicast filtering. This can be observed by configuring a macvlan interface on eth0 in a VM in VMWare Fusion 5.0.5, and trying to use that interface instead of eth0. Tested on 3.16. Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-26	i40e: _MASK vs _SHIFT typo in i40e_handle_mdd_event()	Dan Carpenter	1	-2/+2
	We accidentally mask by the _SHIFT variable. It means that "event" is always zero. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-16	ixgbe: check for vfs outside of sriov_num_vfs before dereference	Emil Tantilov	1	-0/+3
	The check for vfinfo is not sufficient because it does not protect against specifying vf that is outside of sriov_num_vfs range. All of the ndo functions have a check for it except for ixgbevf_ndo_set_spoofcheck(). The following patch is all we need to protect against this panic: ip link set p96p1 vf 0 spoofchk off BUG: unable to handle kernel NULL pointer dereference at 0000000000000052 IP: [<ffffffffa044a1c1>] ixgbe_ndo_set_vf_spoofchk+0x51/0x150 [ixgbe] Reported-by: Thierry Herbelot <thierry.herbelot@6wind.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Acked-by: Thierry Herbelot <thierry.herbelot@6wind.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-16	fm10k: Add CONFIG_FM10K_VXLAN configuration option	Andy Zhou	2	-3/+14
	Compiling with CONFIG_FM10K=y and VXLAN=m resulting in linking error: drivers/built-in.o: In function `fm10k_open': (.text+0x1f9d7a): undefined reference to `vxlan_get_rx_port' make: *** [vmlinux] Error 1 The fix follows the same strategy as I40E. Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-16	fm10k: Unlock mailbox on VLAN addition failures	Matthew Vick	1	-3/+4
	After grabbing the mailbox lock and detecting an error, the lock must be released before the error code can be returned. Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-16	fm10k: Check the host state when bringing the interface up	Matthew Vick	1	-0/+1
	Set the flag to fetch the host state before kicking off the service task that reads the host state when bringing the interface back up. Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-14	fm10k: Add skb->xmit_more support	Alexander Duyck	1	-31/+34
	This change adds support for skb->xmit_more based on the changes that were made to igb to support the feature. The main changes are moving up the check for maybe_stop_tx so that we can check netif_xmit_stopped to determine if we must write the tail because we can add no further buffers. Acked-by: Matthew Vick <matthew.vick@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-10	ixgbe: fix race accessing page->_count	Eric Dumazet	1	-5/+3
	This is illegal to use atomic_set(&page->_count, 2) even if we 'own' the page. Other entities in the kernel need to use get_page_unless_zero() to get a reference to the page before testing page properties, so we could loose a refcount increment. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-10	igb: fix race accessing page->_count	Eric Dumazet	1	-4/+3
	This is illegal to use atomic_set(&page->_count, 2) even if we 'own' the page. Other entities in the kernel need to use get_page_unless_zero() to get a reference to the page before testing page properties, so we could loose a refcount increment. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-10	fm10k: fix race accessing page->_count	Eric Dumazet	1	-4/+3
	This is illegal to use atomic_set(&page->_count, 2) even if we 'own' the page. Other entities in the kernel need to use get_page_unless_zero() to get a reference to the page before testing page properties, so we could loose a refcount increment. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-10	networking: fm10k: Fix build failure	Pranith Kumar	1	-0/+1
	The latest linus git tip (3.18-rc1) fails with the following build failure. Fix this by making PTP support explicit for fm10k driver. rivers/built-in.o: In function `fm10k_ptp_register': (.text+0x12e760): undefined reference to `ptp_clock_registER' drivers/built-in.o: In function `fm10k_ptp_unregister': (.text+0x12e7dc): undefined reference to `ptp_clock_unregister' Makefile:930: recipe for target 'vmlinux' failed Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-09	i40e: skb->xmit_more support	Eric Dumazet	1	-44/+46
	Support skb->xmit_more in i40e is straightforward : we need to move around i40e_maybe_stop_tx() call to correctly test netif_xmit_stopped() before taking the decision to not kick the NIC. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-02	igb: bump version to 5.2.15	Todd Fujinaka	1	-1/+1
	Bump version Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02	i40e/igb: Convert to dev_consume_skb_any()	Rick Jones	2	-2/+2
	Convert two more Intel NIC drivers to dev_consume_skb_any() to help make dropped packet profiling sane. Signed-off-by: Rick Jones <rick.jones2@hp.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02	igb: remove blocking phy read from inside spinlock	Bernhard Kaindl	3	-18/+0
	Remove a source of latency spikes (in my case up to 10ms) by not calling code that uses mdelay() for feeding a phy statistic (rx errors for idle symbols - not data -> idle_errors) while being called with a spinlock held. As idle_errors isn't read, this patch only removes unused code and data. Later, more complicated changes may be applied to address the spinlock and allow for some PHY diagnostics by harvesting this PHY stats register fully. This patch is designed to fix the issue and be safe for longterm/stable. For the Intel e1000e driver, the same change was applied in 2008 with commit 23033fad5be0 ("e1000e: remove phy read from inside spinlock"). The mdelay is triggered by HW/SW semaphores, thus it depends on the HW. I've HW that triggers it even when idle. Others may trigger it only e.g. when Ethernet ports aquire or loose the link or on ifconfig up / down. We've noticed this first from delays in frame rx/tx due to the mdelay(). Example command for checking if the issue is triggered: cyclictest -Smp1 (Look for occasional "Max:" values > 4000 or use -b 4000 to stop if greater) It was observed with I350 ports connected to other I350 ports, but not if driver and EEPROM was modified to run the I350 in EEPROM-less mode. phy_stats.idle_errors and .receive_errors (isn't touched) occupy 64 not used bits in the adapter struct: Their allocation may be removed as well. Cc: Carolyn Wyborny <carolyn.wyborny@intel.com> Cc: Todd Fujinaka <todd.fujinaka@intel.com> Fixes: 12dcd86b75d5 ("igb: fix stats handling") (this added the spin_lock) Signed-off-by: Bernhard Kaindl <bk-linux@use.startmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02	ixgbe: delete one duplicate marcro definition of IXGBE_MAX_L2A_QUEUES	Ethan Zhao	1	-1/+0
	There is typo in ixgbe.h, two marcro definition of IXGBE_MAX_L2A_QUEUES to 4, delete one, clear the compiler warning. Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02	ixgbe: fix setting of TXDCTL.WTRHESH when ITR is set to 0 and no BQL	Emil Tantilov	2	-6/+1
	This patch consolidates the logic behind dynamically setting TXDCTL.WTHRESH depending on interrupt throttle rate (ITR) setting regardless of BQL. Previously TXDCTL.WTHRESH was dynamically being set only with BQL being enabled, but we have to set it regardless of BQL when ITR is low to avoid Tx stalls/hangs. CC: John Greene <jogreene@redhat.com> Reported by: Masayuki Gouji <gouji.masayuki@jp.fujitsu.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02	ixgbe: remove wait loop on autoneg for copper devices	Emil Tantilov	1	-41/+0
	This patch removes couple of wait loops on autoneg that are not needed. During validation we noticed that the loops always time out, so there should be no user impact. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02	ixgbe: Convert the normal transmit complete path to dev_consume_skb_any()	Rick Jones	1	-1/+1
	Convert the normal packet completion path to dev_consume_skb_any() so packet drop profiling via dropwatch or perf top -G -e skb_kfree_skb is not cluttered with false hits. Compile tested only. There is a dev_kfree_skb_any() in the routine ixgbe_ptp_tx_hwtstamp() in ixgbe_ptp.c that looks like a conversion candidate but I wasn't familiar enough with the code to pull the trigger. Signed-off-by: Rick Jones <rick.jones2@hp.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02	fm10k: Correctly set the number of Tx queues	Alexander Duyck	1	-1/+5
	The number of Tx queues was not being updated due to some issues when generating the patches. This change makes sure to add the lines necessary to update the number of Tx queues correctly. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02	fm10k: Reduce buffer size when pages are larger than 4K	Alexander Duyck	1	-6/+2
	This change reduces the buffer size to 2K for all page sizes. The basic idea is that since most frames only have a 1500 MTU supporting a buffer size larger than this is somewhat wasteful. As such I have reduced the size to 2K for all page sizes which will allow for more uses per page. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-01	fm10k: using vmalloc requires including linux/vmalloc.h	Stephen Rothwell	1	-0/+2
	Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-23	fm10k: Add support for PTP	Alexander Duyck	7	-1/+683
	This change adds support for the Linux PTP Hardware clock and timestamping functionality provided by the hardware. There are actually two cases that this timestamping is meant to support. The first case would be an ordinary clock scenario. In this configuration the host interface does not have access to BAR 4. However all of the host interfaces should be locked into the same boundary clock region and as such they are all on the same clock anyway. With this being the case they can synchronize among themselves and only need to adjust the offset since they are all on the same clock with the same frequency. The second case is a boundary clock scenario. This is a special case and would require both BAR 4 access, and a means of presenting a netdev per boundary region. The current plan is to use DSA at some point in the future to provide these interfaces, but the DSA portion is still under development. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-23	fm10k: Add support for ptp to hw specific files	Alexander Duyck	5	-0/+186
	This change adds the messaging support needed to support PTP. In the case of Tx timestamps it is necessary for the Switch Management entity to return the frames via the mailbox as the host interface cannot know which port the timestamp will be delivered to. In addition there is only one clock on the entire switch, as such the entity that has BAR 4 access is the only one who can actually update the frequency as it is the only one with access. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-23	fm10k: Add support for debugfs	Alexander Duyck	5	-1/+298
	This patch adds limited debugfs support for the driver. Most of the functionality needed for dumping registers is already provided via ethtool. The only thing we saw that we really neeed was the ability to dump the descriptor rings so as such this patch will add a fm10k directory containing a listing of directories each one with a unique PCI Bus, Device, and Function number. Each of those BDF directories will have a list of q_vectors, and the q_vectors will contain a file for each of the Rx/Tx rings that are a part of the vector. For example: # ls -RD /sys/kernel/debug/fm10k/ /sys/kernel/debug/fm10k/: 0000:01:00.0 /sys/kernel/debug/fm10k/0000:01:00.0: q_vector.000 q_vector.001 q_vector.002 q_vector.003 /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.000: rx_ring.000 tx_ring.000 /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.001: rx_ring.001 tx_ring.001 /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.002: rx_ring.002 tx_ring.002 /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.003: rx_ring.003 tx_ring.003 # cat /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.000/rx_ring.000 DES DATA RSS STATERR LENGTH VLAN DGLORT SGLORT TIMESTAMP --------------------------------------------------------------------------- 000 0x00000000 0x00000000 0x00000003 0x002a 0x0000 0x0000 0x0000 0x13951807dc4fedf0 001 0x00000000 0x00000000 0x00000003 0x002a 0x0000 0x0000 0x0000 0x1395180906c9f2c8 002 0x3731c000 0x00000000 0x00000000 0x0000 0x0000 0x0000 0x0000 0x0000000000000000 003 0x3731d000 0x00000000 0x00000000 0x0000 0x0000 0x0000 0x0000 0x0000000000000000 004 0xaab3a000 0x00000000 0x00000000 0x0000 0x0000 0x0000 0x0000 0x0000000000000000 ... # cat /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.000/tx_ring.000 DES BUFFER_ADDRESS LENGTH VLAN MSS HDRLEN FLAGS --------------------------------------------------------- 000 0x00000000aa8a1002 0x005a 0x0000 0x0000 0x0000 0xc0 001 0x00000000aa8a2002 0x005a 0x0000 0x0000 0x0000 0xc0 002 0x000000006bc13202 0x004e 0x0000 0x0000 0x0000 0xc0 003 0x000000006bc13c02 0x002a 0x0000 0x0000 0x0000 0xe1 004 0x000000006bc13602 0x0062 0x0000 0x0000 0x0000 0xc0 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-23	fm10k: Add support for IEEE DCBx	Alexander Duyck	4	-4/+185
	This patch adds support for management of the limited QOS features of the FM10000 interface. Specifically we can support up to 8 traffic classes, however the part only provides 1 Rx and 1 Tx FIFO in the host interface and as a result this can lead to head-of-line blocking on Rx. This can be avoided by setting PFC only for priorities that cannot afford to drop frames. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-23	fm10k: Add support for SR-IOV to driver	Alexander Duyck	5	-2/+614
	This patch combines the recently added VF messaging and configuration functionality with the interfaces provided by the kernel to allow for configuration and management of SR-IOV. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-23	fm10k: Add support for SR-IOV to PF core files	Alexander Duyck	3	-1/+874
	This change adds a set of functions to fm10k_pf.c which allows for configuring the VF via a set of standardized TLV messages. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-23	fm10k: Add support for VF	Alexander Duyck	9	-4/+837
	This patch provides the functions necessary to configure the VF making use of the same API pointers as the PF. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-23	fm10k: Add support for PF <-> VF mailbox	Alexander Duyck	2	-0/+834
	This patch adds support for the PF <-> VF mailbox. It functions similar to the PF <-> SM mailbox however there are several modifications made to improve the reliability of the mailbox itself. In addition the PF/VF mailbox is much smaller an only supports a total size of 16 DWORDs vs the 1024 DWORDS provided for the PF/SM mailbox. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-23	fm10k: Add support for MACVLAN acceleration	Alexander Duyck	4	-1/+198
	This patch adds support for L2 MACVLAN by making use of the fact that the RRC provides a unique tag per filter called a Global Resource Tag, or GLORT. In the case of this offload what I have done is assigned a linear block of these so that each GLORT represents one of the MACVLAN netdevs. By doing this I can share the Rx queues and Tx queues for all of the MACVLAN netdevs while allowing them to be demuxed in the Rx cleanup path. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-23	fm10k: Add support for netdev offloads	Alexander Duyck	2	-3/+459
	This patch adds support for basic offloads including TSO, Tx checksum, Rx checksum, Rx hash, and the same features applied to VXLAN/NVGRE tunnels. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-23	fm10k: Add support for multiple queues	Alexander Duyck	4	-0/+247
	This patch takes the driver from supporting a single queue to supporting multiple queues. The upper queue limit for the PF is 128 queues and the upper limit for the VF is (128 / num_vfs) rounded down to nearest power of 2. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-23	fm10k: Add support for PCI power management and error handling	Alexander Duyck	1	-0/+221
	Add PCI power management and error handling to allow the device to support suspend/resume and recovery of any PCIe errors. The fm10k devices do not support wake on LAN, and there is no plan to add this as a feature. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-23	fm10k: Add ethtool support	Alexander Duyck	4	-1/+873
	This patch adds basic ethtool support to the device to allow for configuration. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-23	fm10k: Add transmit and receive fastpath and interrupt handlers	Alexander Duyck	4	-2/+1038
	This change adds the transmit and receive fastpath and interrupt handlers. With this code in place the network device is now able to send and receive frames over the network interface using a single queue. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> CC: Rick Jones <rick.jones2@hp.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-09-23	fm10k: Add Tx/Rx hardware ring bring-up/tear-down	Alexander Duyck	3	-0/+653
	This patch adds support for allocating, configuring, and freeing Tx/Rx ring resources. With these changes in place the descriptor queues are in a state where they are ready to transmit or receive if provided buffers. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>