summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/igb/igb_main.c
AgeCommit message (Collapse)AuthorFilesLines
2019-11-12igb: Fix constant media auto sense switching when no cable is connectedManfred Rudigier1-1/+2
[ Upstream commit 8d5cfd7f76a2414e23c74bb8858af7540365d985 ] At least on the i350 there is an annoying behavior that is maybe also present on 82580 devices, but was probably not noticed yet as MAS is not widely used. If no cable is connected on both fiber/copper ports the media auto sense code will constantly swap between them as part of the watchdog task and produce many unnecessary kernel log messages. The swap code responsible for this behavior (switching to fiber) should not be executed if the current media type is copper and there is no signal detected on the fiber port. In this case we can safely wait until the AUTOSENSE_EN bit is cleared. Signed-off-by: Manfred Rudigier <manfred.rudigier@omicronenergy.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-08igb: Fix WARN_ONCE on runtime suspendArvind Sankar1-49/+8
[ Upstream commit dabb8338be533c18f50255cf39ff4f66d4dabdbe ] The runtime_suspend device callbacks are not supposed to save configuration state or change the power state. Commit fb29f76cc566 ("igb: Fix an issue that PME is not enabled during runtime suspend") changed the driver to not save configuration state during runtime suspend, however the driver callback still put the device into a low-power state. This causes a warning in the pci pm core and results in pci_pm_runtime_suspend not calling pci_save_state or pci_finish_runtime_suspend. Fix this by not changing the power state either, leaving that to pci pm core, and make the same change for suspend callback as well. Also move a couple of defines into the appropriate header file instead of inline in the .c file. Fixes: fb29f76cc566 ("igb: Fix an issue that PME is not enabled during runtime suspend") Signed-off-by: Arvind Sankar <niveditas98@gmail.com> Reviewed-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-12igb: Fix an issue that PME is not enabled during runtime suspendKai-Heng Feng1-3/+5
[ Upstream commit 1fb3a7a75e2efcc83ef21f2434069cddd6fae6f5 ] I210 ethernet card doesn't wakeup when a cable gets plugged. It's because its PME is not set. Since commit 42eca2302146 ("PCI: Don't touch card regs after runtime suspend D3"), if the PCI state is saved, pci_pm_runtime_suspend() stops calling pci_finish_runtime_suspend(), which enables the PCI PME. To fix the issue, let's not to save PCI states when it's runtime suspend, to let the PCI subsystem enables PME. Fixes: 42eca2302146 ("PCI: Don't touch card regs after runtime suspend D3") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-08-03igb: Fix queue selection on MAC filters on i210Vinicius Costa Gomes1-2/+7
[ Upstream commit 4dc93fcf0b95dc3fda4db917effae31fbb8ad2a8 ] On the RAH registers there are semantic differences on the meaning of the "queue" parameter for traffic steering depending on the controller model: there is the 82575 meaning, which "queue" means a RX Hardware Queue, and the i350 meaning, where it is a reception pool. The previous behaviour was having no effect for i210 based controllers because the QSEL bit of the RAH register wasn't being set. This patch separates the condition in discrete cases, so the different handling is clearer. Fixes: 83c21335c876 ("igb: improve MAC filter handling") Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-26igb: Allow to remove administratively set MAC on VFsCorinna Vinschen1-11/+31
[ Upstream commit 177132df5e45b134c147f419f567a3b56aafaf2b ] Before libvirt modifies the MAC address and vlan tag for an SRIOV VF for use by a virtual machine (either using vfio device assignment or macvtap passthru mode), it saves the current MAC address and vlan tag so that it can reset them to their original value when the guest is done. Libvirt can't leave the VF MAC set to the value used by the now-defunct guest since it may be started again later using a different VF, but it certainly shouldn't just pick any random value, either. So it saves the state of everything prior to using the VF, and resets it to that. The igb driver initializes the MAC addresses of all VFs to 00:00:00:00:00:00, and reports that when asked (via an RTM_GETLINK netlink message, also visible in the list of VFs in the output of "ip link show"). But when libvirt attempts to restore the MAC address back to 00:00:00:00:00:00 (using an RTM_SETLINK netlink message) the kernel responds with "Invalid argument". Forbidding a reset back to the original value leaves the VF MAC at the value set for the now-defunct virtual machine. Especially on a system with NetworkManager enabled, this has very bad consequences, since NetworkManager forces all interfacess to be IFF_UP all the time - if the same virtual machine is restarted using a different VF (or even on a different host), there will be multiple interfaces watching for traffic with the same MAC address. To allow libvirt to revert to the original state, we need a way to remove the administrative set MAC on a VF, to allow normal host operation again, and to reset/overwrite the VF MAC via VF netdev. This patch implements the outlined scenario by allowing to set the VF MAC to 00:00:00:00:00:00 via RTM_SETLINK on the PF. igb_ndo_set_vf_mac resets the IGB_VF_FLAG_PF_SET_MAC flag to 0, so it's possible to reset the VF MAC back to the original value via the VF netdev. Note: Recent patches to libvirt allow for a workaround if the NIC isn't capable of resetting the administrative MAC back to all 0, but in theory the NIC should allow resetting the MAC in the first place. Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Tested-by: Aaron Brown <arron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-03igb: Free IRQs when device is hotpluggedLyude Paul1-1/+1
commit 888f22931478a05bc81ceb7295c626e1292bf0ed upstream. Recently I got a Caldigit TS3 Thunderbolt 3 dock, and noticed that upon hotplugging my kernel would immediately crash due to igb: [ 680.825801] kernel BUG at drivers/pci/msi.c:352! [ 680.828388] invalid opcode: 0000 [#1] SMP [ 680.829194] Modules linked in: igb(O) thunderbolt i2c_algo_bit joydev vfat fat btusb btrtl btbcm btintel bluetooth ecdh_generic hp_wmi sparse_keymap rfkill wmi_bmof iTCO_wdt intel_rapl x86_pkg_temp_thermal coretemp crc32_pclmul snd_pcm rtsx_pci_ms mei_me snd_timer memstick snd pcspkr mei soundcore i2c_i801 tpm_tis psmouse shpchp wmi tpm_tis_core tpm video hp_wireless acpi_pad rtsx_pci_sdmmc mmc_core crc32c_intel serio_raw rtsx_pci mfd_core xhci_pci xhci_hcd i2c_hid i2c_core [last unloaded: igb] [ 680.831085] CPU: 1 PID: 78 Comm: kworker/u16:1 Tainted: G O 4.15.0-rc3Lyude-Test+ #6 [ 680.831596] Hardware name: HP HP ZBook Studio G4/826B, BIOS P71 Ver. 01.03 06/09/2017 [ 680.832168] Workqueue: kacpi_hotplug acpi_hotplug_work_fn [ 680.832687] RIP: 0010:free_msi_irqs+0x180/0x1b0 [ 680.833271] RSP: 0018:ffffc9000030fbf0 EFLAGS: 00010286 [ 680.833761] RAX: ffff8803405f9c00 RBX: ffff88033e3d2e40 RCX: 000000000000002c [ 680.834278] RDX: 0000000000000000 RSI: 00000000000000ac RDI: ffff880340be2178 [ 680.834832] RBP: 0000000000000000 R08: ffff880340be1ff0 R09: ffff8803405f9c00 [ 680.835342] R10: 0000000000000000 R11: 0000000000000040 R12: ffff88033d63a298 [ 680.835822] R13: ffff88033d63a000 R14: 0000000000000060 R15: ffff880341959000 [ 680.836332] FS: 0000000000000000(0000) GS:ffff88034f440000(0000) knlGS:0000000000000000 [ 680.836817] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 680.837360] CR2: 000055e64044afdf CR3: 0000000001c09002 CR4: 00000000003606e0 [ 680.837954] Call Trace: [ 680.838853] pci_disable_msix+0xce/0xf0 [ 680.839616] igb_reset_interrupt_capability+0x5d/0x60 [igb] [ 680.840278] igb_remove+0x9d/0x110 [igb] [ 680.840764] pci_device_remove+0x36/0xb0 [ 680.841279] device_release_driver_internal+0x157/0x220 [ 680.841739] pci_stop_bus_device+0x7d/0xa0 [ 680.842255] pci_stop_bus_device+0x2b/0xa0 [ 680.842722] pci_stop_bus_device+0x3d/0xa0 [ 680.843189] pci_stop_and_remove_bus_device+0xe/0x20 [ 680.843627] trim_stale_devices+0xf3/0x140 [ 680.844086] trim_stale_devices+0x94/0x140 [ 680.844532] trim_stale_devices+0xa6/0x140 [ 680.845031] ? get_slot_status+0x90/0xc0 [ 680.845536] acpiphp_check_bridge.part.5+0xfe/0x140 [ 680.846021] acpiphp_hotplug_notify+0x175/0x200 [ 680.846581] ? free_bridge+0x100/0x100 [ 680.847113] acpi_device_hotplug+0x8a/0x490 [ 680.847535] acpi_hotplug_work_fn+0x1a/0x30 [ 680.848076] process_one_work+0x182/0x3a0 [ 680.848543] worker_thread+0x2e/0x380 [ 680.848963] ? process_one_work+0x3a0/0x3a0 [ 680.849373] kthread+0x111/0x130 [ 680.849776] ? kthread_create_worker_on_cpu+0x50/0x50 [ 680.850188] ret_from_fork+0x1f/0x30 [ 680.850601] Code: 43 14 85 c0 0f 84 d5 fe ff ff 31 ed eb 0f 83 c5 01 39 6b 14 0f 86 c5 fe ff ff 8b 7b 10 01 ef e8 b7 e4 d2 ff 48 83 78 70 00 74 e3 <0f> 0b 49 8d b5 a0 00 00 00 e8 62 6f d3 ff e9 c7 fe ff ff 48 8b [ 680.851497] RIP: free_msi_irqs+0x180/0x1b0 RSP: ffffc9000030fbf0 As it turns out, normally the freeing of IRQs that would fix this is called inside of the scope of __igb_close(). However, since the device is already gone by the point we try to unregister the netdevice from the driver due to a hotplug we end up seeing that the netif isn't present and thus, forget to free any of the device IRQs. So: make sure that if we're in the process of dismantling the netdev, we always allow __igb_close() to be called so that IRQs may be freed normally. Additionally, only allow igb_close() to be called from __igb_close() if it hasn't already been called for the given adapter. Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: 9474933caf21 ("igb: close/suspend race in netif_device_detach") Cc: Todd Fujinaka <todd.fujinaka@intel.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-25igb: check memory allocation failureChristophe JAILLET1-0/+2
[ Upstream commit 18eb86362a52f0af933cc0fd5e37027317eb2d1c ] Check memory allocation failures and return -ENOMEM in such cases, as already done for other memory allocations in this function. This avoids NULL pointers dereference. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Acked-by: PJ Waskiewicz <peter.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-30igb: Use smp_rmb rather than read_barrier_dependsBrian King1-1/+1
commit c4cb99185b4cc96c0a1c70104dc21ae14d7e7f28 upstream. The original issue being fixed in this patch was seen with the ixgbe driver, but the same issue exists with igb as well, as the code is very similar. read_barrier_depends is not sufficient to ensure loads following it are not speculatively loaded out of order by the CPU, which can result in stale data being loaded, causing potential system crashes. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-26igb: Fix TX map failure pathJean-Philippe Brucker1-1/+1
When the driver cannot map a TX buffer, instead of rolling back gracefully and retrying later, we currently get a panic: [ 159.885994] igb 0000:00:00.0: TX DMA map failed [ 159.886588] Unable to handle kernel paging request at virtual address ffff00000a08c7a8 ... [ 159.897031] PC is at igb_xmit_frame_ring+0x9c8/0xcb8 Fix the erroneous test that leads to this situation. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-09igb: do not drop PF mailbox lock after read of VF messageGreg Edwards1-4/+10
When the PF receives a mailbox message from the VF, it grabs the mailbox lock, reads the VF message from the mailbox, ACKs the message and drops the lock. While the PF is performing the action for the VF message, nothing prevents another VF message from being posted to the mailbox. The current code handles this condition by just dropping any new VF messages without processing them. This results in a mailbox timeout in the VM for posted messages waiting for an ACK, and the VF is reset by the igbvf_watchdog_task in the VM. Given the right sequence of VF messages and mailbox timeouts, this condition can go on ad infinitum. Modify the PF mailbox read method to take an 'unlock' argument that optionally leaves the mailbox locked by the PF after reading the VF message. This ensures another VF message is not posted to the mailbox until after the PF has completed processing the VF message and written its reply. Signed-off-by: Greg Edwards <gedwards@ddn.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-09igb: Remove incorrect "unexpected SYS WRAP" log messageCorinna Vinschen1-2/+0
TSAUXC.DisableSystime is never set, so SYSTIM runs into a SYS WRAP every 1100 secs on 80580/i350/i354 (40 bit SYSTIM) and every 35000 secs on 80576 (45 bit SYSTIM). This wrap event sets the TSICR.SysWrap bit unconditionally. However, checking TSIM at interrupt time shows that this event does not actually cause the interrupt. Rather, it's just bycatch while the actual interrupt is caused by, for instance, TSICR.TXTS. The conclusion is that the SYS WRAP is actually expected, so the "unexpected SYS WRAP" message is entirely bogus and just helps to confuse users. Drop it. Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-09igb: protect TX timestamping from API misuseCliff Spradlin1-1/+2
HW timestamping can only be requested for a packet if the NIC is first setup via ioctl(SIOCSHWTSTAMP). If this step was skipped, then the igb driver still allowed TX packets to request HW timestamping. In this situation, the _IGB_PTP_TX_IN_PROGRESS flag was set and would never clear. This prevented any future HW timestamping requests to succeed. Fix this by checking that the NIC is configured for HW TX timestamping before accepting a HW TX timestamping request. Signed-off-by: Cliff Spradlin <cspradlin@google.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-08-09igb: Fix error of RX network flow classificationGangfeng Huang1-2/+2
After add an ethertype filter, if user change the adapter speed several times, the error "ethtool -N: etype filters are all used" is reported by igb driver. In older patch, function igb_nfc_filter_exit() and igb_nfc_filter_restore() is not paried. igb_nfc_filter_restore() exist in igb_up(), but function igb_nfc_filter_exit() is exist in __igb_close(). In the process of speed changing, only igb_nfc_filter_restore() is called, it will take a position of ethertype bitmap. Reproduce steps: Step 1: Add a etype filter by ethtool $ethtool -N eth0 flow-type ether proto 0x88F8 action 1 Step 2: Change the adapter speed to 100M/full duplex $ethtool -s eth0 speed 100 duplex full Step 3: Change the adapter speed to 1000M/full duplex ethtool -s eth0 speed 1000 duplex full Repeat step2 and step3, then dmesg the system log, you can find the error message, add new ethtype filter is also failed. This fixing is move igb_nfc_filter_exit() from __igb_close() to igb_down() to make igb_nfc_filter_restore()/igb_nfc_filter_exit() is paired. Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-06-08igb: make a few local functions staticColin Ian King1-6/+6
Clean up a few sparse warnings, these following functions can be made static: drivers/net/ethernet/intel/igb/igb_main.c: warning: symbol 'igb_add_mac_filter' was not declared. Should it be static? drivers/net/ethernet/intel/igb/igb_main.c: warning: symbol 'igb_del_mac_filter' was not declared. Should it be static? drivers/net/ethernet/intel/igb/igb_main.c: warning: symbol 'igb_set_vf_mac_filter' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-06-06igb: Remove useless argumentBenjamin Poirier1-5/+5
Given that all callers of igb_update_stats() pass the same two arguments: (adapter, &adapter->stats64), the second argument can be removed. Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-06-06igb: check for Tx timestamp timeouts during watchdogJacob Keller1-0/+1
The igb driver has logic to handle only one Tx timestamp at a time, using a state bit lock to avoid multiple requests at once. It may be possible, if incredibly unlikely, that a Tx timestamp event is requested but never completes. Since we use an interrupt scheme to determine when the Tx timestamp occurred we would never clear the state bit in this case. Add an igb_ptp_tx_hang() function similar to the already existing igb_ptp_rx_hang() function. This function runs in the watchdog routine and makes sure we eventually recover from this case instead of permanently disabling Tx timestamps. Note: there is no currently known way to cause this without hacking the driver code to force it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-06-06igb: add statistic indicating number of skipped Tx timestampsJacob Keller1-0/+2
The igb driver can only handle one Tx timestamp request at a time. This means it is possible for an application timestamp request to be ignored. There is no easy way for an administrator to determine if this occurred. Add a new statistic which tracks this, tx_hwtstamp_skipped. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-06-06igb: avoid permanent lock of *_PTP_TX_IN_PROGRESSJacob Keller1-5/+18
The igb driver uses a state bit lock to avoid handling more than one Tx timestamp request at once. This is required because hardware is limited to a single set of registers for Tx timestamps. The state bit lock is not properly cleaned up during igb_xmit_frame_ring() if the transmit fails such as due to DMA or TSO failure. In some hardware this results in blocking timestamps until the service task times out. In other hardware this results in a permanent lock of the timestamp bit because we never receive an interrupt indicating the timestamp occurred, since indeed the packet was never transmitted. Fix this by checking for DMA and TSO errors in igb_xmit_frame_ring() and properly cleaning up after ourselves when these occur. Reported-by: Reported-by: David Mirabito <davidm@metamako.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-06-06igb: mark PM functions as __maybe_unusedArnd Bergmann1-13/+5
The new wake function is only used by the suspend/resume handlers that are defined in inside of an #ifdef, which can cause this harmless warning: drivers/net/ethernet/intel/igb/igb_main.c:7988:13: warning: 'igb_deliver_wake_packet' defined but not used [-Wunused-function] Removing the #ifdef, instead using a __maybe_unused annotation simplifies the code and avoids the warning. Fixes: b90fa8763560 ("igb: Enable reading of wake up packet") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-04-21igb: Enable reading of wake up packetKim Tatt Chuah1-1/+35
Currently, in igb_resume(), igb driver ignores the Wake Up Status (WUS) and Wake Up Packet Memory (WUPM) registers. This patch enables the igb driver to read the WUPM if the controller was woken by a wake up packet that is not more than 128 bytes long (maximum WUPM size), then pass it up the kernel network stack. Signed-off-by: Kim Tatt Chuah <kim.tatt.chuah@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-04-21igb/igbvf: Add VF MAC filter request capabilitiesYury Kylulin1-13/+134
Add functionality for the VF to request up to 3 additional MAC filters. This is done using existing E1000_VF_SET_MAC_ADDR message, but with additional message info - E1000_VF_MAC_FILTER_CLR to clear all unicast MAC filters previously set for this VF and E1000_VF_MAC_FILTER_ADD to add MAC filter. Additional filters can be added only in case if administrator did not set VF MAC explicitly and allowed to change default MAC to the VF. Due to the limited number of RAR entries reserve at least 3 MAC filters for the PF. If SRIOV is supported by the NIC after this change RAR entries starting from 1 to (RAR MAX ENTRIES - NUM SRIOV VFS) will be used for PF and VF MAC filters. Signed-off-by: Yury Kylulin <yury.kylulin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-04-21igb: improve MAC filter handlingYury Kylulin1-65/+181
Using the work which was done for ixgbe driver by Jacob Keller commit 5d7daa35b9eb ("ixgbe: improve mac filter handling") and Alexander Duyck commit 0f079d22834a ("ixgbe: Use __dev_uc_sync and __dev_uc_unsync for unicast addresses") and out-of-tree igb driver add functionality to manage (add and delete) MAC filters. Signed-off-by: Yury Kylulin <yury.kylulin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb/ixgbe: Fix typo in igb_build_skb and/or ixgbe_build_skb code commentAlexander Duyck1-1/+1
There was a typo that I had left in the code comments for the igb and ixgbe functions that enabled build_skb support. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Re-add support for build_skb in igbAlexander Duyck1-0/+47
This reverts commit f9d40f6a9921 ("igb: Revert support for build_skb in igb") and adds a few changes to update it to work with the latest version of igb. We are now able to revert the removal of this due to the fact that with the recent changes to the page count and the use of DMA_ATTR_SKIP_CPU_SYNC we can make the pages writable so we should not be invalidating the additional data added when we call build_skb. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Break out Rx buffer page managementAlexander Duyck1-114/+121
At this point we have 2 to 3 paths that can be taken depending on what Rx modes are enabled. In order to better support that and improve the maintainability I am breaking out the common bits from those paths and making them into their own functions. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Add support for padding packetAlexander Duyck1-2/+12
With the size of the frame limited we can now write to an offset within the buffer instead of having to write at the very start of the buffer. The advantage to this is that it allows us to leave padding room for things like supporting XDP in the future. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Add support for using order 1 pages to receive large framesAlexander Duyck1-17/+47
This patch adds support for using 3K buffers in order 1 pages the same way we were using 2K buffers in 4K pages. We are reserving 1K of room for now to have space available for future headroom and tailroom when we enable build_skb support. One side effect of this patch is that we can end up using a larger buffer if jumbo frames is enabled. The impact shouldn't be too great, but it could hurt small packet performance for UDP workloads if jumbo frames is enabled as the truesize of frames will be larger. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Use page_address offset from page instead of masking virtual addressAlexander Duyck1-6/+5
Update the handling of page addresses so that we always refer to them using a void pointer, and try to use the consistent name of va indicating we are working with a virtual address. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Limit maximum frame Rx based on MTUAlexander Duyck1-4/+17
In order to support the use of build_skb going forward it will be necessary to place a maximum limit on the amount of data we can receive when jumbo frames is not enabled. In order to do this I am adding a new upper limit for receive based on the size of a 2K buffer minus padding. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Don't bother clearing Tx buffer_info in igb_clean_tx_ringAlexander Duyck1-47/+73
In the case of the Tx rings we need to only clear the Tx buffer_info when we are resetting the rings. Ideally we do this when we configure the ring to bring it back up instead of when we are taking it down in order to avoid dirtying pages we don't need to. In addition we don't need to clear the Tx descriptor ring since we will fully repopulate it when we begin transmitting frames and next_to_watch can be cleared to prevent the ring from being cleaned beyond that point instead of needing to touch anything in the Tx descriptor ring. Finally with these changes we can avoid having to reset the skb member of the Tx buffer_info structure in the cleanup path since the skb will always be associated with the first buffer which has next_to_watch set. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Clear Rx buffer_info in configure instead of cleanAlexander Duyck1-14/+10
This change makes it so that instead of going through the entire ring on Rx cleanup we only go through the region that was designated to be cleaned up and stop when we reach the region where new allocations should start. In addition we can avoid having to perform a memset on the Rx buffer_info structures until we are about to start using the ring again. By deferring this we can avoid dirtying the cache any more than we have to which can help to improve the time needed to bring the interface down and then back up again in a reset or suspend/resume cycle. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Use length to determine if descriptor is doneAlexander Duyck1-6/+8
This change makes it so that we use the length of the packet instead of the DD status bit to determine if a new descriptor is ready to be processed. The obvious advantage is that it cuts down on reads as we don't really even need the DD bit if going from a 0 to a non-zero value on size is enough to inform us that the packet has been completed. In addition I have updated the code so that we only reset the Rx descriptor length for descriptor zero when resetting a ring instead of having to do a memset with 0 over the entire ring. By doing this we can save some time on initialization. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-03-17igb: Add support for DMA_ATTR_WEAK_ORDERINGAlexander Duyck1-3/+3
Since we are already using DMA attributes in igb for Rx there is no reason why we can't also apply DMA_ATTR_WEAK_ORDERING which is needed on some platforms to improve performance. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-01-19net: Remove usage of net_device last_rx memberTobias Klauser1-3/+3
The network stack no longer uses the last_rx member of struct net_device since the bonding driver switched to use its own private last_rx in commit 9f242738376d ("bonding: use last_arp_rx in slave_last_rx()"). However, some drivers still (ab)use the field for their own purposes and some driver just update it without actually using it. Previously, there was an accompanying comment for the last_rx member added in commit 4dc89133f49b ("net: add a comment on netdev->last_rx") which asked drivers not to update is, unless really needed. However, this commend was removed in commit f8ff080dacec ("bonding: remove useless updating of slave->dev->last_rx"), so some drivers added later on still did update last_rx. Remove all usage of last_rx and switch three drivers (sky2, atp and smc91c92_cs) which actually read and write it to use their own private copy in netdev_priv. Compile-tested with allyesconfig and allmodconfig on x86 and arm. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Mirko Lindner <mlindner@marvell.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-3/+3
Two AF_* families adding entries to the lockdep tables at the same time. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11mm: rename __page_frag functions to __page_frag_cache, drop order from drainAlexander Duyck1-3/+3
This patch does two things. First it goes through and renames the __page_frag prefixed functions to __page_frag_cache so that we can be clear that we are draining or refilling the cache, not the frags themselves. Second we drop the order parameter from __page_frag_cache_drain since we don't actually need to pass it since all fragments are either order 0 or must be a compound page. Link: http://lkml.kernel.org/r/20170104023954.13451.5678.stgit@localhost.localdomain Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-09net: make ndo_get_stats64 a void functionstephen hemminger1-6/+4
The network device operation for reading statistics is only called in one place, and it ignores the return value. Having a structure return value is potentially confusing because some future driver could incorrectly assume that the return value was used. Fix all drivers with ndo_get_stats64 to have a void function. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-06igb: close/suspend race in netif_device_detachTodd Fujinaka1-9/+12
Similar to ixgbe, when an interface is part of a namespace it is possible that igb_close() may be called while __igb_shutdown() is running which ends up in a double free WARN and/or a BUG in free_msi_irqs(). Extend the rtnl_lock() to protect the call to netif_device_detach() and igb_clear_interrupt_scheme() in __igb_shutdown() and check for netif_device_present() to avoid calling igb_clear_interrupt_scheme() a second time in igb_close(). Also extend the rtnl lock in igb_resume() to netif_device_attach(). Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-01-06igb: re-assign hw address pointer on reset after PCI errorGuilherme G Piccoli1-0/+5
Whenever the igb driver detects the result of a read operation returns a value composed only by F's (like 0xFFFFFFFF), it will detach the net_device, clear the hw_addr pointer and warn to the user that adapter's link is lost - those steps happen on igb_rd32(). In case a PCI error happens on Power architecture, there's a recovery mechanism called EEH, that will reset the PCI slot and call driver's handlers to reset the adapter and network functionality as well. We observed that once hw_addr is NULL after the error is detected on igb_rd32(), it's never assigned back, so in the process of resetting the network functionality we got a NULL pointer dereference in both igb_configure_tx_ring() and igb_configure_rx_ring(). In order to avoid such bug, this patch re-assigns the hw_addr value in the slot_reset handler. Reported-by: Anthony H Thai <ahthai@us.ibm.com> Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com> Signed-off-by: Guilherme G Piccoli <gpiccoli@linux.vnet.ibm.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-01-06igb: use igb_adapter->io_addr instead of e1000_hw->hw_addrCao jin1-2/+2
When running as guest, under certain condition, it will oops as following. writel() in igb_configure_tx_ring() results in oops, because hw->hw_addr is NULL. While other register access won't oops kernel because they use wr32/rd32 which have a defense against NULL pointer. [ 141.225449] pcieport 0000:00:1c.0: AER: Multiple Uncorrected (Fatal) error received: id=0101 [ 141.225523] igb 0000:01:00.1: PCIe Bus Error: severity=Uncorrected (Fatal), type=Unaccessible, id=0101(Unregistered Agent ID) [ 141.299442] igb 0000:01:00.1: broadcast error_detected message [ 141.300539] igb 0000:01:00.0 enp1s0f0: PCIe link lost, device now detached [ 141.351019] igb 0000:01:00.1 enp1s0f1: PCIe link lost, device now detached [ 143.465904] pcieport 0000:00:1c.0: Root Port link has been reset [ 143.465994] igb 0000:01:00.1: broadcast slot_reset message [ 143.466039] igb 0000:01:00.0: enabling device (0000 -> 0002) [ 144.389078] igb 0000:01:00.1: enabling device (0000 -> 0002) [ 145.312078] igb 0000:01:00.1: broadcast resume message [ 145.322211] BUG: unable to handle kernel paging request at 0000000000003818 [ 145.361275] IP: [<ffffffffa02fd38d>] igb_configure_tx_ring+0x14d/0x280 [igb] [ 145.400048] PGD 0 [ 145.438007] Oops: 0002 [#1] SMP A similar issue & solution could be found at: http://patchwork.ozlabs.org/patch/689592/ Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-12-15igb: update code to better handle incrementing page countAlexander Duyck1-7/+17
Update the driver code so that we do bulk updates of the page reference count instead of just incrementing it by one reference at a time. The advantage to doing this is that we cut down on atomic operations and this in turn should give us a slight improvement in cycles per packet. In addition if we eventually move this over to using build_skb the gains will be more noticeable. Link: http://lkml.kernel.org/r/20161110113616.76501.17072.stgit@ahduyck-blue-test.jf.intel.com Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> Cc: Helge Deller <deller@gmx.de> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Keguang Zhang <keguang.zhang@gmail.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Steven Miao <realmz6@gmail.com> Cc: Tobias Klauser <tklauser@distanz.ch> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-15igb: update driver to make use of DMA_ATTR_SKIP_CPU_SYNCAlexander Duyck1-20/+33
The ARM architecture provides a mechanism for deferring cache line invalidation in the case of map/unmap. This patch makes use of this mechanism to avoid unnecessary synchronization. A secondary effect of this change is that the portion of the page that has been synchronized for use by the CPU should be writable and could be passed up the stack (at least on ARM). The last bit that occurred to me is that on architectures where the sync_for_cpu call invalidates cache lines we were prefetching and then invalidating the first 128 bytes of the packet. To avoid that I have moved the sync up to before we perform the prefetch and allocate the skbuff so that we can actually make use of it. Link: http://lkml.kernel.org/r/20161110113611.76501.98897.stgit@ahduyck-blue-test.jf.intel.com Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> Cc: Helge Deller <deller@gmx.de> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Keguang Zhang <keguang.zhang@gmail.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Steven Miao <realmz6@gmail.com> Cc: Tobias Klauser <tklauser@distanz.ch> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-2/+6
Couple conflicts resolved here: 1) In the MACB driver, a bug fix to properly initialize the RX tail pointer properly overlapped with some changes to support variable sized rings. 2) In XGBE we had a "CONFIG_PM" --> "CONFIG_PM_SLEEP" fix overlapping with a reorganization of the driver to support ACPI, OF, as well as PCI variants of the chip. 3) In 'net' we had several probe error path bug fixes to the stmmac driver, meanwhile a lot of this code was cleaned up and reorganized in 'net-next'. 4) The cls_flower classifier obtained a helper function in 'net-next' called __fl_delete() and this overlapped with Daniel Borkamann's bug fix to use RCU for object destruction in 'net'. It also overlapped with Jiri's change to guard the rhashtable_remove_fast() call with a check against tc_skip_sw(). 5) In mlx4, a revert bug fix in 'net' overlapped with some unrelated changes in 'net-next'. 6) In geneve, a stale header pointer after pskb_expand_head() bug fix in 'net' overlapped with a large reorganization of the same code in 'net-next'. Since the 'net-next' code no longer had the bug in question, there was nothing to do other than to simply take the 'net-next' hunks. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-01igb/igbvf: Don't use lco_csum to compute IPv4 checksumAlexander Duyck1-2/+6
In the case of IPIP and SIT tunnel frames the outer transport header offset is actually set to the same offset as the inner transport header. This results in the lco_csum call not doing any checksum computation over the inner IPv4/v6 header data. In order to account for that I am updating the code so that we determine the location to start the checksum ourselves based on the location of the IPv4 header and the length. Fixes: e10715d3e961 ("igb/igbvf: Add support for GSO partial") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18ethernet/intel: use core min/max MTU checkingJarod Wilson1-11/+4
e100: min_mtu 68, max_mtu 1500 - remove e100_change_mtu entirely, is identical to old eth_change_mtu, and no longer serves a purpose. No need to set min_mtu or max_mtu explicitly, as ether_setup() will already set them to 68 and 1500. e1000: min_mtu 46, max_mtu 16110 e1000e: min_mtu 68, max_mtu varies based on adapter fm10k: min_mtu 68, max_mtu 15342 - remove fm10k_change_mtu entirely, does nothing now i40e: min_mtu 68, max_mtu 9706 i40evf: min_mtu 68, max_mtu 9706 igb: min_mtu 68, max_mtu 9216 - There are two different "max" frame sizes claimed and both checked in the driver, the larger value wasn't relevant though, so I've set max_mtu to the smaller of the two values here to retain identical behavior. igbvf: min_mtu 68, max_mtu 9216 - Same issue as igb duplicated ixgb: min_mtu 68, max_mtu 16114 - Also remove pointless old == new check, as that's done in dev_set_mtu ixgbe: min_mtu 68, max_mtu 9710 ixgbevf: min_mtu 68, max_mtu dependent on hardware/firmware - Some hw can only handle up to max_mtu 1504 on a vf, others 9710 CC: netdev@vger.kernel.org CC: intel-wired-lan@lists.osuosl.org CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-28igb: bump version to igb-5.4.0Todd Fujinaka1-1/+1
Bump igb version to match other igb drivers. Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-09-24net: Update API for VF vlan protocol 802.1ad supportMoshe Shemesh1-3/+6
Introduce new rtnl UAPI that exposes a list of vlans per VF, giving the ability for user-space application to specify it for the VF, as an option to support 802.1ad. We adjusted IP Link tool to support this option. For future use cases, the new UAPI supports multiple vlans. For now we limit the list size to a single vlan in kernel. Add IFLA_VF_VLAN_LIST in addition to IFLA_VF_VLAN to keep backward compatibility with older versions of IP Link tool. Add a vlan protocol parameter to the ndo_set_vf_vlan callback. We kept 802.1Q as the drivers' default vlan protocol. Suitable ip link tool command examples: Set vf vlan protocol 802.1ad: ip link set eth0 vf 1 vlan 100 proto 802.1ad Set vf to VST (802.1Q) mode: ip link set eth0 vf 1 vlan 100 proto 802.1Q Or by omitting the new parameter ip link set eth0 vf 1 vlan 100 Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-19igb: add support of RX network flow classificationGangfeng Huang1-0/+45
This patch is meant to allow for RX network flow classification to insert and remove Rx filter by ethtool. Ethtool interface has it's own rules manager Show all filters: $ ethtool -n eth0 4 RX rings available Total 2 rules Signed-off-by: Ruhao Gao <ruhao.gao@ni.com> Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-08-03Merge tag 'pci-v4.8-changes' of ↵Linus Torvalds1-7/+3
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Highlights: - ARM64 support for ACPI host bridges - new drivers for Axis ARTPEC-6 and Marvell Aardvark - new pci_alloc_irq_vectors() interface for MSI-X, MSI, legacy INTx - pci_resource_to_user() cleanup (more to come) Detailed summary: Enumeration: - Move ecam.h to linux/include/pci-ecam.h (Jayachandran C) - Add parent device field to ECAM struct pci_config_window (Jayachandran C) - Add generic MCFG table handling (Tomasz Nowicki) - Refactor pci_bus_assign_domain_nr() for CONFIG_PCI_DOMAINS_GENERIC (Tomasz Nowicki) - Factor DT-specific pci_bus_find_domain_nr() code out (Tomasz Nowicki) Resource management: - Add devm_request_pci_bus_resources() (Bjorn Helgaas) - Unify pci_resource_to_user() declarations (Bjorn Helgaas) - Implement pci_resource_to_user() with pcibios_resource_to_bus() (microblaze, powerpc, sparc) (Bjorn Helgaas) - Request host bridge window resources (designware, iproc, rcar, xgene, xilinx, xilinx-nwl) (Bjorn Helgaas) - Make PCI I/O space optional on ARM32 (Bjorn Helgaas) - Ignore write combining when mapping I/O port space (Bjorn Helgaas) - Claim bus resources on MIPS PCI_PROBE_ONLY set-ups (Bjorn Helgaas) - Remove unicore32 pci=firmware command line parameter handling (Bjorn Helgaas) - Support I/O resources when parsing host bridge resources (Jayachandran C) - Add helpers to request/release memory and I/O regions (Johannes Thumshirn) - Use pci_(request|release)_mem_regions (NVMe, lpfc, GenWQE, ethernet/intel, alx) (Johannes Thumshirn) - Extend pci=resource_alignment to specify device/vendor IDs (Koehrer Mathias (ETAS/ESW5)) - Add generic pci_bus_claim_resources() (Lorenzo Pieralisi) - Claim bus resources on ARM32 PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi) - Remove ARM32 and ARM64 arch-specific pcibios_enable_device() (Lorenzo Pieralisi) - Add pci_unmap_iospace() to unmap I/O resources (Sinan Kaya) - Remove powerpc __pci_mmap_set_pgprot() (Yinghai Lu) PCI device hotplug: - Allow additional bus numbers for hotplug bridges (Keith Busch) - Ignore interrupts during D3cold (Lukas Wunner) Power management: - Enforce type casting for pci_power_t (Andy Shevchenko) - Don't clear d3cold_allowed for PCIe ports (Mika Westerberg) - Put PCIe ports into D3 during suspend (Mika Westerberg) - Power on bridges before scanning new devices (Mika Westerberg) - Runtime resume bridge before rescan (Mika Westerberg) - Add runtime PM support for PCIe ports (Mika Westerberg) - Remove redundant check of pcie_set_clkpm (Shawn Lin) Virtualization: - Add function 1 DMA alias quirk for Marvell 88SE9182 (Aaron Sierra) - Add DMA alias quirk for Adaptec 3805 (Alex Williamson) - Mark Atheros AR9485 and QCA9882 to avoid bus reset (Chris Blake) - Add ACS quirk for Solarflare SFC9220 (Edward Cree) MSI: - Fix PCI_MSI dependencies (Arnd Bergmann) - Add pci_msix_desc_addr() helper (Christoph Hellwig) - Switch msix_program_entries() to use pci_msix_desc_addr() (Christoph Hellwig) - Make the "entries" argument to pci_enable_msix() optional (Christoph Hellwig) - Provide sensible IRQ vector alloc/free routines (Christoph Hellwig) - Spread interrupt vectors in pci_alloc_irq_vectors() (Christoph Hellwig) Error Handling: - Bind DPC to Root Ports as well as Downstream Ports (Keith Busch) - Remove DPC tristate module option (Keith Busch) - Convert Downstream Port Containment driver to use devm_* functions (Mika Westerberg) Generic host bridge driver: - Select IRQ_DOMAIN (Arnd Bergmann) - Claim bus resources on PCI_PROBE_ONLY set-ups (Lorenzo Pieralisi) ACPI host bridge driver: - Add ARM64 acpi_pci_bus_find_domain_nr() (Tomasz Nowicki) - Add ARM64 ACPI support for legacy IRQs parsing and consolidation with DT code (Tomasz Nowicki) - Implement ARM64 AML accessors for PCI_Config region (Tomasz Nowicki) - Support ARM64 ACPI-based PCI host controller (Tomasz Nowicki) Altera host bridge driver: - Check link status before retrain link (Ley Foon Tan) - Poll for link up status after retraining the link (Ley Foon Tan) Axis ARTPEC-6 host bridge driver: - Add PCI_MSI_IRQ_DOMAIN dependency (Arnd Bergmann) - Add DT binding for Axis ARTPEC-6 PCIe controller (Niklas Cassel) - Add Axis ARTPEC-6 PCIe controller driver (Niklas Cassel) Intel VMD host bridge driver: - Use lock save/restore in interrupt enable path (Jon Derrick) - Select device dma ops to override (Keith Busch) - Initialize list item in IRQ disable (Keith Busch) - Use x86_vector_domain as parent domain (Keith Busch) - Separate MSI and MSI-X vector sharing (Keith Busch) Marvell Aardvark host bridge driver: - Add DT binding for the Aardvark PCIe controller (Thomas Petazzoni) - Add Aardvark PCI host controller driver (Thomas Petazzoni) - Add Aardvark PCIe support for Armada 3700 (Thomas Petazzoni) Microsoft Hyper-V host bridge driver: - Fix interrupt cleanup path (Cathy Avery) - Don't leak buffer in hv_pci_onchannelcallback() (Vitaly Kuznetsov) - Handle all pending messages in hv_pci_onchannelcallback() (Vitaly Kuznetsov) NVIDIA Tegra host bridge driver: - Program PADS_REFCLK_CFG* always, not just on legacy SoCs (Stephen Warren) - Program PADS_REFCLK_CFG* registers with per-SoC values (Stephen Warren) - Use lower-case hex consistently for register definitions (Thierry Reding) - Use generic pci_remap_iospace() rather than ARM32-specific one (Thierry Reding) - Stop setting pcibios_min_mem (Thierry Reding) Renesas R-Car host bridge driver: - Drop gen2 dummy I/O port region (Bjorn Helgaas) TI DRA7xx host bridge driver: - Fix return value in case of error (Christophe JAILLET) Xilinx AXI host bridge driver: - Fix return value in case of error (Christophe JAILLET) Miscellaneous: - Make bus_attr_resource_alignment static (Ben Dooks) - Include <asm/dma.h> for isa_dma_bridge_buggy (Ben Dooks) - MAINTAINERS: Add file patterns for PCI device tree bindings (Geert Uytterhoeven) - Make host bridge drivers explicitly non-modular (Paul Gortmaker)" * tag 'pci-v4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (125 commits) PCI: xgene: Make explicitly non-modular PCI: thunder-pem: Make explicitly non-modular PCI: thunder-ecam: Make explicitly non-modular PCI: tegra: Make explicitly non-modular PCI: rcar-gen2: Make explicitly non-modular PCI: rcar: Make explicitly non-modular PCI: mvebu: Make explicitly non-modular PCI: layerscape: Make explicitly non-modular PCI: keystone: Make explicitly non-modular PCI: hisi: Make explicitly non-modular PCI: generic: Make explicitly non-modular PCI: designware-plat: Make it explicitly non-modular PCI: artpec6: Make explicitly non-modular PCI: armada8k: Make explicitly non-modular PCI: artpec: Add PCI_MSI_IRQ_DOMAIN dependency PCI: Add ACS quirk for Solarflare SFC9220 arm64: dts: marvell: Add Aardvark PCIe support for Armada 3700 PCI: aardvark: Add Aardvark PCI host controller driver dt-bindings: add DT binding for the Aardvark PCIe controller PCI: tegra: Program PADS_REFCLK_CFG* registers with per-SoC values ...
2016-06-29igb: Only DMA sync frame lengthAndrew Lunn1-3/+4
On some platforms, syncing a buffer for DMA is expensive. Rather than sync the whole 2K receive buffer, only synchronise the length of the frame, which will typically be the MTU, or a much smaller TCP ACK. For an IMX6Q, this gives around 6% increased TCP receive performance, which is cache operations bound and reduces CPU load for TCP transmit. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>