summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2015-04-15Merge branch 'master' of ↵David S. Miller12-138/+215
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2015-04-14 This series contains updates to fm10k only. Fixed transmit statistics which was actually using values from the receive ring, instead of the transmit ring. Fixed up spelling mistakes in code comments and resolved unused argument warnings. Added support for netconsole. Fixed up statistic reporting so that we are only reporting from actual queues as well as display PF only stats for just the PF and not the VF. Also fixed an issue that when returning virtualization queues from the VF back to the PF, we were retaining the VF rate limiter. Fixed up the driver to use a separate workqueue, which helps reduce and stabilize latency between scheduling the work in our interrupt and actually performing the work. Fixed a bug where the VF tried to set a multicast address before requesting the required xcast mode. Fix VF multicast update since VFs were being improperly added to the switch's mutlicast group. The error stems from the fact that incorrect arguments were passed to the update_mc_addr(). Thanks to Alex Duyck for the extensive review. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-15fm10k: Bump driver version to 0.15.2Jeff Kirsher1-1/+1
With the recent driver changes, bump the version. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
2015-04-15fm10k: corrected VF multicast updateJeff Kirsher1-2/+5
VFs were being improperly added to the switch's multicast group. The error stems from the fact that incorrect arguments were passed to the "update_mc_addr" function. It would seem to be a copy paste error since the parameters are similar to the "update_uc_addr" function. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: mbx_update_max_size does not drop all oversized messagesJeff Kirsher1-3/+6
When we call update_max_size it does not drop all oversized messages. This is due to the difficulty in performing this operation, since it is a FIFO which makes updating anything other than head or tail very difficult. To fix this, modify validate_msg_size to ensure that we error out later when trying to transmit the message that could be oversized. This will generally be a rare condition, as it requires the FIFO to include a message larger than the max_size negotiated during mailbox connect. Note that max_size is always smaller than rx.size so it should be safe to use here. Also, update the update_max_size function header comment to clearly indicate that it does not drop all oversized messages, but only those at the head of the FIFO. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: reset head instead of calling update_max_sizeJeff Kirsher1-2/+16
When we forcefully shutdown the mailbox, we then go about resetting max size to 0, and clearing all messages in the FIFO. Instead, we should just reset the head pointer so that the FIFO becomes empty, rather than changing the max size to 0. This helps prevent increment in tx_dropped counter during mailbox negotiation, which is confusing to viewers of Linux ethtool statistics output. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: renamed mbx_tx_dropped to mbx_tx_oversizedJeff Kirsher1-1/+1
The use of dropped doesn't really mean dropped mailbox messages, but rather specifically messages which were too large to fit in the remote Rx FIFO. Rename the stat to more clearly indicate what it means. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller36-579/+739
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next A final pull request, I know it's very late but this time I think it's worth a bit of rush. The following patchset contains Netfilter/nf_tables updates for net-next, more specifically concatenation support and dynamic stateful expression instantiation. This also comes with a couple of small patches. One to fix the ebtables.h userspace header and another to get rid of an obsolete example file in tree that describes a nf_tables expression. This time, I decided to paste the original descriptions. This will result in a rather large commit description, but I think these bytes to keep. Patrick McHardy says: ==================== netfilter: nf_tables: concatenation support The following patches add support for concatenations, which allow multi dimensional exact matches in O(1). The basic idea is to split the data registers, currently consisting of 4 registers of 16 bytes each, into smaller units, 16 registers of 4 bytes each, and making sure each register store always leaves the full 32 bit in a well defined state, meaning smaller stores will zero the remaining bits. Based on that, we can load multiple adjacent registers with different values, thereby building a concatenated bigger value, and use that value for set lookups. Sets are changed to use variable sized extensions for their key and data values, removing the fixed limit of 16 bytes while saving memory if less space is needed. As a side effect, these patches will allow some nice optimizations in the future, like using jhash2 in nft_hash, removing the masking in nft_cmp_fast, optimized data comparison using 32 bit word size etc. These are not done so far however. The patches are split up as follows: * the first five patches add length validation to register loads and stores to make sure we stay within bounds and prepare the validation functions for the new addressing mode * the next patches prepare for changing to 32 bit addressing by introducing a struct nft_regs, which holds the verdict register as well as the data registers. The verdict members are moved to a new struct nft_verdict to allow to pull struct nft_data out of the stack. * the next patches contain preparatory conversions of expressions and sets to use 32 bit addressing * the next patch introduces so far unused register conversion helpers for parsing and dumping register numbers over netlink * following is the real conversion to 32 bit addressing, consisting of replacing struct nft_data in struct nft_regs by an array of u32s and actually translating and validating the new register numbers. * the final two patches add support for variable sized data items and variable sized keys / data in set elements The patches have been verified to work correctly with nft binaries using both old and new addressing. ==================== Patrick McHardy says: ==================== netfilter: nf_tables: dynamic stateful expression instantiation The following patches are the grand finale of my nf_tables set work, using all the building blocks put in place by the previous patches to support something like iptables hashlimit, but a lot more powerful. Sets are extended to allow attaching expressions to set elements. The dynset expression dynamically instantiates these expressions based on a template when creating new set elements and evaluates them for all new or updated set members. In combination with concatenations this effectively creates state tables for arbitrary combinations of keys, using the existing expression types to maintain that state. Regular set GC takes care of purging expired states. We currently support two different stateful expressions, counter and limit. Using limit as a template we can express the functionality of hashlimit, but completely unrestricted in the combination of keys. Using counter we can perform accounting for arbitrary flows. The following examples from patch 5/5 show some possibilities. Userspace syntax is still WIP, especially the listing of state tables will most likely be seperated from normal set listings and use a more structured format: 1. Limit the rate of new SSH connections per host, similar to iptables hashlimit: flow ip saddr timeout 60s \ limit 10/second \ accept 2. Account network traffic between each set of /24 networks: flow ip saddr & 255.255.255.0 . ip daddr & 255.255.255.0 \ counter 3. Account traffic to each host per user: flow skuid . ip daddr \ counter 4. Account traffic for each combination of source address and TCP flags: flow ip saddr . tcp flags \ counter The resulting set content after a Xmas-scan look like this: { 192.168.122.1 . fin | psh | urg : counter packets 1001 bytes 40040, 192.168.122.1 . ack : counter packets 74 bytes 3848, 192.168.122.1 . psh | ack : counter packets 35 bytes 3144 } In the future the "expressions attached to elements" will be extended to also support user created non-stateful expressions to allow to efficiently select beween a set of parameter sets, f.i. a set of log statements with different prefixes based on the interface, which currently require one rule each. This will most likely have to wait until the next kernel version though. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-15fm10k: update xcast mode before synchronizing multicast addressesJeff Kirsher1-11/+11
When the PF receives a request to update a multicast address for the VF, it checks the enabled multicast mode first. Fix a bug where the VF tried to set a multicast address before requesting the required xcast mode. This ensures the multicast addresses are honored as long as the xcast mode was allowed. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: start service timer on probeJeff Kirsher1-3/+6
Since the service task handles varying work that doesn't all require the interface to be up, launch the service timer immediately. This ensures that we continually check the mailbox, as well as handle other tasks while the device is down. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: fix function header commentJeff Kirsher1-2/+1
The header comment included a miscopy of a C-code line, and also mis-used Rx FIFO when it clearly meant Tx FIFO Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: comment next_vf_mbx flowJeff Kirsher1-0/+11
Add a header comment explaining why we have the somewhat crazy mailbox flow. This flow is necessary as it prevents the PF<->SM mailbox from being flooded by the VF messages, which normally trigger a message to the PF. This helps prevent the case where we see a PF mailbox timeout. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: don't handle mailbox events in iov_event path and always process mailboxJeff Kirsher2-32/+5
Since we already schedule the service task, we can just wait for this task to handle the mailbox events from the VF. This reduces some complex code flow, and makes it so we have a single path for handling the VF messages. There is a possibility that we have a slight delay in handling VF messages, but it should be minimal. The result of tx_complete and !rx_ready is insufficient to determine whether we need to process the mailbox. There is a possible race condition whereby the VF fills up the mbmem for us, but we have already recently processed the mailboxes in the interrupt. During this time, the interrupt is disabled. Thus, our Rx FIFO is empty, but the mbmem now has data in it. Since we continually check whether Rx FIFO is empty, we then never call process. This results in the possibility to prevent PF from handling the VF mailbox messages. Instead, just call process every time, despite the fact that we may or may not have anything to process for the VF. There should be minimal overhead for doing this, and it resolves an issue where the VF never comes up due to never getting response for its SET_LPORT_STATE message. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: use separate workqueue for fm10k driverJeff Kirsher3-1/+16
Since we run the watchdog periodically, which might take a while and potentially monopolize the system default workqueue, create our own separate work queue. This also helps reduce and stabilize latency between scheduling the work in our interrupt and actually performing the work. Still use a timer for the regular scheduled interval but queue the work onto its own work queue. It seemed overkill to create a single workqueue per interface, so we just spawn a single work queue for all interfaces upon driver load. For this reason, use a multi-threaded workqueue with one thread per processor, rather than single threaded queue. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: Set PF queues to unlimited bandwidth during virtualizationJeff Kirsher1-1/+2
When returning virtualization queues from the VF back to the PF, do not retain the VF rate limiter. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Todd Russell <todd.a.russell@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: expose tx_timeout_count as an ethtool statJeff Kirsher1-0/+2
Named it tx_hang_count to differentiate it from tx_hwtstamp_timeout. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: only increment tx_timeout_count in Tx hang pathJeff Kirsher1-1/+0
We were incrementing the tx_timeout_count for both the Tx hang and then for all reset flows. Instead, we should only increment tx_timeout_count in the Tx hang path, so that our Tx hang counter does not increment when it was not caused by a Tx hang. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: remove extraneous "Reset interface" messageJeff Kirsher1-1/+0
Since we already print this message when a reset is requested via the RESET_REQUESTED flag, we do not need to print it before setting the flag. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: separate PF only stats so that VF does not display themJeff Kirsher1-14/+37
This patch resolves an issue with ethtool stats displaying useless values on the VF, because some stats simply have no meaning to the VF. Resolve this by splitting these out into PF_STATS and only showing them if we aren't the VF. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: use hw->mac.max_queues for statsJeff Kirsher2-8/+12
Even though it shouldn't strictly matter, don't count queue stats higher than the max_queues value stored for this mac. This ensures that we don't attempt to check queues which don't belong to use in VFs. This shouldn't be a visible change, as the VFs should see zero for queues which don't belong to them. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: only show actual queues, not the maximum in hardwareJeff Kirsher1-2/+3
Currently, we show statistics for all 128 queues, even though we don't necessarily have that many queues available especially in the VF case. Instead, use the hw->mac.max_queues value, which tells us how many queues we actually have, rather than the space for the rings we allocated. In this way, we prevent dumping statistics that are useless on the VF. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: allow creation of VLAN on default vidJeff Kirsher1-4/+4
Previously, the user was not allowed to create a VLAN interface on top of the switch default vid. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: fix unused warningsJeff Kirsher7-37/+35
The were several functions which had parameters which were never or sometimes used in functions. To resolve possible compiler warnings, use __always_unused or __maybe_unused kernel macros to resolve. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: Add netconsole supportJeff Kirsher3-0/+28
This change adds a function called "fm10k_netpoll" that's used to define "ndo_poll_controller" in "fm10k_netdev_ops". This is required to enable support for "netconsole" in fm10k. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: Have the VF get the default VLAN during initJeff Kirsher1-1/+5
Currently, the VFs do not read the default VLAN during initialization, so they will not be able to indicate untagged frames properly. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: Correct spelling mistakeJeff Kirsher1-2/+2
Corrected a spelling mistake that was found over time. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-15fm10k: Remove redundant rx_errors in ethtoolJeff Kirsher3-7/+4
Output of ethtool was reporting 2 rx_errors entries. This change removes one of the redundant entries. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
2015-04-15fm10k: Corrected an error in Tx statisticsJeff Kirsher1-2/+2
The function collecting Tx statistics was actually using values from the RX ring. Thus, Tx and Rx statistics values reported by "ifconfig" will return identical values. This change corrects this error and the Tx statistics is now reading from the Tx ring. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
2015-04-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller17-37/+85
The dwmac-socfpga.c conflict was a case of a bug fix overlapping changes in net-next to handle an error pointer differently. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14Merge branch 'cxgb4-next'David S. Miller2-17/+62
Hariprasad Shenai says: ==================== cxgb4: Misc. fixes for sge Increases value of MAX_IMM_TX_PKT_LEN to improve latency, fill freelist starving threshold based on adapter type, add comments for tx flits and sge length code and don't call t4_slow_intr_handler when we are not master PF. This patch series has been created against net-next tree and includes patches on cxgb4 driver We have included all the maintainers of respective drivers. Kindly review the change and let us know in case of any review comments. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14cxgb4: Don't call t4_slow_intr_handler when we're not the Master PFHariprasad Shenai2-3/+6
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14cxgb4: Add comment for calculate tx flits and sge length codeHariprasad Shenai1-1/+35
Add comment for tx filt and sge length calucaltion code, also remove a hardcoded value Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14cxgb4: Use device node in page allocationHariprasad Shenai1-2/+4
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14cxgb4: Freelist starving threshold varies from adapter to adapterHariprasad Shenai1-10/+16
fl_starv_thres could be different from adapter to adapter, don't use hardcoded values Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14cxgb4: Increased the value of MAX_IMM_TX_PKT_LEN from 128 to 256 bytesHariprasad Shenai1-1/+1
This allows a significant latency drop for packets of sizes between 128 and 192 bytes Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14bgmac: drop ring->num_slotsFelix Fietkau2-15/+15
The ring size is always known at compile time, so make the code a bit more efficient Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14bgmac: fix DMA rx corruptionFelix Fietkau1-9/+18
The driver needs to inform the hardware about the first invalid (not yet filled) rx slot, by writing its DMA descriptor pointer offset to the BGMAC_DMA_RX_INDEX register. This register was set to a value exceeding the rx ring size, effectively allowing the hardware constant access to the full ring, regardless of which slots are initialized. To fix this issue, always mark the last filled rx slot as invalid. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14bgmac: simplify dma init/cleanupFelix Fietkau1-39/+40
Instead of allocating buffers at device init time and initializing descriptors at device open, do both at the same time (during open). Free all buffers when closing the device. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Acked-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14bgmac: increase rx ring size from 511 to 512Felix Fietkau1-1/+1
Limiting it to 511 looks like a failed attempt at leaving one descriptor empty to allow the hardware to stop processing a buffer that has not been prepared yet. However, this doesn't work because this affects the total ring size as well Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14bgmac: add check for oversized packetsFelix Fietkau1-0/+7
In very rare cases, the MAC can catch an internal buffer that is bigger than it's supposed to be. Instead of crashing the kernel, simply pass the buffer back to the hardware Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14bgmac: simplify/optimize rx DMA error handlingFelix Fietkau1-37/+34
Allocate a new buffer before processing the completed one. If allocation fails, reuse the old buffer. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Acked-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14bgmac: set received skb headroom to NET_SKB_PADFelix Fietkau2-7/+11
A packet buffer offset of 30 bytes is inefficient, because the first 2 bytes end up in a different cacheline. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14bgmac: leave interrupts disabled as long as there is work to doFelix Fietkau2-22/+10
Always poll rx and tx during NAPI poll instead of relying on the status of the first interrupt. This prevents bgmac_poll from leaving unfinished work around until the next IRQ. In my tests this makes bridging/routing throughput under heavy load more stable and ensures that no new IRQs arrive as long as bgmac_poll uses up the entire budget. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14bgmac: simplify tx ring index handlingFelix Fietkau2-29/+23
Keep incrementing ring->start and ring->end instead of pointing it to the actual ring slot entry. This simplifies the calculation of the number of free slots. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Acked-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14toshiba: Remove celleb from Kconfig optionsDaniel Axtens1-2/+2
The toshiba drivers had celleb as an optional dependency. celleb has been dropped [1], so clean that out of Kconfig. [1] http://patchwork.ozlabs.org/patch/451730/ CC: netdev@vger.kernel.org CC: Valentin Rothberg <valentinrothberg@gmail.com> CC: mpe@ellerman.id.au CC: linuxppc-dev@lists.ozlabs.org Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14hv_netvsc: Implement partial copy into send bufferHaiyang Zhang3-19/+46
If remaining space in a send buffer slot is too small for the whole message, we only copy the RNDIS header and PPI data into send buffer, so we can batch one more packet each time. It reduces the vmbus per-message overhead. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-14Merge branch 'for-davem' of ↵David S. Miller87-387/+494
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Al Viro says: ==================== netdev-related stuff in vfs.git There are several commits sitting in vfs.git that probably ought to go in via net-next.git. First of all, there's merge with vfs.git#iocb - that's Christoph's aio rework, which has triggered conflicts with the ->sendmsg() and ->recvmsg() patches a while ago. It's not so much Christoph's stuff that ought to be in net-next, as (pretty simple) conflict resolution on merge. The next chunk is switch to {compat_,}import_iovec/import_single_range - new safer primitives for initializing iov_iter. The primitives themselves come from vfs/git#iov_iter (and they are used quite a lot in vfs part of queue), conversion of net/socket.c syscalls belongs in net-next, IMO. Next there's afs and rxrpc stuff from dhowells. And then there's sanitizing kernel_sendmsg et.al. + missing inlined helper for "how much data is left in msg->msg_iter" - this stuff is used in e.g. cifs stuff, but it belongs in net-next. That pile is pullable from git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-davem I'll post the individual patches in there in followups; could you take a look and tell if everything in there is OK with you? ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-13tcp/dccp: get rid of central timewait timerEric Dumazet11-386/+69
Using a timer wheel for timewait sockets was nice ~15 years ago when memory was expensive and machines had a single processor. This does not scale, code is ugly and source of huge latencies (Typically 30 ms have been seen, cpus spinning on death_lock spinlock.) We can afford to use an extra 64 bytes per timewait sock and spread timewait load to all cpus to have better behavior. Tested: On following test, /proc/sys/net/ipv4/tcp_tw_recycle is set to 1 on the target (lpaa24) Before patch : lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0 419594 lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0 437171 While test is running, we can observe 25 or even 33 ms latencies. lpaa24:~# ping -c 1000 -i 0.02 -qn lpaa23 ... 1000 packets transmitted, 1000 received, 0% packet loss, time 20601ms rtt min/avg/max/mdev = 0.020/0.217/25.771/1.535 ms, pipe 2 lpaa24:~# ping -c 1000 -i 0.02 -qn lpaa23 ... 1000 packets transmitted, 1000 received, 0% packet loss, time 20702ms rtt min/avg/max/mdev = 0.019/0.183/33.761/1.441 ms, pipe 2 After patch : About 90% increase of throughput : lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0 810442 lpaa23:~# ./super_netperf 200 -H lpaa24 -t TCP_CC -l 60 -- -p0,0 800992 And latencies are kept to minimal values during this load, even if network utilization is 90% higher : lpaa24:~# ping -c 1000 -i 0.02 -qn lpaa23 ... 1000 packets transmitted, 1000 received, 0% packet loss, time 19991ms rtt min/avg/max/mdev = 0.023/0.064/0.360/0.042 ms Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-13netfilter: Fix format string of nfnetlink_log proc fileRichard Weinberger1-1/+1
The printed values are all of type unsigned integer, therefore use %u instead of %d. Otherwise an user can face negative values. Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-13netfilter: Fix format string of nfnetlink_queue proc fileRichard Weinberger1-1/+1
The printed values are all of type unsigned integer, therefore use %u instead of %d. Otherwise an user can face negative values. Fixes: $ cat /proc/net/netfilter/nfnetlink_queue 0 29508 278 2 65531 0 2004213241 -2129885586 1 1 -27747 0 2 65531 0 0 0 1 2 -27748 0 2 65531 0 0 0 1 Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-13netfilter: Fix portid typesRichard Weinberger2-6/+5
The netlink portid is an unsigned integer, use this type also in netfilter. Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>