Age  Commit message  Author  Files  Lines
2015-02-03  net/mlx4_core: Fix kernel Oops (mem corruption) when working with more than 80 VFs  Jack Morgenstein  2  -2/+3
Commit de966c592802 (net/mlx4_core: Support more than 64 VFs) was meant to allow up to 126 VFs. However, because MLX4_MFUNC_MAX was left too low, requesting more than 80 VFs resulted in memory corruption (and Oopses). In addition, the number of slaves was left too high. This commit fixes these issues. Fixes: de966c592802 ("net/mlx4_core: Support more than 64 VFs") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  isdn: off by one in connect_res()  Dan Carpenter  1  -1/+1
The bug here is that we use "Reject" as the index into the cau_t[] array in the else path. Since the cau_t[] has 9 elements if Reject == 9 then we are reading beyond the end of the array. My understanding of the code is that it's saying that if Reject is 1 or too high then that's invalid and we should hang up. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
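The bounds rule described in this message can be sketched in isolation. This is a minimal illustration, not the driver's code: `CAU_T_LEN`, `demo_cau_t`, `reject_is_usable()` and `demo_cause_for()` are hypothetical names, and only the 9-element table size and the "Reject is 1 or too high" rule are taken from the commit message.

```c
#include <assert.h>

/* The commit message says cau_t[] has 9 elements, so valid indices
 * are 0..8. CAU_T_LEN is a hypothetical name for that size. */
#define CAU_T_LEN 9

/* Hypothetical stand-in table; the values are placeholders. */
static const int demo_cau_t[CAU_T_LEN] = { 0, 1, 2, 3, 4, 5, 6, 7, 8 };

/* Returns 1 if `reject` may be used as an index, 0 if the caller
 * should treat the value as invalid and hang up. */
int reject_is_usable(unsigned int reject)
{
    if (reject == 1)
        return 0;
    /* A test of `reject > CAU_T_LEN` would let 9 through and read
     * one element past the end: that is the off-by-one. */
    if (reject >= CAU_T_LEN)
        return 0;
    return 1;
}

/* Index the table only after the value has been validated. */
int demo_cause_for(unsigned int reject)
{
    assert(reject_is_usable(reject));
    return demo_cau_t[reject];
}
```

With this shape, a value of 8 is the last usable index and 9 takes the hang-up path.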
2015-02-03  Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf  David S. Miller  7  -63/+120
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for your net tree, they are: 1) Validate hooks for nf_tables NAT expressions, otherwise users can crash the kernel when using them from the wrong hook. We already got one user trapped on this when configuring masquerading. 2) Fix a BUG splat in nf_tables with CONFIG_DEBUG_PREEMPT=y. Reported by Andreas Schultz. 3) Avoid unnecessary reroute of traffic in the local input path in IPVS that triggers a crash in xfrm. Reported by Florian Wiessner and fixed by Julian Anastasov. 4) Fix memory and module refcount leak from the error path of nf_tables_newchain(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  net: sctp: Deletion of an unnecessary check before the function call "kfree"  Markus Elfring  1  -2/+1
The kfree() function tests whether its argument is NULL and then returns immediately. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-By: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
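A userspace analogy may make the pattern concrete. The sketch below uses the standard C `free()`, which is defined to do nothing when passed a null pointer, in place of the kernel's `kfree()`; `struct demo_assoc` and `demo_assoc_release()` are hypothetical names, not SCTP code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object that owns an optional heap buffer. */
struct demo_assoc {
    char *buf;
};

void demo_assoc_release(struct demo_assoc *a)
{
    assert(a != NULL);
    /* No `if (a->buf)` guard: free(NULL) is a no-op, just as the
     * commit message describes for kfree(). */
    free(a->buf);
    a->buf = NULL;
}
```

The function behaves the same whether or not the buffer was ever allocated, which is why the guard is redundant.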
2015-02-03  Merge branch 'udpv6_lockless_send'  David S. Miller  4  -155/+327
Vladislav Yasevich says: ==================== ipv6: Add lockless UDP send path This series introduces a lockless UDPv6 send path similar to what Herbert Xu did for IPv4 a while ago. There are some differences from IPv4. IPv6 caching for flow label is a bit different, and it requires another cork structure that holds the IPv6 ancillary data. Please take a look. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  ipv6: Allow for partial checksums on non-ufo packets  Vlad Yasevich  1  -1/+10
Currently, if we are not doing UFO on the packet, all UDP packets will start with CHECKSUM_NONE and thus perform full checksum computations in software even if the device supports IPv6 checksum offloading. Let's start with CHECKSUM_PARTIAL if the device supports it and we are sending only a single packet at or below mtu size. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  udpv6: Add lockless sendmsg() support  Vlad Yasevich  1  -4/+20
This commit adds the same functionality to IPv6 that commit 903ab86d195cca295379699299c5fc10beba31c7 Author: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue Mar 1 02:36:48 2011 +0000 udp: Add lockless transmit path added to IPv4. UDP transmit path can now run without a socket lock, thus allowing multiple threads to send to a single socket more efficiently. This is only used when corking/MSG_MORE is not used. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  ipv6: Introduce udpv6_send_skb()  Vlad Yasevich  1  -27/+40
Now that we can individually construct IPv6 skbs to send, add a udpv6_send_skb() function to populate the udp header and send the skb. This allows udp_v6_push_pending_frames() to re-use this function as well as enables us to add lockless sendmsg() support. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  ipv6: introduce ipv6_make_skb  Vlad Yasevich  2  -19/+103
This commit is very similar to commit 1c32c5ad6fac8cee1a77449f5abf211e911ff830 Author: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue Mar 1 02:36:47 2011 +0000 inet: Add ip_make_skb and ip_finish_skb It adds IPv6 versions of the helpers, ip6_make_skb and ip6_finish_skb. The job of ip6_make_skb is to collect messages into an ipv6 packet and populate the ipv6 header. The job of ip6_finish_skb is to transmit the generated skb. Together they replicate the job of ip6_push_pending_frames() while also providing the capability to be called independently. This will be needed to add lockless UDP sendmsg support. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  ipv6: Append sending data to arbitrary queue  Vlad Yasevich  1  -39/+67
Add the ability to append data to arbitrary queue. This will be needed later to implement lockless UDP sends. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  ipv6: pull cork initialization into its own function.  Vlad Yasevich  2  -74/+96
Pull IPv6 cork initialization into its own function that can be re-used. IPv6 specific cork data did not have an explicit data structure. This patch creates one so that just the IPv6 cork data can be passed as arguments. Also, since IPv6 tries to save the flow label into the inet_cork_full structure, pass the full cork. Adjust ip6_cork_release() to take cork data structures. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  cxgb4 : Improve IEEE DCBx support, other minor open-lldp fixes  Anish Bhatt  2  -2/+107
* Add support for IEEE ets & pfc api. * Fix bug that resulted in incorrect bandwidth percentage being returned for CEE peers * Convert pfc enabled info from firmware format to what dcbnl expects before returning Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  net/tulip: don't warn about unknown ARM architecture  Arnd Bergmann  1  -1/+1
ARM has 32-byte cache lines, which according to the comment in the init registers function seems to work best with the default value of 0x4800 that is also used on sparc and parisc. This adds ARM to the same list, to use that default but no longer warn about it. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  net: hip04: add missing MODULE_LICENSE  Arnd Bergmann  1  -0/+1
The hip04 ethernet driver causes a new compile-time warning when built as a loadable module: WARNING: modpost: missing MODULE_LICENSE() in drivers/net/ethernet/hisilicon/hip04_eth.o see include/linux/module.h for more information This adds the license as "GPL", which matches the header of the file. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  Documentation: Update netlink_mmap.txt  Richard Weinberger  1  -10/+3
Update netlink_mmap.txt wrt. commit 4682a0358639b29cf ("netlink: Always copy on mmap TX."). Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  net: dctcp: loosen requirement to assert ECT(0) during 3WHS  Florian Westphal  1  -9/+5
One deployment requirement of DCTCP is to be able to run in a DC setting along with TCP traffic. As Glenn Judd's NSDI'15 paper "Attaining the Promise and Avoiding the Pitfalls of TCP in the Datacenter" [1] (tba) explains, one way to solve this on switch side is to split DCTCP and TCP traffic in two queues per switch port based on the DSCP: one queue solely intended for DCTCP traffic and one for non-DCTCP traffic. For the DCTCP queue, there's the marking threshold K as explained in commit e3118e8359bb ("net: tcp: add DCTCP congestion control algorithm") for RED marking ECT(0) packets with CE. For the non-DCTCP queue, there's f.e. a classic tail drop queue. As already explained in e3118e8359bb, running DCTCP at scale when not marking SYN/SYN-ACK packets with ECT(0) has severe consequences as for non-ECT(0) packets, traversing the RED marking DCTCP queue will result in a severe reduction of connection probability. This is due to the DCTCP queue being dominated by ECT(0) traffic and switches handle non-ECT traffic in the RED marking queue after passing K as drops, where K is usually a low watermark in order to leave enough tailroom for bursts. Splitting DCTCP traffic among several queues (ECN and non-ECN queue) is being considered a terrible idea in the network community as it splits single flows across multiple network paths. Therefore, commit e3118e8359bb implements this on Linux as ECT(0) marked traffic, as we argue that marking all packets of a DCTCP flow is the only viable solution and also doesn't speak against the draft. However, recently, a DCTCP implementation for FreeBSD hit also their mainline kernel [2]. In order to let them play well together with Linux' DCTCP, we would need to loosen the requirement that ECT(0) has to be asserted during the 3WHS as not implemented in FreeBSD. This simplifies the ECN test and lets DCTCP work together with FreeBSD. Joint work with Daniel Borkmann.
[1] https://www.usenix.org/conference/nsdi15/technical-sessions/presentation/judd [2] https://github.com/freebsd/freebsd/commit/8ad879445281027858a7fa706d13e458095b595f Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Glenn Judd <glenn.judd@morganstanley.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  Merge branch 'net-timestamp'  David S. Miller  11  -17/+113
Willem de Bruijn says: ==================== net-timestamp: blinding Changes (v2 -> v3) - rebase only: v2 did not make it to patchwork / netdev (v1 -> v2) - fix capability check in patch 2 this could be moved into net/core/sock.c as sk_capable_nouser() (rfc -> v1) - dropped patch 4: timestamp batching due to complexity, as discussed - dropped patch 5: default mode because it does not really cover all use cases, as discussed - added documentation - minor fix, see patch 2 Two issues were raised during recent timestamping discussions: 1. looping full packets on the error queue exposes packet headers 2. TCP timestamping with retransmissions generates many timestamps This RFC patchset is an attempt at addressing both without breaking legacy behavior. Patch 1 reintroduces the "no payload" timestamp option, which loops timestamps onto an empty skb. This reduces the pressure on SO_RCVBUF from looping many timestamps. It does not reduce the number of recv() calls needed to process them. The timestamp cookie mechanism developed in http://patchwork.ozlabs.org/patch/427213/ did, but this is considerably simpler. Patch 2 then gives administrators the power to block all timestamp requests that contain data by unprivileged users. I proposed this earlier as a backward compatible workaround in the discussion of net-timestamp: pull headers for SOCK_STREAM http://patchwork.ozlabs.org/patch/414810/ Patch 3 only updates the txtimestamp example to test this option. Verified that with option '-n', length is zero in all cases and option '-I' (PKTINFO) stops working. ==================== Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  net-timestamp: no-payload option in txtimestamp test  Willem de Bruijn  1  -4/+24
Demonstrate how SOF_TIMESTAMPING_OPT_TSONLY can be used and test the implementation. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  net-timestamp: no-payload only sysctl  Willem de Bruijn  5  -1/+41
Tx timestamps are looped onto the error queue on top of an skb. This mechanism leaks packet headers to processes unless the no-payload option SOF_TIMESTAMPING_OPT_TSONLY is set. Add a sysctl that optionally drops looped timestamps with data. This only affects processes without CAP_NET_RAW. The policy is checked when timestamps are generated in the stack. It is possible for timestamps with data to be reported after the sysctl is set, if these were queued internally earlier. No vulnerability is immediately known that exploits knowledge gleaned from packet headers, but it may still be preferable to allow administrators to lock down this path at the cost of possible breakage of legacy applications. Signed-off-by: Willem de Bruijn <willemb@google.com> ---- Changes (v1 -> v2) - test socket CAP_NET_RAW instead of capable(CAP_NET_RAW) (rfc -> v1) - document the sysctl in Documentation/sysctl/net.txt - fix access control race: read .._OPT_TSONLY only once, use same value for permission check and skb generation. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  net-timestamp: no-payload option  Willem de Bruijn  6  -12/+48
Add timestamping option SOF_TIMESTAMPING_OPT_TSONLY. For transmit timestamps, this loops timestamps on top of empty packets. Doing so reduces the pressure on SO_RCVBUF. Payload inspection and cmsg reception (aside from timestamps) are no longer possible. This works together with a follow on patch that allows administrators to only allow tx timestamping if it does not loop payload or metadata. Signed-off-by: Willem de Bruijn <willemb@google.com> ---- Changes (rfc -> v1) - add documentation - remove unnecessary skb->len test (thanks to Richard Cochran) Signed-off-by: David S. Miller <davem@davemloft.net>
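From userspace, the option is requested through the `SO_TIMESTAMPING` socket option. The following is a minimal sketch, assuming Linux UAPI headers recent enough to define `SOF_TIMESTAMPING_OPT_TSONLY`; the helper name `demo_enable_tx_timestamps_no_payload()` and the particular combination of the other flags are illustrative choices, not something this commit prescribes.

```c
#include <assert.h>
#include <linux/net_tstamp.h>
#include <sys/socket.h>

/* Hypothetical helper: ask for software transmit timestamps that are
 * looped back on the error queue without the original payload.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int demo_enable_tx_timestamps_no_payload(int fd)
{
    unsigned int flags = SOF_TIMESTAMPING_TX_SOFTWARE   /* generate tx timestamps */
                       | SOF_TIMESTAMPING_SOFTWARE      /* report software timestamps */
                       | SOF_TIMESTAMPING_OPT_TSONLY;   /* loop timestamp only, no payload */

    return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof(flags));
}
```

A caller would then read the timestamps from the socket error queue with `recvmsg()` and `MSG_ERRQUEUE`; as the message notes, payload inspection is no longer possible on those looped packets.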
2015-02-03  sunvnet: set queue mapping when doing packet copies  David L Stevens  1  -0/+1
This patch fixes a bug where vnet_skb_shape() didn't set the already-selected queue mapping when a packet copy was required. This results in using the wrong queue index for stops/starts, hung tx queues and watchdog timeouts under heavy load. Signed-off-by: David L Stevens <david.stevens@oracle.com> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-03  qlge: Fix qlge_update_hw_vlan_features to handle if interface is down  Marcelo Leitner  1  -10/+16
Currently qlge_update_hw_vlan_features() will always first put the interface down, then update features and then bring it up again. But it is possible to hit this code while the adapter is down and this causes a non-paired call to napi_disable(), which will get stuck. This patch fixes it by skipping these down/up actions if the interface is already down. Fixes: a45adbe8d352 ("qlge: Enhance nested VLAN (Q-in-Q) handling.") Cc: Harish Patil <harish.patil@qlogic.com> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
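The guard described here can be modeled with a small state machine. This is a sketch under stated assumptions, not the qlge driver: `struct demo_adapter`, `demo_down()`, `demo_up()` and `demo_update_features()` are hypothetical, and the `napi_disable_depth` counter merely stands in for unmatched `napi_disable()` calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical adapter model. */
struct demo_adapter {
    bool running;            /* interface is up */
    int napi_disable_depth;  /* disable calls not yet matched by an enable */
    unsigned int features;
};

void demo_down(struct demo_adapter *a)
{
    a->napi_disable_depth++;
    a->running = false;
}

void demo_up(struct demo_adapter *a)
{
    a->napi_disable_depth--;
    a->running = true;
}

/* Bounce the interface only if it was running; when it is already
 * down, a second "down" would be an unpaired disable. */
void demo_update_features(struct demo_adapter *a, unsigned int features)
{
    assert(a != NULL);
    bool need_restart = a->running;

    if (need_restart)
        demo_down(a);
    a->features = features;
    if (need_restart)
        demo_up(a);
}
```

For an adapter that starts down with a depth of 1, the depth stays at 1 after the update, whereas an unconditional down/up pair would first push it to 2.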
2015-02-03  Merge tag 'drm-amdkfd-fixes-2015-02-02' of git://people.freedesktop.org/~gabbayo/linux into drm-fixes  Dave Airlie  3  -4/+8
Three small fixes that came up during last week, nothing scary: - Accidentally incremented a counter instead of decrementing it (copy-paste error) - Module parameter of max num of queues must be at least 1 and not 0 - Don't do BUG() as a result of wrong user input * tag 'drm-amdkfd-fixes-2015-02-02' of git://people.freedesktop.org/~gabbayo/linux: drm/amdkfd: Don't create BUG due to incorrect user parameter drm/amdkfd: max num of queues can't be 0 drm/amdkfd: Fix bug in accounting of queues
2015-02-03  Merge branch 'drm-fixes-3.19' of git://people.freedesktop.org/~agd5f/linux into drm-fixes  Dave Airlie  6  -21/+31
One last round of fixes for radeon for 3.19: - fix some fallout from the reservation object integration on the test/benchmark options - fix a crash in the gpu vm code if gfx init fails - fix a pll issue that leads to a blank screen on older IGP parts * 'drm-fixes-3.19' of git://people.freedesktop.org/~agd5f/linux: drm/radeon: fix the crash in test functions drm/radeon: fix the crash in benchmark functions drm/radeon: properly set vm fragment size for TN/RL drm/radeon: don't init gpuvm if accel is disabled (v3) drm/radeon: fix PLLs on RS880 and older v2
2015-02-02  Bluetooth: Remove mgmt_rp_read_local_oob_ext_data struct  Johan Hedberg  2  -20/+9
This extended return parameters struct conflicts with the new Read Local OOB Extended Data command definition. To avoid the conflict simply rename the old "extended" version to the normal one and update the code appropriately to take into account the two possible response PDU sizes. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-02-02  drm/radeon: fix the crash in test functions  Ilija Hadzic  1  -4/+4
radeon_copy_dma and radeon_copy_blit must be called with a valid reservation object. Otherwise a crash will be provoked. We borrow the object from vram BO. bug: https://bugs.freedesktop.org/show_bug.cgi?id=88464 Cc: stable@vger.kernel.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-02  drm/radeon: fix the crash in benchmark functions  Ilija Hadzic  1  -5/+8
radeon_copy_dma and radeon_copy_blit must be called with a valid reservation object. Otherwise a crash will be provoked. We borrow the object from destination BO. bug: https://bugs.freedesktop.org/show_bug.cgi?id=88464 Cc: stable@vger.kernel.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-02  drm/radeon: properly set vm fragment size for TN/RL  Alex Deucher  1  -2/+4
Should be the same as cayman. We don't use VM by default on NI parts so this isn't critical. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2015-02-02  drm/radeon: don't init gpuvm if accel is disabled (v3)  Alex Deucher  2  -10/+12
If acceleration is disabled, it does not make sense to init gpuvm since nothing will use it. Moreover, if radeon_vm_init() gets called it uses accel to try and clear the pde tables, etc. which results in a bug. v2: handle vm_fini as well v3: handle bo_open/close as well Bug: https://bugs.freedesktop.org/show_bug.cgi?id=88786 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2015-02-02  drm/radeon: fix PLLs on RS880 and older v2  Christian König  1  -0/+3
This is a workaround for RS880 and older chips which seem to have an additional limit on the minimum PLL input frequency. v2: fix signed/unsigned warning bugs: https://bugzilla.kernel.org/show_bug.cgi?id=91861 https://bugzilla.kernel.org/show_bug.cgi?id=83461 Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2015-02-02  PCI: Add NEC variants to Stratus ftServer PCIe DMI check  Charlotte Richardson  1  -0/+16
NEC OEMs the same platforms as Stratus does, which have multiple devices on some PCIe buses under downstream ports. Link: https://bugzilla.kernel.org/show_bug.cgi?id=51331 Fixes: 1278998f8ff6 ("PCI: Work around Stratus ftServer broken PCIe hierarchy (fix DMI check)") Signed-off-by: Charlotte Richardson <charlotte.richardson@stratus.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org # v3.5+ CC: Myron Stowe <myron.stowe@redhat.com>
2015-02-02  sd: Fix max transfer length for 4k disks  Brian King  1  -2/+4
The following patch fixes an issue observed with 4k sector disks where the max_hw_sectors attribute was getting set too large in sd_revalidate_disk. Since sdkp->max_xfer_blocks is in units of SCSI logical blocks and queue_max_hw_sectors is in units of 512 byte blocks, on a 4k sector disk, every time we went through sd_revalidate_disk, we were taking the current value of queue_max_hw_sectors and increasing it by a factor of 8. Fix this by only shifting sdkp->max_xfer_blocks. Cc: stable@vger.kernel.org Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
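The unit mix-up can be reproduced with plain arithmetic. The sketch below is a simplified model, not the code in `sd_revalidate_disk`: all function names are hypothetical, the limits are example values, and the zero-handling of the kernel's min helper is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a count of logical blocks into 512-byte sectors. */
uint32_t demo_blocks_to_512(uint32_t blocks, uint32_t sector_size)
{
    /* sector_size must be a power of two and at least 512 */
    assert(sector_size >= 512 && (sector_size & (sector_size - 1)) == 0);
    return blocks * (sector_size / 512);
}

/* Buggy shape: take the minimum first, then scale. When the queue
 * limit (already in 512-byte units) wins the minimum, it is scaled
 * again, so on a 4k disk every pass multiplies it by 8. */
uint32_t demo_max_hw_sectors_buggy(uint32_t queue_limit_512,
                                   uint32_t max_xfer_blocks,
                                   uint32_t sector_size)
{
    uint32_t m = max_xfer_blocks < queue_limit_512 ? max_xfer_blocks
                                                   : queue_limit_512;
    return demo_blocks_to_512(m, sector_size);
}

/* Fixed shape: scale only the device's logical-block limit, then
 * compare like with like. */
uint32_t demo_max_hw_sectors_fixed(uint32_t queue_limit_512,
                                   uint32_t max_xfer_blocks,
                                   uint32_t sector_size)
{
    uint32_t xfer_512 = demo_blocks_to_512(max_xfer_blocks, sector_size);
    return xfer_512 < queue_limit_512 ? xfer_512 : queue_limit_512;
}
```

With a queue limit of 1024 sectors and a 4096-byte sector size, the buggy shape yields 8192 on the first pass and 65536 on the second if its result is fed back in, while the fixed shape keeps returning 1024.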
2015-02-02  scsi: fix device handler detach oops  Mike Christie  1  -1/+2
This fixes a regression caused by commit 1d5203 ("scsi: handle more device handler setup/teardown in common code"). The bug is that the alua detach() callout will try to access the sddev->scsi_dh_data, but we have already set it to NULL. This patch moves the clearing of that field to after detach() is called. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Christoph Hellwig <hch@lst.de>
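The ordering issue is a general teardown pattern: run the callout while the data it needs is still reachable, and clear the pointer afterwards. The sketch below illustrates only that ordering; `struct demo_device`, `struct demo_dh_data` and both functions are hypothetical stand-ins, not the SCSI device handler code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device handler state. */
struct demo_dh_data {
    int detached;
};

struct demo_device {
    struct demo_dh_data *dh_data;
};

/* Stand-in for a handler's detach() callout, which still needs to
 * read the handler data. */
void demo_handler_detach(struct demo_device *dev)
{
    assert(dev->dh_data != NULL);
    dev->dh_data->detached = 1;
}

void demo_device_detach(struct demo_device *dev)
{
    demo_handler_detach(dev);  /* callout runs while dh_data is valid */
    dev->dh_data = NULL;       /* clear the field only afterwards */
}
```

Swapping the two statements in `demo_device_detach()` would hand the callout a null pointer, which corresponds to the oops described in the message.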
2015-02-02  Bluetooth: Set HCI_QUIRK_STRICT_DUPLICATE_FILTER for BTUSB_INTEL_NEW  Marcel Holtmann  1  -0/+1
The Intel Snowfield Peak Bluetooth controllers use a strict scanning filter policy that filters based on Bluetooth device addresses and not on RSSI. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-02-02  Bluetooth: Add restarting to service discovery  Jakub Pawlowski  2  -5/+49
When using LE_SCAN_FILTER_DUP_ENABLE, some controllers would send advertising report from each LE device only once. That means that we don't get any updates on RSSI value, and makes Service Discovery very slow. This patch adds restarting scan when in Service Discovery, and device with filtered uuid is found, but it's not in RSSI range to send event yet. This way if device moves into range, we will quickly get RSSI update. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-02-02  Bluetooth: Add le_scan_restart work for LE scan restarting  Jakub Pawlowski  3  -1/+97
Currently there is no way to restart an LE scan, and it's needed in the service scan method. The way it works: it disables and then re-enables LE scan on the controller. During the restart, we must remember when the scan was started, and its duration, to later re-schedule the le_scan_disable work that was stopped during the stop scan phase. Signed-off-by: Jakub Pawlowski <jpawlowski@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
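The bookkeeping for re-scheduling the disable work comes down to computing how much of the original scan window is left. A minimal sketch, assuming a monotonically increasing tick counter with no wraparound (the kernel's jiffies arithmetic is more careful); `demo_scan_time_remaining()` is a hypothetical helper, not a function from the Bluetooth stack.

```c
#include <assert.h>
#include <stdint.h>

/* Given when the scan started, how long it was meant to run, and the
 * current time (all in the same tick unit), return how long the
 * re-armed disable timer should wait. */
uint64_t demo_scan_time_remaining(uint64_t scan_start, uint64_t duration,
                                  uint64_t now)
{
    assert(now >= scan_start);
    uint64_t elapsed = now - scan_start;

    /* If the window has already passed, disable immediately. */
    return elapsed >= duration ? 0 : duration - elapsed;
}
```

For a scan started at tick 100 with a duration of 50, a restart at tick 120 leaves 30 ticks before the scan should be disabled.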
2015-02-02  drm/amdkfd: Don't create BUG due to incorrect user parameter  Oded Gabbay  1  -1/+5
This patch changes a BUG_ON() statement to pr_debug, in case the user tries to update a non-existing queue. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Ben Goz <ben.goz@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-02  drm/amdkfd: max num of queues can't be 0  Oded Gabbay  1  -2/+2
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-02  drm/amdkfd: Fix bug in accounting of queues  Oded Gabbay  1  -1/+1
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-02-02  net: rocker: Add support for retrieving port level statistics  David Ahern  2  -0/+155
Add support for retrieving port level statistics from device. Hook is added for ethtool's stats functionality. For example, $ ethtool -S eth3 NIC statistics: rx_packets: 12 rx_bytes: 2790 rx_dropped: 0 rx_errors: 0 tx_packets: 8 tx_bytes: 728 tx_dropped: 0 tx_errors: 0 Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02  Merge branch 'switchdev_offload_flags'  David S. Miller  12  -21/+206
Roopa Prabhu says: ==================== switchdev offload flags This patch series introduces new offload flags for switchdev. Kernel network subsystems can use this flag to accelerate network functions by offloading to hw. I expect that there will be need for subsystem specific feature flag in the future. This patch series currently only addresses bridge driver link attribute offloads to hardware. Looking at the current state of bridge l2 offload in the kernel, - flag 'self' is the way to directly manage the bridge device in hw via the ndo_bridge_setlink/ndo_bridge_getlink calls - flag 'master' is always used to manage the in kernel bridge devices via the same ndo_bridge_setlink/ndo_bridge_getlink calls Today these are used separately. The nic offloads use hwmode "vepa/veb" to go directly to hw with the "self" flag. At this point i am trying not to introduce any new user facing flags/attributes. In the model where we want the kernel bridging to be accelerated with hardware, we very much want the bridge driver to be involved. In this proposal, - The offload flag/bit helps switch asic drivers to indicate that they accelerate the kernel networking objects/functions - The user does not have to specify a new flag to do so. A bridge created with switch asic ports will be accelerated if the switch driver supports it. - The user can continue to directly manage l2 in nics (ixgbe) using the existing hwmode/self flags - It also does not stop users from using the 'self' flag to talk to the switch asic driver directly - Involving the bridge driver makes sure the add/del notifications to user space go out after both kernel and hardware are programmed (To selectively offload bridge port attributes, example learning in hw only etc, we can introduce offload bits for per bridge port flag attribute as in my previous patch https://patchwork.ozlabs.org/patch/413211/. 
I have not included that in this series) v2 - try a different name for the offload flag/bit - tries to solve the stacked netdev case by traversing the lowerdev list to reach the switch port v3 - - Tested with bond as bridge port for the stacked device case. Includes a bond_fix_features change to not ignore the NETIF_F_HW_NETFUNC_OFFLOAD flag - Some checkpatch fixes v4 - - rename flag to NETIF_F_HW_SWITCH_OFFLOAD - add ndo_bridge_setlink/dellink handlers in bond and team drivers as suggested by jiri. - introduce default ndo_dflt_netdev_switch_port_bridge_setlink/dellink handlers that masters can use to call offload api on lowerdevs. ==================== Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
2015-02-02  team: handle NETIF_F_HW_SWITCH_OFFLOAD flag and add ndo_bridge_setlink/dellink handlers  Roopa Prabhu  1  -1/+4
Currently ndo_bridge_setlink and ndo_bridge_dellink handlers point to the default switchdev handlers. This follows my bonding driver changes. I have only compile tested this patch. However similar bonding code has been tested. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02  bonding: handle NETIF_F_HW_SWITCH_OFFLOAD flag and add ndo_bridge_setlink/dellink handlers  Roopa Prabhu  1  -1/+8
We want bond to pick up the offload flag if any of its slaves have it. NETIF_F_HW_SWITCH_OFFLOAD flag is added to the mask, so that netdev_increment_features does not ignore it. This also adds ndo_bridge_setlink and ndo_bridge_dellink handlers. These currently point to the default handlers provided by the switchdev api. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02  rocker: set feature NETIF_F_HW_SWITCH_OFFLOAD  Roopa Prabhu  1  -1/+2
This patch sets the NETIF_F_HW_SWITCH_OFFLOAD feature flag on rocker ports Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02  bridge: offload bridge port attributes to switch asic if feature flag set  Roopa Prabhu  1  -3/+23
This patch adds support to set/del bridge port attributes in hardware from the bridge driver. With this, when the user sends a bridge setlink message with no flags or master flags set, - the bridge driver ndo_bridge_setlink handler sets settings in the kernel - calls the switchdev api to propagate the attrs to the switchdev hardware You can still use the self flag to go to the switch hw or switch port driver directly. With this, it also makes sure a notification goes out only after the attributes are set both in the kernel and hw. The patch calls switchdev api only if BRIDGE_FLAGS_SELF is not set. This is because the offload cases with BRIDGE_FLAGS_SELF are handled in the caller (in rtnetlink.c). Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02  swdevice: add new apis to set and del bridge port attributes  Roopa Prabhu  2  -1/+146
This patch adds two new api's netdev_switch_port_bridge_setlink and netdev_switch_port_bridge_dellink to offload bridge port attributes to switch port (The names of the apis look odd with 'switch_port_bridge', but am more inclined to change the prefix of the api to something else. Will take any suggestions). The api's look at the NETIF_F_HW_SWITCH_OFFLOAD feature flag to pass bridge port attributes to the port device. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02  bridge: add flags argument to ndo_bridge_setlink and ndo_bridge_dellink  Roopa Prabhu  7  -13/+18
Bridge flags are needed inside ndo_bridge_setlink/dellink handlers to avoid another call to parse IFLA_AF_SPEC inside these handlers. This is used later in this series. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02  netdev: introduce new NETIF_F_HW_SWITCH_OFFLOAD feature flag for switch device offloads  Roopa Prabhu  1  -1/+5
This is a high level feature flag for all switch asic offloads. Switch drivers set this flag on switch ports. Logical devices like bridge, bonds, vxlans can inherit this flag from their slaves/ports. The patch also adds the flag to NETIF_F_ONE_FOR_ALL, so that it gets propagated to the upperdevices (bridges and bonds). Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02  stmmac: DMA threshold mode or SF mode can be different among multiple device instance  Sonic Zhang  1  -2/+3
- In tx_hard_error_bump_tc interrupt, tc should be bumped only when current device instance is in DMA threshold mode. Check per device xstats.threshold other than global tc. - Set per device xstats.threshold to SF_DMA_MODE when current device instance is set to SF mode. v2-changes: - fix ident style Signed-off-by: Sonic Zhang <sonic.zhang@analog.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02  ipv4: tcp: get rid of ugly unicast_sock  Eric Dumazet  4  -33/+37
In commit be9f4a44e7d41 ("ipv4: tcp: remove per net tcp_sock") I tried to address contention on a socket lock, but the solution I chose was horrible : commit 3a7c384ffd57e ("ipv4: tcp: unicast_sock should not land outside of TCP stack") addressed a selinux regression. commit 0980e56e506b ("ipv4: tcp: set unicast_sock uc_ttl to -1") took care of another regression. commit b5ec8eeac46 ("ipv4: fix ip_send_skb()") fixed another regression. commit 811230cd85 ("tcp: ipv4: initialize unicast_sock sk_pacing_rate") was another shot in the dark. Really, just use a proper socket per cpu, and remove the skb_orphan() call, to re-enable flow control. This solves a serious problem with FQ packet scheduler when used in hostile environments, as we do not want to allocate a flow structure for every RST packet sent in response to a spoofed packet. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>