summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2013-02-07netpoll: protect napi_poll and poll_controller during dev_[open|close]Neil Horman1-1/+10
Ivan Vercera was recently backporting commit 9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that, while this patch protects the tg3 driver from having its ndo_poll_controller routine called during device initalization, it does nothing for the driver during shutdown. I.e. it would be entirely possible to have the ndo_poll_controller method (or subsequently the ndo_poll) routine called for a driver in the netpoll path on CPU A while in parallel on CPU B, the ndo_close or ndo_open routine could be called. Given that the two latter routines tend to initizlize and free many data structures that the former two rely on, the result can easily be data corruption or various other crashes. Furthermore, it seems that this is potentially a problem with all net drivers that support netpoll, and so this should ideally be fixed in a common path. As Ben H Pointed out to me, we can't preform dev_open/dev_close in atomic context, so I've come up with this solution. We can use a mutex to sleep in open/close paths and just do a mutex_trylock in the napi poll path and abandon the poll attempt if we're locked, as we'll just retry the poll on the next send anyway. I've tested this here by flooding netconsole with messages on a system whos nic driver I modfied to periodically return NETDEV_TX_BUSY, so that the netpoll tx workqueue would be forced to send frames and poll the device. While this was going on I rapidly ifdown/up'ed the interface and watched for any problems. I've not found any. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Ivan Vecera <ivecera@redhat.com> CC: "David S. Miller" <davem@davemloft.net> CC: Ben Hutchings <bhutchings@solarflare.com> CC: Francois Romieu <romieu@fr.zoreil.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05tcp: remove Appropriate Byte Count supportStephen Hemminger2-2/+0
TCP Appropriate Byte Count was added by me, but later disabled. There is no point in maintaining it since it is a potential source of bugs and Linux already implements other better window protection heuristics. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-11/+13
Conflicts: drivers/net/ethernet/intel/e1000e/ethtool.c drivers/net/vmxnet3/vmxnet3_drv.c drivers/net/wireless/iwlwifi/dvm/tx.c net/ipv6/route.c The ipv6 route.c conflict is simple, just ignore the 'net' side change as we fixed the same problem in 'net-next' by eliminating cached neighbours from ipv6 routes. The e1000e conflict is an addition of a new statistic in the ethtool code, trivial. The vmxnet3 conflict is about one change in 'net' removing a guarding conditional, whilst in 'net-next' we had a netdev_info() conversion. The iwlwifi conflict is dealing with a WARN_ON() conversion in 'net-next' vs. a revert happening in 'net'. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-03Merge branch 'delete-wanrouter' of ↵David S. Miller4-703/+8
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Paul Gortmaker says: ==================== The removal of wanrouter code was originally listed in the (now gone) feature removal file since May 2012, and an RFC of the deletion was posted[1] in late 2012. The overall concept was given an OK, but defconfig contamination, build failures, etc. meant that it didn't quite make it into mainline for 3.8. Since that time, Dan discovered (via code audit) a runtime bug that proves nobody has been using this for over four years[2]. With that new information, I think it makes sense for someone to follow through on Joe's original RFC and get this done for the 3.9 release. In addition to resolving the build failures of the RFC by keeping stub headers, this also splits the change into two parts, just like the token ring removal did. Part #1 decouples the mainline kernel from the expired subsystem, and part #2 does the large scale deletion of the subsystem content. The advantage of the above, is that a "git blame" will never lead you to a 4000+ line deletion commit. The large scale deletion will never show up in a "git blame" and hence the same advantages that we get from the "--irreversible-delete" in the review stage of "git format-patch" are also embedded into the git history itself. This may seem like a moot point to some, but for those who spend a considerable amount of time data mining in the git history, this is probably worth doing. I have done build tests of all[mod/yes]config for both the stage 1 (Makefile and Kconfig) and stage 2 (full driver delete) as a sanity check, and the issues with the previously posted RFC should be gone. Speaking of "--irreversible-delete" -- these patches were created with that option, so if you want to use them locally, you are going to have to pull (location below) the content instead of doing a "git am" of the mailed out content. [1] http://patchwork.ozlabs.org/patch/198794/ [2] http://www.spinics.net/lists/netdev/msg218670.html ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-01wanrouter: delete now orphaned header content, files/driversPaul Gortmaker4-703/+8
The wanrouter support was identified earlier as unused for years, and so the previous commit totally decoupled it from the kernel, leaving the related wanrouter files present, but totally inert. Here we take the final step in that cleanup, by doing a wholesale removal of these files. The two step process is used so that the large deletion is decoupled from the git history of files that we still care about. The drivers deleted here all were dependent on the Kconfig setting CONFIG_WAN_ROUTER_DRIVERS. A stub wanrouter.h header (kernel & uapi) are left behind so that drivers/isdn/i4l/isdn_x25iface.c continues to compile, and so that we don't accidentally break userspace that expected these defines. Cc: Joe Perches <joe@perches.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2013-01-31ipv6: rename datagram_send_ctl and datagram_recv_ctlTom Parkin1-11/+11
The datagram_*_ctl functions in net/ipv6/datagram.c are IPv6-specific. Since datagram_send_ctl is publicly exported it should be appropriately named to reflect the fact that it's for IPv6 only. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-31net/mlx4_en: Don't reassign port mac address on firmware that supports itMatan Barak1-1/+2
Mac reassignments should only be done when not supported by the firmware. To accomplish that, checking firmware capability bit to know whether we should reassign macs in the driver. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-31ipv6 flowlabel: Convert np->ipv6_fl_list to RCU.YOSHIFUJI Hideaki / 吉藤英明2-1/+2
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-31ipv6 flowlabel: Convert hash list to RCU.YOSHIFUJI Hideaki / 吉藤英明1-0/+1
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-31net: usbnet: prevent buggy devices from killing usBjørn Mork1-0/+2
A device sending 0 length frames as fast as it can has been observed killing the host system due to the resulting memory pressure. Temporarily disable RX skb allocation and URB submission when the current error ratio is high, preventing us from trying to allocate an infinite number of skbs. Reenable as soon as we are finished processing the done queue, allowing the device to continue working after short error bursts. Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-30xfrm: Convert xfrm_addr_cmp() to boolean xfrm_addr_equal().YOSHIFUJI Hideaki / 吉藤英明1-13/+12
All users of xfrm_addr_cmp() use its result as boolean. Introduce xfrm_addr_equal() (which is equal to !xfrm_addr_cmp()) and convert all users. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-30xfrm: Use ipv6_addr_equal() where appropriate.YOSHIFUJI Hideaki / 吉藤英明1-3/+10
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller21-54/+185
Bring in the 'net' tree so that we can get some ipv4/ipv6 bug fixes that some net-next work will build upon. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29ipv4: introduce address lifetimeJiri Pirko2-0/+10
There are some usecase when lifetime of ipv4 addresses might be helpful. For example: 1) initramfs networkmanager uses a DHCP daemon to learn network configuration parameters 2) initramfs networkmanager addresses, routes and DNS configuration 3) initramfs networkmanager is requested to stop 4) initramfs networkmanager stops all daemons including dhclient 5) there are addresses and routes configured but no daemon running. If the system doesn't start networkmanager for some reason, addresses and routes will be used forever, which violates RFC 2131. This patch is essentially a backport of ivp6 address lifetime mechanism for ipv4 addresses. Current "ip" tool supports this without any patch (since it does not distinguish between ipv4 and ipv6 addresses in this perspective. Also, this should be back-compatible with all current netlink users. Reported-by: Pavel Šimerda <psimerda@redhat.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net: frag, move LRU list maintenance outside of rwlockJesper Dangaard Brouer1-0/+22
Updating the fragmentation queues LRU (Least-Recently-Used) list, required taking the hash writer lock. However, the LRU list isn't tied to the hash at all, so we can use a separate lock for it. Original-idea-by: Florian Westphal <fw@strlen.de> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net: use lib/percpu_counter API for fragmentation mem accountingJesper Dangaard Brouer1-8/+18
Replace the per network namespace shared atomic "mem" accounting variable, in the fragmentation code, with a lib/percpu_counter. Getting percpu_counter to scale to the fragmentation code usage requires some tweaks. At first view, percpu_counter looks superfast, but it does not scale on multi-CPU/NUMA machines, because the default batch size is too small, for frag code usage. Thus, I have adjusted the batch size by using __percpu_counter_add() directly, instead of percpu_counter_sub() and percpu_counter_add(). The batch size is increased to 130.000, based on the largest 64K fragment memory usage. This does introduce some imprecise memory accounting, but its does not need to be strict for this use-case. It is also essential, that the percpu_counter, does not share cacheline with other writers, to make this scale. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net: frag helper functions for mem limit trackingJesper Dangaard Brouer2-1/+28
This change is primarily a preparation to ease the extension of memory limit tracking. The change does reduce the number atomic operation, during freeing of a frag queue. This does introduce a some performance improvement, as these atomic operations are at the core of the performance problems seen on NUMA systems. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net: cacheline adjust struct inet_frag_queueJesper Dangaard Brouer1-4/+5
Fragmentation code cacheline adjusting of struct inet_frag_queue. Take advantage of the size of struct timer_list, and move all but spinlock_t lock, below the timer struct. On 64-bit 'lru_list', 'list' and 'refcnt', fits exactly into the next cacheline, and a new cacheline starts at 'fragments'. The netns_frags *net pointer is moved to the end of the struct, because its used in a compare, with "next/close-by" elements of which this struct is embedded into. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net: cacheline adjust struct inet_frags for better frag performanceJesper Dangaard Brouer1-4/+7
The globally shared rwlock, of struct inet_frags, shares cacheline with the 'rnd' number, which is used by the hash calculations. Fix this, as this obviously is a bad idea, as unnecessary cache-misses will occur when accessing the 'rnd' number. Also small note that, moving function ptr (*match) up in struct, is to avoid it lands on the next cacheline (on 64-bit). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net: cacheline adjust struct netns_frags for better frag performanceJesper Dangaard Brouer1-1/+4
This small cacheline adjustment of struct netns_frags improves performance significantly for the fragmentation code. Struct members 'lru_list' and 'mem' are both hot elements, and it hurts performance, due to cacheline bouncing at every call point, when they share a cacheline. Also notice, how mem is placed together with 'high_thresh' and 'low_thresh', as they are used in the compare operations together. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net neigh: Optimize neighbor entry size calculation.YOSHIFUJI Hideaki / 吉藤英明1-1/+1
When allocating memory for neighbour cache entry, if tbl->entry_size is not set, we always calculate sizeof(struct neighbour) + tbl->key_len, which is common in the same table. With this change, set tbl->entry_size during the table initialization phase, if it was not set, and use it in neigh_alloc() and neighbour_priv(). This change also allow us to have both of protocol private data and device priate data at tha same time. Note that the only user of prototcol private is DECnet and the only user of device private is ATM CLIP. Since those are exclusive, we have not been facing issues here. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29drivers/net/phy/micrel_phy: Add support for new PHYsDavid J. Choi1-1/+8
Summary of changes: .Newly added phys -KSZ8081/KSZ8091, which has some phy ids. -KSZ8061 -KSZ9031, which is Gigabit phy. -KSZ886X, which has a switch function. -KSZ8031, which has a same phy ids with KSZ8021. Signed-off-by: David J. Choi <david.choi@micrel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29netpoll: add RCU annotation to npinfo fieldCong Wang1-1/+1
dev->npinfo is protected by RCU. This fixes the following sparse warnings: net/core/netpoll.c:177:48: error: incompatible types in comparison expression (different address spaces) net/core/netpoll.c:200:35: error: incompatible types in comparison expression (different address spaces) net/core/netpoll.c:221:35: error: incompatible types in comparison expression (different address spaces) net/core/netpoll.c:327:18: error: incompatible types in comparison expression (different address spaces) Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29Merge branch 'for-davem' of ↵David S. Miller11-165/+409
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== Included is an NFC pull. Samuel says: "It brings the following goodies: - LLCP socket timestamping (To be used e.g with the recently released nfctool application for a more efficient skb timestamping when sniffing). - A pretty big pn533 rework from Waldemar, preparing the driver to support more flavours of pn533 based devices. - HCI changes from Eric in preparation for the microread driver support. - Some LLCP memory leak fixes, cleanups and slight improvements. - pn544 and nfcwilink move to the devm_kzalloc API. - An initial Secure Element (SE) API. - An nfc.h license change from the original author, allowing non GPL application code to safely include it." Also included are a pair of mac80211 pulls. Johannes says: "We found two bugs in the previous code, so I'm sending you a pull request again this soon. This contains two regulatory bug fixes, some of Thomas's hwsim beacon timer work and a documentation fix from Bob." "Another pull request for mac80211-next. This time, I have a number of things, the patches are mostly self-explanatory. There are a few fixes from Felix and myself, and random cleanups & improvements. The biggest thing is the partial patchset from Marco preparing for mesh powersave." Additionally, there are a pair of iwlwifi pulls. Johannes says: "For iwlwifi-next, I have a few cleanups/improvements as well as a few not very important fixes and more preparations for new devices." "Please pull a few updates for iwlwifi. These are just some cleanups and a debug improvement." On top of that, there is a slew of driver updates. This includes brcmfmac, mwifiex, ath9k, carl9170, and mwl8k as well as a handful of others. The bcma and ssb busses get some attention as well. Still, I don't see any big headliners here. Also included is a pull of the wireless tree, in order to resolve some merge conflicts. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29can: rework skb reserved data handlingOliver Hartkopp1-0/+10
Added accessor and skb_reserve helpers for struct can_skb_priv. Removed pointless skb_headroom() check. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> CC: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28Merge tag 'mfd-for-linus-3.8-1' of ↵Linus Torvalds6-35/+93
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6 Pull MFD fixes from Samuel Ortiz: "This is the first pull request for MFD fixes for 3.8 We have some build failure fixes (twl4030, vexpress, abx500 and tps65910), some actual runtime oops and lockup fixes (rtsx, da9052), and some more hypothetical NULL pointers dereferences fixes for pcf50633 and max776xx. Then we also have additional rtsx fixes for a correct switch output voltage and clock divider correctness for rtl8411 (rtsx driver), and irqdomain fix for db8550-prcmu, and some more cosmetic fixes for arizona and wm5102." * tag 'mfd-for-linus-3.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: mfd: rtsx: Fix oops when rtsx_pci_sdmmc is not probed mfd: wm5102: Fix definition of WM5102_MAX_REGISTER mfd: twl4030: Don't warn about uninitialized return code mfd: da9052/53 lockup fix mfd: rtsx: Add clock divider hook mmc: rtsx: Call MFD hook to switch output voltage mfd: rtsx: Add output voltage switch hook mfd: Fix compile errors and warnings when !CONFIG_AB8500_BM mfd: vexpress: Export global functions to fix build error mfd: arizona: Check errors from regcache_sync() mfd: tc3589x: Use simple irqdomain mfd: pcf50633: Init pcf->dev before using it mfd: max77693: Init max77693->dev before using it mfd: max77686: Init max77686->dev before using it mfd: db8500-prcmu: Fix irqdomain usage mfd: tps65910: Select REGMAP_IRQ in Kconfig to fix build error mfd: arizona: Disable control interface reporting for WM5102 and WM5110
2013-01-28Merge branch 'master' of ↵John W. Linville11-165/+409
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
2013-01-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds4-13/+51
Pull networking updates from David Miller: "Much more accumulated than I would have liked due to an unexpected bout with a nasty flu: 1) AH and ESP input don't set ECN field correctly because the transport head of the SKB isn't set correctly, fix from Li RongQing. 2) If netfilter conntrack zones are disabled, we can return an uninitialized variable instead of the proper error code. Fix from Borislav Petkov. 3) Fix double SKB free in ath9k driver beacon handling, from Felix Feitkau. 4) Remove bogus assumption about netns cleanup ordering in nf_conntrack, from Pablo Neira Ayuso. 5) Remove a bogus BUG_ON in the new TCP fastopen code, from Eric Dumazet. It uses spin_is_locked() in it's test and is therefore unsuitable for UP. 6) Fix SELINUX labelling regressions added by the tuntap multiqueue changes, from Paul Moore. 7) Fix CRC errors with jumbo frame receive in tg3 driver, from Nithin Nayak Sujir. 8) CXGB4 driver sets interrupt coalescing parameters only on first queue, rather than all of them. Fix from Thadeu Lima de Souza Cascardo. 9) Fix regression in the dispatch of read/write registers in dm9601 driver, from Tushar Behera. 10) ipv6_append_data miscalculates header length, from Romain KUNTZ. 11) Fix PMTU handling regressions on ipv4 routes, from Steffen Klassert, Timo Teräs, and Julian Anastasov. 12) In 3c574_cs driver, add necessary parenthesis to "x << y & z" expression. From Nickolai Zeldovich. 13) macvlan_get_size() causes underallocation netlink message space, fix from Eric Dumazet. 14) Avoid division by zero in xfrm_replay_advance_bmp(), from Nickolai Zeldovich. Amusingly the zero check was already there, we were just performing it after the modulus :-) 15) Some more splice bug fixes from Eric Dumazet, which fix things mostly eminating from how we now more aggressively use high-order pages in SKBs. 16) Fix size calculation bug when freeing hash tables in the IPSEC xfrm code, from Michal Kubecek. 17) Fix PMTU event propagation into socket cached routes, from Steffen Klassert. 18) Fix off by one in TX buffer release in netxen driver, from Eric Dumazet. 19) Fix rediculous memory allocation requirements introduced by the tuntap multiqueue changes, from Jason Wang. 20) Remove bogus AMD platform workaround in r8169 driver that causes major problems in normal operation, from Timo Teräs. 21) virtio-net set affinity and select queue don't handle discontiguous cpu numbers properly, fix from Wanlong Gao. 22) Fix a route refcounting issue in loopback driver, from Eric Dumazet. There's a similar fix coming that we might add to the macvlan driver as well. 23) Fix SKB leaks in batman-adv's distributed arp table code, from Matthias Schiffer. 24) r8169 driver gives descriptor ownership back the hardware before we're done reading the VLAN tag out of it, fix from Francois Romieu. 25) Checksums not calculated properly in GRE tunnel driver fix from Pravin B Shelar. 26) Fix SCTP memory leak on namespace exit." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (101 commits) dm9601: support dm9620 variant SCTP: Free the per-net sysctl table on net exit. v2 net: phy: icplus: fix broken INTR pin settings net: phy: icplus: Use the RGMII interface mode to configure clock delays IP_GRE: Fix kernel panic in IP_GRE with GRE csum. sctp: set association state to established in dupcook_a handler ip6mr: limit IPv6 MRT_TABLE identifiers r8169: fix vlan tag read ordering. net: cdc_ncm: use IAD provided by the USB core batman-adv: filter ARP packets with invalid MAC addresses in DAT batman-adv: check for more types of invalid IP addresses in DAT batman-adv: fix skb leak in batadv_dat_snoop_incoming_arp_reply() net: loopback: fix a dst refcounting issue virtio-net: reset virtqueue affinity when doing cpu hotplug virtio-net: split out clean affinity function virtio-net: fix the set affinity bug when CPU IDs are not consecutive can: pch_can: fix invalid error codes can: ti_hecc: fix invalid error codes can: c_can: fix invalid error codes r8169: remove the obsolete and incorrect AMD workaround ...
2013-01-28net: fix possible wrong checksum generationEric Dumazet1-0/+19
Pravin Shelar mentioned that GSO could potentially generate wrong TX checksum if skb has fragments that are overwritten by the user between the checksum computation and transmit. He suggested to linearize skbs but this extra copy can be avoided for normal tcp skbs cooked by tcp_sendmsg(). This patch introduces a new SKB_GSO_SHARED_FRAG flag, set in skb_shinfo(skb)->gso_type if at least one frag can be modified by the user. Typical sources of such possible overwrites are {vm}splice(), sendfile(), and macvtap/tun/virtio_net drivers. Tested: $ netperf -H 7.7.8.84 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.8.84 () port 0 AF_INET Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.00 3959.52 $ netperf -H 7.7.8.84 -t TCP_SENDFILE TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.8.84 () port 0 AF_INET Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.00 3216.80 Performance of the SENDFILE is impacted by the extra allocation and copy, and because we use order-0 pages, while the TCP_STREAM uses bigger pages. Reported-by: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-nextDavid S. Miller4-0/+99
Marc Kleine-Budde says: ==================== this is a pull-request for net-next/master. There is are 9 patches by Fabio Baltieri and Kurt Van Dijck which add LED infrastructure and support for CAN devices. Bernd Krumboeck adds a driver for the USB CAN adapter from 8 devices. Oliver Hartkopp improves the CAN gateway functionality. There are 4 patches by me, which clean up the CAN's Kconfig. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28net: add RCU annotation to sk_dst_cache fieldCong Wang1-1/+1
sock->sk_dst_cache is protected by RCU. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28decnet: use correct RCU API to deref sk_dst_cache fieldCong Wang1-1/+1
sock->sk_dst_cache is protected by RCU, therefore we should use __sk_dst_get() to deref it once we lock the sock. This fixes several sparse warnings. Cc: linux-decnet-user@lists.sourceforge.net Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28gro: Fix kcalloc argument orderJoe Perches1-2/+2
First number, then size. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-27Merge branch 'master' of git://1984.lsi.us.es/nf-nextDavid S. Miller18-28/+179
Pablo Neira Ayuso says: ==================== This batch contains netfilter updates for you net-next tree, they are: * The new connlabel extension for x_tables, that allows us to attach labels to each conntrack flow. The kernel implementation uses a bitmask and there's a file in user-space that maps the bits with the corresponding string for each existing label. By now, you can attach up to 128 overlapping labels. From Florian Westphal. * A new round of improvements for the netns support for conntrack. Gao feng has moved many of the initialization code of each module of the netns init path. He also made several code refactoring, that code looks cleaner to me now. * Added documentation for all possible tweaks for nf_conntrack via sysctl, from Jiri Pirko. * Cisco 7941/7945 IP phone support for our SIP conntrack helper, from Kevin Cernekee. * Missing header file in the snmp helper, from Stephen Hemminger. * Finally, a couple of fixes to resolve minor issues with these changes, from myself. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-27mfd: da9052/53 lockup fixAshish Jangam2-4/+65
An issue has been reported where the PMIC either locks up or fails to respond following a system Reset. This could result in a second write in which the bus writes the current content of the write buffer to address of the last I2C access. The failure case is where this unwanted write transfers incorrect data to a critical register. This patch fixes this issue to by following any read or write with a dummy read to a safe register address. A safe register address is one where the contents will not affect the operation of the system. Signed-off-by: Ashish Jangam <ashish.jangam@kpitcummins.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-01-27mfd: rtsx: Add clock divider hookWei WANG2-0/+4
Add callback function conv_clk_and_div_n to convert between SSC clock and its divider N. For rtl8411, the formula to calculate SSC clock divider N is different with the other card reader models. Signed-off-by: Wei WANG <wei_wang@realsil.com.cn> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-01-27mfd: rtsx: Add output voltage switch hookWei WANG1-4/+20
Different card reader has different method to switch output voltage, add this callback to let the card reader implement its individual switch function. This is needed as rtl8411 has a specific switch output voltage procedure. Signed-off-by: Wei WANG <wei_wang@realsil.com.cn> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-01-26can: gw: indicate and count deleted frames due to misconfigurationOliver Hartkopp1-0/+1
Add a statistic counter to detect deleted frames due to misconfiguration with a new read-only CGW_DELETED netlink attribute for the CAN gateway. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2013-01-26can: gw: make routing to the incoming CAN interface configurableOliver Hartkopp1-0/+1
Introduce new configuration flag CGW_FLAGS_CAN_IIF_TX_OK to configure if a CAN sk_buff that has been routed with can-gw is allowed to be send back to the originating CAN interface. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2013-01-26can: add private data space for CAN sk_buffsOliver Hartkopp1-0/+35
The struct can_skb_priv is used to transport additional information along with the stored struct can(fd)_frame that can not be contained in existing struct sk_buff elements. can_skb_priv is located in the skb headroom, which does not touch the existing CAN sk_buff usage with skb->data and skb->len, so that even out-of-tree CAN drivers can be used without changes. Btw. out-of-tree CAN drivers without can_skb_priv in the sk_buff headroom would not support features based on can_skb_priv. The can_skb_priv->ifindex contains the first interface where the CAN frame appeared on the local host. Unfortunately skb->skb_iif can not be used as this value is overwritten in every netif_receive_skb() call. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2013-01-26can: rename LED trigger name on netdev renamesKurt Van Dijck1-0/+9
The LED trigger name for CAN devices is based on the initial CAN device name, but does never change. The LED trigger name is not guaranteed to be unique in case of hotplugging CAN devices. This patch tries to address this problem by modifying the LED trigger name according to the CAN device name when the latter changes. v1 - Kurt Van Dijck v2 - Fabio Baltieri - remove rename blocking if trigger is bound - use led-subsystem function for the actual rename (still WiP) - call init/exit functions from dev.c v3 - Kurt Van Dijck - safe operation for non-candev based devices (vcan, slcan) based on earlier patch v4 - Kurt Van Dijck - trivial patch mistakes fixed Signed-off-by: Kurt Van Dijck <kurt.van.dijck@eia.be> Signed-off-by: Fabio Baltieri <fabio.baltieri@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2013-01-26can: export a safe netdev_priv wrapper for candevKurt Van Dijck1-0/+3
In net_device notifier calls, it was impossible to determine if a CAN device is based on candev in a safe way. This patch adds such test in order to access candev storage from within those notifiers. Signed-off-by: Kurt Van Dijck <kurt.van.dijck@eia.be> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Fabio Baltieri <fabio.baltieri@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2013-01-26can: add tx/rx LED trigger supportFabio Baltieri2-0/+50
This patch implements the functions to add two LED triggers, named <ifname>-tx and <ifname>-rx, to a canbus device driver. Triggers are called from specific handlers by each CAN device driver and can be disabled altogether with a Kconfig option. The implementation keeps the LED on when the interface is UP and blinks the LED on network activity at a configurable rate. This only supports can-dev based drivers, as it uses some support field in the can_priv structure. Supported drivers should call devm_can_led_init() and can_led_event() as needed. Cleanup is handled automatically by devres, so no *_exit function is needed. Supported events are: - CAN_LED_EVENT_OPEN: turn on tx/rx LEDs - CAN_LED_EVENT_STOP: turn off tx/rx LEDs - CAN_LED_EVENT_TX: trigger tx LED blink - CAN_LED_EVENT_RX: trigger tx LED blink Cc: Wolfgang Grandegger <wg@grandegger.com> Cc: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Fabio Baltieri <fabio.baltieri@gmail.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2013-01-25Merge tag 'fixes-for-linus2' of ↵Linus Torvalds1-0/+41
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "Here's a long-pending fixes pull request for arm-soc (I didn't send one in the -rc4 cycle). The larger deltas are from: - A fixup of error paths in the mvsdio driver - Header file move for a driver that hadn't been properly converted to multiplatform on i.MX, which was causing build failures when included - Device tree updates for at91 dealing mostly with their new pinctrl setup merged in 3.8 and mistakes in those initial configs The rest are the normal mix of small fixes all over the place; sunxi, omap, imx, mvebu, etc, etc." * tag 'fixes-for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (40 commits) mfd: vexpress-sysreg: Don't skip initialization on probe ARM: vexpress: Enable A7 cores in V2P-CA15_A7's Device Tree ARM: vexpress: extend the MPIDR range used for pen release check ARM: at91/dts: correct comment in at91sam9x5.dtsi for mii ARM: at91/at91_dt_defconfig: add at91sam9n12 SoC to DT defconfig ARM: at91/at91_dt_defconfig: remove memory specification to cmdline ARM: at91/dts: add macb mii pinctrl config for kizbox ARM: at91: rm9200: remake the BGA as default version ARM: at91: fix gpios on i2c-gpio for RM9200 DT ARM: at91/at91sam9x5 DTS: add SCK USART pins ARM: at91/at91sam9x5 DTS: correct wrong PIO BANK values on u(s)arts ARM: at91/at91-pinctrl documentation: fix typo and add some details ARM: kirkwood: fix missing #interrupt-cells property mmc: mvsdio: use devm_ API to simplify/correct error paths. clk: mvebu/clk-cpu.c: fix memory leakage ARM: OMAP2+: omap4-panda: add UART2 muxing for WiLink shared transport ARM: OMAP2+: DT node Timer iteration fix ARM: OMAP2+: Fix section warning for omap_init_ocp2scp() ARM: OMAP2+: fix build break for omapdrm ARM: OMAP2: Fix missing omap2xxx_clkt_vps_late_init function calls ...
2013-01-24Merge branch 'for-linus' of ↵Linus Torvalds1-0/+16
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k fixes from Geert Uytterhoeven: "The asm-generic changeset has been ack'ed by Arnd." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: Wire up finit_module asm-generic/dma-mapping-broken.h: Provide dma_alloc_attrs()/dma_free_attrs() m68k: Provide dma_alloc_attrs()/dma_free_attrs()
2013-01-23Merge branch 'testing' of ↵David S. Miller2-4/+1
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== 1) Add a statistic counter for invalid output states and remove a superfluous state valid check, from Li RongQing. 2) Probe for asynchronous block ciphers instead of synchronous block ciphers to make the asynchronous variants available even if no synchronous block ciphers are found, from Jussi Kivilinna. 3) Make rfc3686 asynchronous block cipher and make use of the new asynchronous variant, from Jussi Kivilinna. 4) Replace some rwlocks by rcu, from Cong Wang. 5) Remove some unused defines. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-23soreuseport: TCP/IPv6 implementationTom Herbert2-1/+5
Motivation for soreuseport would be something like a web server binding to port 80 running with multiple threads, where each thread might have it's own listener socket. This could be done as an alternative to other models: 1) have one listener thread which dispatches completed connections to workers. 2) accept on a single listener socket from multiple threads. In case #1 the listener thread can easily become the bottleneck with high connection turn-over rate. In case #2, the proportion of connections accepted per thread tends to be uneven under high connection load (assuming simple event loop: while (1) { accept(); process() }, wakeup does not promote fairness among the sockets. We have seen the disproportion to be as high as 3:1 ratio between thread accepting most connections and the one accepting the fewest. With so_reusport the distribution is uniform. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-23soreuseport: TCP/IPv4 implementationTom Herbert2-3/+11
Allow multiple listener sockets to bind to the same port. Motivation for soresuseport would be something like a web server binding to port 80 running with multiple threads, where each thread might have it's own listener socket. This could be done as an alternative to other models: 1) have one listener thread which dispatches completed connections to workers. 2) accept on a single listener socket from multiple threads. In case #1 the listener thread can easily become the bottleneck with high connection turn-over rate. In case #2, the proportion of connections accepted per thread tends to be uneven under high connection load (assuming simple event loop: while (1) { accept(); process() }, wakeup does not promote fairness among the sockets. We have seen the disproportion to be as high as 3:1 ratio between thread accepting most connections and the one accepting the fewest. With so_reusport the distribution is uniform. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-23soreuseport: infrastructureTom Herbert3-3/+11
Definitions and macros for implementing soreusport. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-23netfilter: nf_conntrack: refactor l4proto support for netnsGao feng1-3/+7
Move the code that register/unregister l4proto to the module_init/exit context. Given that we have to modify some interfaces to accomodate these changes, it is a good time to use shorter function names for this using the nf_ct_* prefix instead of nf_conntrack_*, that is: nf_ct_l4proto_register nf_ct_l4proto_pernet_register nf_ct_l4proto_unregister nf_ct_l4proto_pernet_unregister We same many line breaks with it. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>