summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2009-02-06tun: Limit amount of queued packets per deviceHerbert Xu3-53/+118
Unlike a normal socket path, the tuntap device send path does not have any accounting. This means that the user-space sender may be able to pin down arbitrary amounts of kernel memory by continuing to send data to an end-point that is congested. Even when this isn't an issue because of limited queueing at most end points, this can also be a problem because its only response to congestion is packet loss. That is, when those local queues at the end-point fills up, the tuntap device will start wasting system time because it will continue to send data there which simply gets dropped straight away. Of course one could argue that everybody should do congestion control end-to-end, unfortunately there are people in this world still hooked on UDP, and they don't appear to be going away anywhere fast. In fact, we've always helped them by performing accounting in our UDP code, the sole purpose of which is to provide congestion feedback other than through packet loss. This patch attempts to apply the same bandaid to the tuntap device. It creates a pseudo-socket object which is used to account our packets just as a normal socket does for UDP. Of course things are a little complex because we're actually reinjecting traffic back into the stack rather than out of the stack. The stack complexities however should have been resolved by preceding patches. So this one can simply start using skb_set_owner_w. For now the accounting is essentially disabled by default for backwards compatibility. In particular, we set the cap to INT_MAX. This is so that existing applications don't get confused by the sudden arrival EAGAIN errors. In future we may wish (or be forced to) do this by default. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05net: Reexport sock_alloc_send_pskbHerbert Xu2-4/+9
The function sock_alloc_send_pskb is completely useless if not exported since most of the code in it won't be used as is. In fact, this code has already been duplicated in the tun driver. Now that we need accounting in the tun driver, we can in fact use this function as is. So this patch marks it for export again. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05net: Partially allow skb destructors to be used on receive pathHerbert Xu2-0/+3
As it currently stands, skb destructors are forbidden on the receive path because the protocol end-points will overwrite any existing destructor with their own. This is the reason why we have to call skb_orphan in the loopback driver before we reinject the packet back into the stack, thus creating a period during which loopback traffic isn't charged to any socket. With virtualisation, we have a similar problem in that traffic is reinjected into the stack without being associated with any socket entity, thus providing no natural congestion push-back for those poor folks still stuck with UDP. Now had we been consistent in telling them that UDP simply has no congestion feedback, I could just fob them off. Unfortunately, we appear to have gone to some length in catering for this on the standard UDP path, with skb/socket accounting so that has created a very unhealthy dependency. Alas habits are difficult to break out of, so we may just have to allow skb destructors on the receive path. It turns out that making skb destructors useable on the receive path isn't as easy as it seems. For instance, simply adding skb_orphan to skb_set_owner_r isn't enough. This is because we assume all over the IP stack that skb->sk is an IP socket if present. The new transparent proxy code goes one step further and assumes that skb->sk is the receiving socket if present. Now all of this can be dealt with by adding simple checks such as only treating skb->sk as an IP socket if skb->sk->sk_family matches. However, it turns out that for bridging at least we don't need to do all of this work. This is of interest because most virtualisation setups use bridging so we don't actually go through the IP stack on the host (with the exception of our old nemesis the bridge netfilter, but that's easily taken care of). So this patch simply adds skb_orphan to the point just before we enter the IP stack, but after we've gone through the bridge on the receive path. It also adds an skb_orphan to the one place in netfilter that touches skb->sk/skb->destructor, that is, tproxy. One word of caution, because of the internal code structure, anyone wishing to deploy this must use skb_set_owner_w as opposed to skb_set_owner_r since many functions that create a new skb from an existing one will invoke skb_set_owner_w on the new skb. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05Merge branch 'master' of ↵David S. Miller3-3/+5
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2009-02-05Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller148-1791/+2431
2009-02-05gianfar: Fix stashing supportAndy Fleming3-3/+38
Stashing is only supported on the 85xx (e500-based) SoCs. The 83xx and 86xx chips don't have a proper cache for this. U-Boot has been updated to add stashing properties to the device tree nodes of gianfar devices on 85xx. So now we modify Linux to keep stashing off unless those properties are there. Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05gianfar: Add support for skb recyclingAndy Fleming2-4/+21
Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05netdev: Merge UCC and gianfar MDIO bus driversAndy Fleming12-490/+549
The MDIO bus drivers for the UCC and gianfar ethernet controllers are essentially the same. There's no reason to duplicate that much code. Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05gianfar: Fix potential soft reset raceAndy Fleming1-0/+3
SOFT_RESET must be asserted for at least 3 TX clocks in order for it to work properly. The syncs in the gfar_write() commands have been hiding this, but we need to guarantee it. Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05gianfar: Fix BD_LENGTH_MASK definitionAndy Fleming1-1/+1
BD_LENGTH_MASK is supposed to catch the low 16-bits of the status field, not the low byte. The old way, we would never be able to clean up tx packets with sizes divisible by 256. Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05virtio_net: Allow setting the MAC address of the NICAlex Williamson1-2/+21
Many physical NICs let the OS re-program the "hardware" MAC address. Virtual NICs should allow this too. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Acked-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05virtio_net: Add support for VLAN filtering in the hypervisorAlex Williamson2-1/+44
VLAN filtering allows the hypervisor to drop packets from VLANs that we're not a part of, further reducing the number of extraneous packets recieved. This makes use of the VLAN virtqueue command class. The CTRL_VLAN feature bit tells us whether the backend supports VLAN filtering. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05virtio_net: Add a MAC filter tableAlex Williamson2-8/+70
Make use of the MAC control virtqueue class to support a MAC filter table. The filter table is managed by the hypervisor. We consider the table to be available if the CTRL_RX feature bit is set. We leave it to the hypervisor to manage the table and enable promiscuous or all-multi mode as necessary depending on the resources available to it. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05virtio_net: Add a set_rx_mode interfaceAlex Williamson2-1/+44
Make use of the RX_MODE control virtqueue class to enable the set_rx_mode netdev interface. This allows us to selectively enable/disable promiscuous and allmulti mode so we don't see packets we don't want. For now, we automatically enable these as needed if additional unicast or multicast addresses are requested. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05virtio_net: Add a virtqueue for outbound control commandsAlex Williamson2-3/+83
This will be used for RX mode, MAC filter table, VLAN filtering, etc... The control transaction consists of one or more "out" sg entries and one or more "in" sg entries. The first out entry contains a header defining the class and command. Additional out entries may provide data for the command. The last in entry provides a status response back from the command. Virtqueues typically run asynchronous, running a callback function when there's data in the channel. We can't readily make use of this in the command paths where we need to use this. Instead, we kick the virtqueue and spin. The kick causes an I/O write, triggering an immediate trap into the hypervisor. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05cxgb3: Fix lro switchDivy Le Ray1-2/+1
The LRO switch is always set to 1 in the rx processing loop. It breaks the accelerated iSCSI receive traffic. Fix its computation. Signed-off-by: Divy Le Ray <divy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-04Merge branch 'for-linus' of ↵Linus Torvalds59-867/+985
git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: (40 commits) Blackfin arch: Remove outdated code Blackfin arch: Fix udelay implementation Blackfin arch: Update Copyright information Blackfin arch: Add BF561 PPI POLS, POLC Masks Blackfin arch: Update CM-BF527 kernel config Blackfin arch: define bfin_memmap as static since it is only used here Blackfin arch: cplb mananger: use a do...while loop rather than a for loop Blackfin arch: fix bug - traps test case 19 for exception 0x2d fails Blackfin arch: add platform device bfin_mii-bus and KSZ8893M switch driver platform resources to board files Blackfin arch: build jtag tty driver as a module by default Blackfin arch: fix 2 bugs related to debug Blackfin arch: Add ANOMALY_05000380 to BF54x to kill the compile warning Blackfin arch: Fix bug - 561 SMP kernel can't boot from jffs2 Blackfin arch: base SIC_IWR# programming on whether the MMR exists Blackfin arch: read SYSCR on newer parts that mirror the bits of SWRST in it Blackfin arch: fixup board init function name Blackfin arch: drop CONFIG_I2C_BOARDINFO ifdefs Blackfin arch: bfin_reset->_bfin_reset redirection no longer needed Blackfin arch: sync reboot handler with version in u-boot Blackfin arch: Faster Implementation of csum_tcpudp_nofold() ...
2009-02-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds17-284/+548
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc64: Kill bogus TPC/address truncation during 32-bit faults. sparc: fixup for sparseirq changes sparc64: Validate kernel generated fault addresses on sparc64. sparc64: On non-Niagara, need to touch NMI watchdog in NOHZ mode. sparc64: Implement NMI watchdog on capable cpus. sparc: Probe PMU type and record in sparc_pmu_type. sparc64: Move generic PCR support code to seperate file.
2009-02-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds21-57/+93
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: sunrpc: fix rdma dependencies e1000: Fix PCI enable to honor the need_ioport flag sgi-xp: link XPNET's net_device_ops to its net_device structure pcnet_cs: Fix misuse of the equality operator. hso: add new device id's dca: redesign locks to fix deadlocks cassini/sungem: limit reaches -1, but 0 tested net: variables reach -1, but 0 tested qlge: bugfix: Add missing netif_napi_del call. qlge: bugfix: Add flash offset for second port. qlge: bugfix: Fix endian issue when reading flash. udp: increments sk_drops in __udp_queue_rcv_skb() net: Fix userland breakage wrt. linux/if_tunnel.h net: packet socket packet_lookup_frame fix
2009-02-04Merge branch 'for-linus' of git://git.o-hand.com/linux-mfdLinus Torvalds1-1/+0
* 'for-linus' of git://git.o-hand.com/linux-mfd: mfd: Remove non exported references from pcf50633
2009-02-04Blackfin arch: Remove outdated codeMichael Hennerich1-23/+1
The removed version with the loop registers saved on the stack was originally intended to workaround the missing toolchain support for LoopReg Clobbers. Since our toolchain now supports these there is no point in keeping this workaround. And since we don't touch LoopRegs anymore we're no longer subject for ANOMALY_05000312. Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: Fix udelay implementationMichael Hennerich1-6/+5
Avoid possible overflow during 32*32->32 multiplies. Reported-by: Marco Reppenhagen <marco.reppenhagen@auerswald.de> Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: Update Copyright informationMichael Hennerich1-1/+1
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: Add BF561 PPI POLS, POLC MasksMichael Hennerich1-0/+2
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: Update CM-BF527 kernel configMichael Hennerich1-124/+321
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: define bfin_memmap as static since it is only used hereMike Frysinger1-1/+1
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: cplb mananger: use a do...while loop rather than a for loopMike Frysinger1-4/+8
use a do...while loop rather than a for loop to get slightly better optimization and to avoid gcc "may be used uninitialized" warnings ... we know that the [id]cplb_nr_bounds variables will never be 0, so this is OK Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: fix bug - traps test case 19 for exception 0x2d failsBernd Schmidt1-3/+1
Enable null pointer checking for ICPLBs. The code was there but for some reason I had commented it out at some stage during development. Should restrict this to 1K since atomic ops start there. Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: add platform device bfin_mii-bus and KSZ8893M switch driver ↵Graf Yang10-0/+105
platform resources to board files Signed-off-by: Graf Yang <graf.yang@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: build jtag tty driver as a module by defaultMike Frysinger9-9/+9
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: fix 2 bugs related to debugJie Zhang1-8/+1
- unable to single step over emuexcpt instruction - gdbproxy goes into infinite loop when doing gdb does "next" over "emuexcpt" Don't decrement PC after software breakpoint. Signed-off-by: Jie Zhang <jie.zhang@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: Add ANOMALY_05000380 to BF54x to kill the compile warningBryan Wu1-0/+1
Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: Fix bug - 561 SMP kernel can't boot from jffs2Graf Yang1-42/+42
bss_l2 section is garbage when the data in this section is used by _bfin_relocate_l1_mem, so move the zero out function ahead. Signed-off-by: Graf Yang <graf.yang@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: base SIC_IWR# programming on whether the MMR existsMike Frysinger2-14/+8
base SIC_IWR# programming on whether the MMR exists rather than having to maintain another list of processors Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: read SYSCR on newer parts that mirror the bits of SWRST in itMike Frysinger1-0/+6
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: fixup board init function nameMike Frysinger6-12/+12
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: drop CONFIG_I2C_BOARDINFO ifdefsMike Frysinger8-42/+0
Drop CONFIG_I2C_BOARDINFO ifdefs as the common i2c header handles this already by stubbing things out Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: bfin_reset->_bfin_reset redirection no longer neededMike Frysinger1-7/+1
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: sync reboot handler with version in u-bootMike Frysinger1-11/+15
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: Faster Implementation of csum_tcpudp_nofold()Michael Hennerich1-17/+17
Avoid conditional branch instructions during carry bit additions. Special thanks to Bernd. Simplify: Use ((len + proto) << 8) like every other __LITTLE_ENDIAN__ machine Cc: Bernd Schmidt <bernds_cb1@t-online.de> Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: Fix bug - BF518 port F, G, and H have different mux offset ↵Graf Yang1-5/+12
compare to BF527 [Mike Frysinger <vapier.adi@gmail.com>: keep the ifdef nest down] Signed-off-by: Graf Yang <graf.yang@analog.com> Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: Add in cflag to support mlong-calls for kgdb_testGrace Pan1-0/+2
Signed-off-by: Grace Pan <grace.pan@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: Fix bug - Run "reboot" hangs bf518-ezbrdSonic Zhang16-15/+19
[Mike Frysinger <vapier.adi@gmail.com>: - setup P_DEFAULT_BOOT_SPI_CS for every arch based on the default bootrom behavior and convert all our boards to it - revert previous anomaly change ... bf51x is not affected by anomaly 05000353] Signed-off-by: Sonic Zhang <sonic.zhang@analog.com> Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04MAINTIANERS: Blackfin: remove subscribers-only markingMike Frysinger1-1/+1
remove subscribers-only marking as the list is automatically & silently moderated for people Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: Add ability to count and display number of NMI interruptsRobin Getz2-1/+8
Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: Add a few more instructions that can cause the trace buffer ↵Robin Getz1-0/+12
to be discontiguous Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: Fix URLRobin Getz1-1/+1
Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: cleanup bf54x ifdef mess in gpio codeMike Frysinger3-409/+208
merge more of the bf54x and !bf54x gpio code together to cut down on #ifdef mess Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: Add one more check on `fp' to prevent double faultJie Zhang1-5/+3
Signed-off-by: Jie Zhang <jie.zhang@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
2009-02-04Blackfin arch: explicit add a might sleep to gpio_freeUwe Kleine-Koenig1-0/+2
According to the documentation gpio_free should only be called from task context only. To make this more explicit add a might sleep to all implementations. This patch changes the gpio_free implementations for the blackfin architecture. Signed-off-by: Uwe Kleine-Koenig <ukleinek@strlen.de> Cc: David Brownell <david-b@pacbell.net> Acked-by: Bryan Wu <cooloney@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>