summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2011-01-21net_sched: accurate bytes/packets stats/ratesEric Dumazet1-3/+5
In commit 44b8288308ac9d (net_sched: pfifo_head_drop problem), we fixed a problem with pfifo_head drops that incorrectly decreased sch->bstats.bytes and sch->bstats.packets Several qdiscs (CHOKe, SFQ, pfifo_head, ...) are able to drop a previously enqueued packet, and bstats cannot be changed, so bstats/rates are not accurate (over estimated) This patch changes the qdisc_bstats updates to be done at dequeue() time instead of enqueue() time. bstats counters no longer account for dropped frames, and rates are more correct, since enqueue() bursts dont have effect on dequeue() rate. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-20sctp: user perfect name for Delayed SACK Timer optionShan Wei1-0/+1
The option name of Delayed SACK Timer should be SCTP_DELAYED_SACK, not SCTP_DELAYED_ACK. Left SCTP_DELAYED_ACK be concomitant with SCTP_DELAYED_SACK, for making compatibility with existing applications. Reference: 8.1.19. Get or Set Delayed SACK Timer (SCTP_DELAYED_SACK) (http://tools.ietf.org/html/draft-ietf-tsvwg-sctpsocket-25) Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Acked-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-18Merge branch 'master' of ↵David S. Miller1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2011-01-14etherdevice.h: Add is_unicast_ether_addr functionTobias Klauser1-0/+11
From a check for !is_multicast_ether_addr it is not always obvious that we're checking for a unicast address. So add this helper function to make those code paths easier to read. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-14ipsec: update MAX_AH_AUTH_LEN to support sha512Nicolas Dichtel1-1/+1
icv_truncbits is set to 256 for sha512, so update MAX_AH_AUTH_LEN to 64. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-14net: remove dev_txq_stats_fold()Eric Dumazet1-5/+0
After recent changes, (percpu stats on vlan/tunnels...), we dont need anymore per struct netdev_queue tx_bytes/tx_packets/tx_dropped counters. Only remaining users are ixgbe, sch_teql, gianfar & macvlan : 1) ixgbe can be converted to use existing tx_ring counters. 2) macvlan incremented txq->tx_dropped, it can use the dev->stats.tx_dropped counter. 3) sch_teql : almost revert ab35cd4b8f42 (Use net_device internal stats) Now we have ndo_get_stats64(), use it, even for "unsigned long" fields (No need to bring back a struct net_device_stats) 4) gianfar adds a stats structure per tx queue to hold tx_bytes/tx_packets This removes a lockdep warning (and possible lockup) in rndis gadget, calling dev_get_stats() from hard IRQ context. Ref: http://www.spinics.net/lists/netdev/msg149202.html Reported-by: Neil Jones <neiljay@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Jarek Poplawski <jarkao2@gmail.com> CC: Alexander Duyck <alexander.h.duyck@intel.com> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Sandeep Gopalpet <sandeep.kumar@freescale.com> CC: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-13ieee80211: correct IEEE80211_ADDBA_PARAM_BUF_SIZE_MASK macroAmitkumar Karwar1-1/+1
It is defined in include/linux/ieee80211.h. As per IEEE spec. bit6 to bit15 in block ack parameter represents buffer size. So the bitmask should be 0xFFC0. Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Bing Zhao <bzhao@marvell.com> Cc: stable@kernel.org Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-01-13sched: remove unused backlog in RED statsstephen hemminger1-1/+0
The RED statistics structure includes backlog field which is not set or used by any code. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-13Merge branch 'master' of git://1984.lsi.us.es/net-2.6David S. Miller3-10/+25
2011-01-13Merge branch 'master' of ↵David S. Miller3-5/+31
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2011-01-12netfilter: fix compilation when conntrack is disabled but tproxy is enabledKOVACS Krisztian3-10/+25
The IPv6 tproxy patches split IPv6 defragmentation off of conntrack, but failed to update the #ifdef stanzas guarding the defragmentation related fields and code in skbuff and conntrack related code in nf_defrag_ipv6.c. This patch adds the required #ifdefs so that IPv6 tproxy can truly be used without connection tracking. Original report: http://marc.info/?l=linux-netdev&m=129010118516341&w=2 Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-01-12Merge branch 'master' of git://1984.lsi.us.es/net-2.6David S. Miller2-6/+6
2011-01-12netfilter: ebtables: make broute table work againFlorian Westphal1-1/+1
broute table init hook sets up the "br_should_route_hook" pointer, which then gets called from br_input. commit a386f99025f13b32502fe5dedf223c20d7283826 (bridge: add proper RCU annotation to should_route_hook) introduced a typedef, and then changed this to: br_should_route_hook_t *rhook; [..] rhook = rcu_dereference(br_should_route_hook); if (*rhook(skb)) problem is that "br_should_route_hook" contains the address of the function, so calling *rhook() results in kernel panic. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-01-12ah: update maximum truncated ICV lengthNicolas Dichtel1-1/+1
For SHA256, RFC4868 requires to truncate ICV length to 128 bits, hence MAX_AH_AUTH_LEN should be updated to 16. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-11arp: allow to invalidate specific ARP entriesMaxim Levitsky1-0/+1
IPv4 over firewire needs to be able to remove ARP entries from the ARP cache that belong to nodes that are removed, because IPv4 over firewire uses ARP packets for private information about nodes. This information becomes invalid as soon as node drops off the bus and when it reconnects, its only possible to start talking to it after it responded to an ARP packet. But ARP cache prevents such packets from being sent. Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-11net_sched: factorize qdisc stats handlingEric Dumazet1-6/+14
HTB takes into account skb is segmented in stats updates. Generalize this to all schedulers. They should use qdisc_bstats_update() helper instead of manipulating bstats.bytes and bstats.packets Add bstats_update() helper too for classes that use gnet_stats_basic_packed fields. Note : Right now, TCQ_F_CAN_BYPASS shortcurt can be taken only if no stab is setup on qdisc. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-11net: Add alloc_netdev_mqs functionTom Herbert2-4/+10
Added alloc_netdev_mqs function which allows the number of transmit and receive queues to be specified independenty. alloc_netdev_mq was changed to a macro to call the new function. Also added alloc_etherdev_mqs with same purpose. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-11phonet: some signedness bugsDan Carpenter1-2/+2
Dan Rosenberg pointed out that there were some signed comparison bugs in the phonet protocol. http://marc.info/?l=full-disclosure&m=129424528425330&w=2 The problem is that we check for array overflows but "protocol" is signed and we don't check for array underflows. If you have already have CAP_SYS_ADMIN then you could use the bugs to get root, or someone could cause an oops by mistake. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-11netdev: bfin_mac: let boards set vlan masksMike Frysinger1-0/+1
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10nl80211: add/fix mesh docsJohannes Berg1-5/+15
Some mesh attribute/command docs are missing or have errors in the name so they don't match, fix all of them. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-01-10cfg80211: add mesh join/leave callback docsJohannes Berg1-0/+2
When I made the patch to add mesh join/leave I didn't pay attention to docs because it was a proof of concept, and then when we actually did merge it I forgot -- add docs now. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-01-10mac80211: add missing docs for off-chan TX flagJohannes Berg1-0/+4
The flag is IEEE80211_TX_CTL_TX_OFFCHAN and I had added that in a previous patch but forgotten docs. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-01-10mac80211: add remain-on-channel docsJohannes Berg1-0/+10
Add documentation for the new callbacks that I forgot in the patch adding the callbacks. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-01-10netfilter: x_tables: dont block BH while reading countersEric Dumazet1-5/+5
Using "iptables -L" with a lot of rules have a too big BH latency. Jesper mentioned ~6 ms and worried of frame drops. Switch to a per_cpu seqlock scheme, so that taking a snapshot of counters doesnt need to block BH (for this cpu, but also other cpus). This adds two increments on seqlock sequence per ipt_do_table() call, its a reasonable cost for allowing "iptables -L" not block BH processing. Reported-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McHardy <kaber@trash.net> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-01-10net offloading: Pass features into netif_needs_gso().Jesse Gross1-9/+3
Now that there is a single function that can compute the device features relevant to a packet, we don't want to run it for each offload. This converts netif_needs_gso() to take the features of the device, rather than computing them itself. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10net offloading: Generalize netif_get_vlan_features().Jesse Gross1-2/+2
netif_get_vlan_features() is currently only used by netif_needs_gso(), so it only concerns itself with GSO features. However, several other places also should take into account the contents of the packet when deciding whether to offload to hardware. This generalizes the function to return features about all of the various forms of offloading. Since offloads tend to be linked together, this avoids duplicating the logic in each location (i.e. the scatter/gather code also needs the checksum logic). Suggested-by: Michał Mirosław <mirqus@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10net/sock.h: make some fields private to fix kernel-doc warning(s)Randy Dunlap1-0/+4
Fix new kernel-doc notation warning in sock.h by annotating skc_dontcopy_* as private fields. Warning(include/net/sock.h:163): No description found for parameter 'skc_dontcopy_end[0]' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-10net/fec: add mac field into platform data and consolidate fec_get_macShawn Guo1-0/+3
Add mac field into fec_platform_data and consolidate function fec_get_mac to get mac address in following order. 1) module parameter via kernel command line fec.macaddr=0x00,0x04,... 2) from flash in case of CONFIG_M5272 or fec_platform_data mac field for others, which typically have mac stored in fuse 3) fec mac address registers set by bootloader Signed-off-by: Shawn Guo <shawn.guo@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-08Merge branch 'next-spi' of git://git.secretlab.ca/git/linux-2.6Linus Torvalds3-5/+380
* 'next-spi' of git://git.secretlab.ca/git/linux-2.6: (77 commits) spi/omap: Fix DMA API usage in OMAP MCSPI driver spi/imx: correct the test on platform_get_irq() return value spi/topcliff: Typo fix threhold to threshold spi/dw_spi Typo change diable to disable. spi/fsl_espi: change the read behaviour of the SPIRF spi/mpc52xx-psc-spi: move probe/remove to proper sections spi/dw_spi: add DMA support spi/dw_spi: change to EXPORT_SYMBOL_GPL for exported APIs spi/dw_spi: Fix too short timeout in spi polling loop spi/pl022: convert running variable spi/pl022: convert busy flag to a bool spi/pl022: pass the returned sglen to the DMA engine spi/pl022: map the buffers on the DMA engine spi/topcliff_pch: Fix data transfer issue spi/imx: remove autodetection spi/pxa2xx: pass of_node to spi device and set a parent device spi/pxa2xx: Modify RX-Tresh instead of busy-loop for the remaining RX bytes. spi/pxa2xx: Add chipselect support for Sodaville spi/pxa2xx: Consider CE4100's FIFO depth spi/pxa2xx: Add CE4100 support ...
2011-01-08Merge branch 'for-2.6.38' of ↵Linus Torvalds6-22/+222
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (30 commits) gameport: use this_cpu_read instead of lookup x86: udelay: Use this_cpu_read to avoid address calculation x86: Use this_cpu_inc_return for nmi counter x86: Replace uses of current_cpu_data with this_cpu ops x86: Use this_cpu_ops to optimize code vmstat: User per cpu atomics to avoid interrupt disable / enable irq_work: Use per cpu atomics instead of regular atomics cpuops: Use cmpxchg for xchg to avoid lock semantics x86: this_cpu_cmpxchg and this_cpu_xchg operations percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support percpu,x86: relocate this_cpu_add_return() and friends connector: Use this_cpu operations xen: Use this_cpu_inc_return taskstats: Use this_cpu_ops random: Use this_cpu_inc_return fs: Use this_cpu_inc_return in buffer.c highmem: Use this_cpu_xx_return() operations vmstat: Use this_cpu_inc_return for vm statistics x86: Support for this_cpu_add, sub, dec, inc_return percpu: Generic support for this_cpu_add, sub, dec, inc_return ... Fixed up conflicts: in arch/x86/kernel/{apic/nmi.c, apic/x2apic_uv_x.c, process.c} as per Tejun.
2011-01-08Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds1-2/+2
* 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (33 commits) usb: don't use flush_scheduled_work() speedtch: don't abuse struct delayed_work media/video: don't use flush_scheduled_work() media/video: explicitly flush request_module work ioc4: use static work_struct for ioc4_load_modules() init: don't call flush_scheduled_work() from do_initcalls() s390: don't use flush_scheduled_work() rtc: don't use flush_scheduled_work() mmc: update workqueue usages mfd: update workqueue usages dvb: don't use flush_scheduled_work() leds-wm8350: don't use flush_scheduled_work() mISDN: don't use flush_scheduled_work() macintosh/ams: don't use flush_scheduled_work() vmwgfx: don't use flush_scheduled_work() tpm: don't use flush_scheduled_work() sonypi: don't use flush_scheduled_work() hvsi: don't use flush_scheduled_work() xen: don't use flush_scheduled_work() gdrom: don't use flush_scheduled_work() ... Fixed up trivial conflict in drivers/media/video/bt8xx/bttv-input.c as per Tejun.
2011-01-08Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Constify function scope static struct sched_param usage sched: Fix strncmp operation sched: Move sched_autogroup_exit() to free_signal_struct() sched: Fix struct autogroup memory leak sched: Mark autogroup_init() __init sched: Consolidate the name of root_task_group and init_task_group
2011-01-08Merge branch 'rmobile-latest' of ↵Linus Torvalds1-13/+47
git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * 'rmobile-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (67 commits) ARM: mach-shmobile: update for SMP changes. ARM: mach-shmobile: update for GIC changes. ARM: mach-shmobile: Fix up clkdev fallout for SH73A0. dma: shdma: don't register the global die notifier multiple times ARM: mach-shmobile: Rely on run-time IRQ handlers ARM: mach-shmobile: Run-time IRQ handler for GIC ARM: mach-shmobile: Run-time IRQ handler for INTCA ARM: mach-shmobile: Enable CONFIG_MULTI_IRQ_HANDLER ARM: mach-shmobile: Use shared GIC entry macros ARM: mach-shmobile: mackerel: Add zboot support ARM: mach-shmobile: mackerel: Add HDMI sound support ARM: mach-shmobile: mackerel: add HDMI video support ARM: mach-shmobile: ap4evb: fixup clk_put timing of fsib_clk ARM: mach-shmobile: sh73a0: fix div4 table ARM: mach-shmobile: ap4/mackerel: modify wrong comment out of USB ARM: mach-shmobile: Mackerel VGA camera support mmc: sh_mmcif: make DMA support by the driver unconditional ARM: mach-shmobile: Add eMMC support through MMCIF on AG5EVM ARM: mach-shmobile: Use pullups for AG5EVM KEYSC pins ARM: mach-shmobile: sh73a0 GPIO pullup improvement ...
2011-01-08Merge branch 'for-linus' of ↵Linus Torvalds4-17/+138
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (58 commits) Input: wacom_w8001 - support pen or touch only devices Input: wacom_w8001 - use __set_bit to set keybits Input: bu21013_ts - fix misuse of logical operation in place of bitop Input: i8042 - add Acer Aspire 5100 to the Dritek list Input: wacom - add support for digitizer in Lenovo W700 Input: psmouse - disable the synaptics extension on OLPC machines Input: psmouse - fix up Synaptics comment Input: synaptics - ignore bogus mt packet Input: synaptics - add multi-finger and semi-mt support Input: synaptics - report clickpad property input: mt: Document interface updates Input: fix double equality sign in uevent Input: introduce device properties hid: egalax: Add support for Wetab (726b) Input: include MT library as source for kerneldoc MAINTAINERS: Update input-mt entry hid: egalax: Add support for Samsung NB30 netbook hid: egalax: Document the new devices in Kconfig hid: egalax: Add support for Wetab hid: egalax: Convert to MT slots ... Fixed up trivial conflict in drivers/input/keyboard/Kconfig
2011-01-08Merge branch 'tty-next' of ↵Linus Torvalds5-6/+33
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6 * 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (36 commits) serial: apbuart: Fixup apbuart_console_init() TTY: Add tty ioctl to figure device node of the system console. tty: add 'active' sysfs attribute to tty0 and console device drivers: serial: apbuart: Handle OF failures gracefully Serial: Avoid unbalanced IRQ wake disable during resume tty: fix typos/errors in tty_driver.h comments pch_uart : fix warnings for 64bit compile 8250: fix uninitialized FIFOs ip2: fix compiler warning on ip2main_pci_tbl specialix: fix compiler warning on specialix_pci_tbl rocket: fix compiler warning on rocket_pci_ids 8250: add a UPIO_DWAPB32 for 32 bit accesses 8250: use container_of() instead of casting serial: omap-serial: Add support for kernel debugger serial: fix pch_uart kconfig & build drivers: char: hvc: add arm JTAG DCC console support RS485 documentation: add 16C950 UART description serial: ifx6x60: fix memory leak serial: ifx6x60: free IRQ on error Serial: EG20T: add PCH_UART driver ... Fixed up conflicts in drivers/serial/apbuart.c with evil merge that makes the code look fairly sane (unlike either side).
2011-01-08Merge branch 'usb-next' of ↵Linus Torvalds9-10/+246
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 * 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (144 commits) USB: add support for Dream Cheeky DL100B Webmail Notifier (1d34:0004) USB: serial: ftdi_sio: add support for TIOCSERGETLSR USB: ehci-mxc: Setup portsc register prior to accessing OTG viewport USB: atmel_usba_udc: fix freeing irq in usba_udc_remove() usb: ehci-omap: fix tll channel enable mask usb: ohci-omap3: fix trivial typo USB: gadget: ci13xxx: don't assume that PAGE_SIZE is 4096 USB: gadget: ci13xxx: fix complete() callback for no_interrupt rq's USB: gadget: update ci13xxx to work with g_ether USB: gadgets: ci13xxx: fix probing of compiled-in gadget drivers Revert "USB: musb: pm: don't rely fully on clock support" Revert "USB: musb: blackfin: pm: make it work" USB: uas: Use GFP_NOIO instead of GFP_KERNEL in I/O submission path USB: uas: Ensure we only bind to a UAS interface USB: uas: Rename sense pipe and sense urb to status pipe and status urb USB: uas: Use kzalloc instead of kmalloc USB: uas: Fix up the Sense IU usb: musb: core: kill unneeded #include's DA8xx: assign name to MUSB IRQ resource usb: gadget: g_ncm added ... Manually fix up trivial conflicts in USB Kconfig changes in: arch/arm/mach-omap2/Kconfig arch/sh/Kconfig drivers/usb/Kconfig drivers/usb/host/ehci-hcd.c and annoying chip clock data conflicts in: arch/arm/mach-omap2/clock3xxx_data.c arch/arm/mach-omap2/clock44xx_data.c
2011-01-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds6-7/+46
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (147 commits) [SCSI] arcmsr: fix write to device check [SCSI] lpfc: lower stack use in lpfc_fc_frame_check [SCSI] eliminate an unnecessary local variable from scsi_remove_target() [SCSI] libiscsi: use bh locking instead of irq with session lock [SCSI] libiscsi: do not take host lock in queuecommand [SCSI] be2iscsi: fix null ptr when accessing task hdr [SCSI] be2iscsi: fix gfp use in alloc_pdu [SCSI] libiscsi: add more informative failure message during iscsi scsi eh [SCSI] gdth: Add missing call to gdth_ioctl_free [SCSI] bfa: remove unused defintions and misc cleanups [SCSI] bfa: remove inactive functions [SCSI] bfa: replace bfa_assert with WARN_ON [SCSI] qla2xxx: Use sg_next to fetch next sg element while walking sg list. [SCSI] qla2xxx: Fix to avoid recursive lock failure during BSG timeout. [SCSI] qla2xxx: Remove code to not reset ISP82xx on failure. [SCSI] qla2xxx: Display mailbox register 4 during 8012 AEN for ISP82XX parts. [SCSI] qla2xxx: Don't perform a BIG_HAMMER if Get-ID (0x20) mailbox command fails on CNAs. [SCSI] qla2xxx: Remove redundant module parameter permission bits [SCSI] qla2xxx: Add sysfs node for displaying board temperature. [SCSI] qla2xxx: Code cleanup to remove unwanted comments and code. ...
2011-01-07Merge branch 'vfs-scale-working' of ↵Linus Torvalds20-203/+586
git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin * 'vfs-scale-working' of git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin: (57 commits) fs: scale mntget/mntput fs: rename vfsmount counter helpers fs: implement faster dentry memcmp fs: prefetch inode data in dcache lookup fs: improve scalability of pseudo filesystems fs: dcache per-inode inode alias locking fs: dcache per-bucket dcache hash locking bit_spinlock: add required includes kernel: add bl_list xfs: provide simple rcu-walk ACL implementation btrfs: provide simple rcu-walk ACL implementation ext2,3,4: provide simple rcu-walk ACL implementation fs: provide simple rcu-walk generic_check_acl implementation fs: provide rcu-walk aware permission i_ops fs: rcu-walk aware d_revalidate method fs: cache optimise dentry and inode for rcu-walk fs: dcache reduce branches in lookup path fs: dcache remove d_mounted fs: fs_struct use seqlock fs: rcu-walk for path lookup ...
2011-01-07sched: Consolidate the name of root_task_group and init_task_groupYong Zhang1-1/+1
root_task_group is the leftover of USER_SCHED, now it's always same to init_task_group. But as Mike suggested, root_task_group is maybe the suitable name to keep for a tree. So in this patch: init_task_group --> root_task_group init_task_group_load --> root_task_group_load INIT_TASK_GROUP_LOAD --> ROOT_TASK_GROUP_LOAD Suggested-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20110107071736.GA32635@windriver.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07fs: scale mntget/mntputNick Piggin2-36/+19
The problem that this patch aims to fix is vfsmount refcounting scalability. We need to take a reference on the vfsmount for every successful path lookup, which often go to the same mount point. The fundamental difficulty is that a "simple" reference count can never be made scalable, because any time a reference is dropped, we must check whether that was the last reference. To do that requires communication with all other CPUs that may have taken a reference count. We can make refcounts more scalable in a couple of ways, involving keeping distributed counters, and checking for the global-zero condition less frequently. - check the global sum once every interval (this will delay zero detection for some interval, so it's probably a showstopper for vfsmounts). - keep a local count and only taking the global sum when local reaches 0 (this is difficult for vfsmounts, because we can't hold preempt off for the life of a reference, so a counter would need to be per-thread or tied strongly to a particular CPU which requires more locking). - keep a local difference of increments and decrements, which allows us to sum the total difference and hence find the refcount when summing all CPUs. Then, keep a single integer "long" refcount for slow and long lasting references, and only take the global sum of local counters when the long refcount is 0. This last scheme is what I implemented here. Attached mounts and process root and working directory references are "long" references, and everything else is a short reference. This allows scalable vfsmount references during path walking over mounted subtrees and unattached (lazy umounted) mounts with processes still running in them. This results in one fewer atomic op in the fastpath: mntget is now just a per-CPU inc, rather than an atomic inc; and mntput just requires a spinlock and non-atomic decrement in the common case. However code is otherwise bigger and heavier, so single threaded performance is basically a wash. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: implement faster dentry memcmpNick Piggin1-0/+21
The standard memcmp function on a Westmere system shows up hot in profiles in the `git diff` workload (both parallel and single threaded), and it is likely due to the costs associated with trapping into microcode, and little opportunity to improve memory access (dentry name is not likely to take up more than a cacheline). So replace it with an open-coded byte comparison. This increases code size by 8 bytes in the critical __d_lookup_rcu function, but the speedup is huge, averaging 10 runs of each: git diff st user sys elapsed CPU before 1.15 2.57 3.82 97.1 after 1.14 2.35 3.61 96.8 git diff mt user sys elapsed CPU before 1.27 3.85 1.46 349 after 1.26 3.54 1.43 333 Elapsed time for single threaded git diff at 95.0% confidence: -0.21 +/- 0.01 -5.45% +/- 0.24% It's -0.66% +/- 0.06% elapsed time on my Opteron, so rep cmp costs on the fam10h seem to be relatively smaller, but there is still a win. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: improve scalability of pseudo filesystemsNick Piggin1-0/+1
Regardless of how much we possibly try to scale dcache, there is likely always going to be some fundamental contention when adding or removing children under the same parent. Pseudo filesystems do not seem need to have connected dentries because by definition they are disconnected. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: dcache per-inode inode alias lockingNick Piggin1-1/+0
dcache_inode_lock can be replaced with per-inode locking. Use existing inode->i_lock for this. This is slightly non-trivial because we sometimes need to find the inode from the dentry, which requires d_inode to be stabilised (either with refcount or d_lock). Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: dcache per-bucket dcache hash lockingNick Piggin2-2/+4
We can turn the dcache hash locking from a global dcache_hash_lock into per-bucket locking. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07bit_spinlock: add required includesNick Piggin1-0/+4
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07kernel: add bl_listNick Piggin2-0/+271
Introduce a type of hlist that can support the use of the lowest bit in the hlist_head. This will be subsequently used to implement per-bucket bit spinlock for inode and dentry hashes, and may be useful in other cases such as network hashes. Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: provide simple rcu-walk generic_check_acl implementationNick Piggin1-0/+19
This simple implementation just checks for no ACLs on the inode, and if so, then the rcu-walk may proceed, otherwise fail it. This could easily be extended to put acls under RCU and check them under seqlock, if need be. But this implementation is enough to show the rcu-walk aware permissions code for path lookups is working, and will handle cases where there are no ACLs or ACLs in just the final element. This patch implicity converts tmpfs to rcu-aware permission check. Subsequent patches onvert ext*, xfs, and, btrfs. Each of these uses acl/permission code in a different way, so convert them all to provide templates and proof of concept. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: provide rcu-walk aware permission i_opsNick Piggin5-8/+10
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: rcu-walk aware d_revalidate methodNick Piggin1-1/+0
Require filesystems be aware of .d_revalidate being called in rcu-walk mode (nd->flags & LOOKUP_RCU). For now do a simple push down, returning -ECHILD from all implementations. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: cache optimise dentry and inode for rcu-walkNick Piggin2-36/+42
Put dentry and inode fields into top of data structure. This allows RCU path traversal to perform an RCU dentry lookup in a path walk by touching only the first 56 bytes of the dentry. We also fit in 8 bytes of inline name in the first 64 bytes, so for short names, only 64 bytes needs to be touched to perform the lookup. We should get rid of the hash->prev pointer from the first 64 bytes, and fit 16 bytes of name in there, which will take care of 81% rather than 32% of the kernel tree. inode is also rearranged so that RCU lookup will only touch a single cacheline in the inode, plus one in the i_ops structure. This is important for directory component lookups in RCU path walking. In the kernel source, directory names average is around 6 chars, so this works. When we reach the last element of the lookup, we need to lock it and take its refcount which requires another cacheline access. Align dentry and inode operations structs, so members will be at predictable offsets and we can group common operations into head of structure. Signed-off-by: Nick Piggin <npiggin@kernel.dk>