summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2011-03-11NFS move nfs_client initialization into nfs_get_clientAndy Adamson1-0/+3
Now nfs_get_client returns an nfs_client ready to be used no matter if it was found or created. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11NFSv4: remove CONFIG_NFS_V4 from nfs_read_dataAndy Adamson1-2/+0
Cleanup nfs_read_data. We also won't use CONFIG_NFS_V4_1 for additional NFSv4.1 fields in subsequent patches. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11NFS: change nfs_writeback_done to return voidFred Isaman1-1/+1
The return values are not used by any callers. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11NFSv4/4.1: Fix nfs4_schedule_state_recovery abusesTrond Myklebust1-5/+3
nfs4_schedule_state_recovery() should only be used when we need to force the state manager to check the lease. If we just want to start the state manager in order to handle a state recovery situation, we should be using nfs4_schedule_state_manager(). This patch fixes the abuses of nfs4_schedule_state_recovery() by replacing its use with a set of helper functions that do the right thing. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11plist: Shrink struct plist_headLai Jiangshan1-23/+24
struct plist_head is used in struct task_struct as well as struct rtmutex. If we can make it smaller, it will also make these structures smaller as well. The field prio_list in struct plist_head is seldom used and we can get its information from the plist_nodes. Removing this field will decrease the size of plist_head by half. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> LKML-Reference: <4D107982.9090700@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-11block: fixup plugging stubs for !CONFIG_BLOCKJens Axboe1-3/+6
They used an older prototype, fix it up. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-11Merge branch 'master' of ↵John W. Linville1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
2011-03-11mtd: nand: add software BCH ECC supportIvan Djelic2-0/+75
This patch adds software BCH ECC support to mtd, in order to handle recent NAND device ecc requirements (4 bits or more). It does so by adding a new ecc mode (NAND_ECC_SOFT_BCH) for use by board drivers, and a new Kconfig option to enable BCH support. It relies on the generic BCH library introduced in a previous patch. When a board driver uses mode NAND_ECC_SOFT_BCH, it should also set fields chip->ecc.size and chip->ecc.bytes to select BCH ecc data size and required error correction capability. See nand_bch_init() documentation for details. It has been tested on the following platforms using mtd-utils, UBI and UBIFS: x86 (with nandsim), arm926ejs. Signed-off-by: Ivan Djelic <ivan.djelic@parrot.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-03-11Merge branch 'slab/rcu' into slab/nextPekka Enberg49-125/+255
Conflicts: mm/slub.c
2011-03-11slub: automatically reserve bytes at the end of slabLai Jiangshan1-0/+1
There is no "struct" for slub's slab, it shares with struct page. But struct page is very small, it is insufficient when we need to add some metadata for slab. So we add a field "reserved" to struct kmem_cache, when a slab is allocated, kmem_cache->reserved bytes are automatically reserved at the end of the slab for slab's metadata. Changed from v1: Export the reserved field via sysfs Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-03-11Lockless (and preemptless) fastpaths for slubChristoph Lameter1-1/+4
Use the this_cpu_cmpxchg_double functionality to implement a lockless allocation algorithm on arches that support fast this_cpu_ops. Each of the per cpu pointers is paired with a transaction id that ensures that updates of the per cpu information can only occur in sequence on a certain cpu. A transaction id is a "long" integer that is comprised of an event number and the cpu number. The event number is incremented for every change to the per cpu state. This means that the cmpxchg instruction can verify for an update that nothing interfered and that we are updating the percpu structure for the processor where we picked up the information and that we are also currently on that processor when we update the information. This results in a significant decrease of the overhead in the fastpaths. It also makes it easy to adopt the fast path for realtime kernels since this is lockless and does not require the use of the current per cpu area over the critical section. It is only important that the per cpu area is current at the beginning of the critical section and at the end. So there is no need even to disable preemption. Test results show that the fastpath cycle count is reduced by up to ~ 40% (alloc/free test goes from ~140 cycles down to ~80). The slowpath for kfree adds a few cycles. Sadly this does nothing for the slowpath which is where the main issues with performance in slub are but the best case performance rises significantly. (For that see the more complex slub patches that require cmpxchg_double) Kmalloc: alloc/free test Before: 10000 times kmalloc(8)/kfree -> 134 cycles 10000 times kmalloc(16)/kfree -> 152 cycles 10000 times kmalloc(32)/kfree -> 144 cycles 10000 times kmalloc(64)/kfree -> 142 cycles 10000 times kmalloc(128)/kfree -> 142 cycles 10000 times kmalloc(256)/kfree -> 132 cycles 10000 times kmalloc(512)/kfree -> 132 cycles 10000 times kmalloc(1024)/kfree -> 135 cycles 10000 times kmalloc(2048)/kfree -> 135 cycles 10000 times kmalloc(4096)/kfree -> 135 cycles 10000 times kmalloc(8192)/kfree -> 144 cycles 10000 times kmalloc(16384)/kfree -> 754 cycles After: 10000 times kmalloc(8)/kfree -> 78 cycles 10000 times kmalloc(16)/kfree -> 78 cycles 10000 times kmalloc(32)/kfree -> 82 cycles 10000 times kmalloc(64)/kfree -> 88 cycles 10000 times kmalloc(128)/kfree -> 79 cycles 10000 times kmalloc(256)/kfree -> 79 cycles 10000 times kmalloc(512)/kfree -> 85 cycles 10000 times kmalloc(1024)/kfree -> 82 cycles 10000 times kmalloc(2048)/kfree -> 82 cycles 10000 times kmalloc(4096)/kfree -> 85 cycles 10000 times kmalloc(8192)/kfree -> 82 cycles 10000 times kmalloc(16384)/kfree -> 706 cycles Kmalloc: Repeatedly allocate then free test Before: 10000 times kmalloc(8) -> 211 cycles kfree -> 113 cycles 10000 times kmalloc(16) -> 174 cycles kfree -> 115 cycles 10000 times kmalloc(32) -> 235 cycles kfree -> 129 cycles 10000 times kmalloc(64) -> 222 cycles kfree -> 120 cycles 10000 times kmalloc(128) -> 343 cycles kfree -> 139 cycles 10000 times kmalloc(256) -> 827 cycles kfree -> 147 cycles 10000 times kmalloc(512) -> 1048 cycles kfree -> 272 cycles 10000 times kmalloc(1024) -> 2043 cycles kfree -> 528 cycles 10000 times kmalloc(2048) -> 4002 cycles kfree -> 571 cycles 10000 times kmalloc(4096) -> 7740 cycles kfree -> 628 cycles 10000 times kmalloc(8192) -> 8062 cycles kfree -> 850 cycles 10000 times kmalloc(16384) -> 8895 cycles kfree -> 1249 cycles After: 10000 times kmalloc(8) -> 190 cycles kfree -> 129 cycles 10000 times kmalloc(16) -> 76 cycles kfree -> 123 cycles 10000 times kmalloc(32) -> 126 cycles kfree -> 124 cycles 10000 times kmalloc(64) -> 181 cycles kfree -> 128 cycles 10000 times kmalloc(128) -> 310 cycles kfree -> 140 cycles 10000 times kmalloc(256) -> 809 cycles kfree -> 165 cycles 10000 times kmalloc(512) -> 1005 cycles kfree -> 269 cycles 10000 times kmalloc(1024) -> 1999 cycles kfree -> 527 cycles 10000 times kmalloc(2048) -> 3967 cycles kfree -> 570 cycles 10000 times kmalloc(4096) -> 7658 cycles kfree -> 637 cycles 10000 times kmalloc(8192) -> 8111 cycles kfree -> 859 cycles 10000 times kmalloc(16384) -> 8791 cycles kfree -> 1173 cycles Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-03-11slub: min_partial needs to be in first cachelineChristoph Lameter1-1/+1
It is used in unfreeze_slab() which is a performance critical function. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-03-11Merge branch 'for-2.6.39' of ↵Pekka Enberg1-0/+128
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu into slub/lockless
2011-03-11mtd: cfi: add support for AMIC flashes (e.g. A29L160AT)Steffen Sledz1-0/+1
Signed-off-by: Steffen Sledz <sledz@dresearch.de> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-03-11lib: add shared BCH ECC libraryIvan Djelic1-0/+79
This is a new software BCH encoding/decoding library, similar to the shared Reed-Solomon library. Binary BCH (Bose-Chaudhuri-Hocquenghem) codes are widely used to correct errors in NAND flash devices requiring more than 1-bit ecc correction; they are generally better suited for NAND flash than RS codes because NAND bit errors do not occur in bursts. Latest SLC NAND devices typically require at least 4-bit ecc protection per 512 bytes block. This library provides software encoding/decoding, but may also be used with ASIC/SoC hardware BCH engines to perform error correction. It is being currently used for this purpose on an OMAP3630 board (4bit/8bit HW BCH). It has also been used to decode raw dumps of NAND devices with on-die BCH ecc engines (e.g. Micron 4bit ecc SLC devices). Latest NAND devices (including SLC) can exhibit high error rates (typically a dozen or more bitflips per hour during stress tests); in order to minimize the performance impact of error correction, this library implements recently developed algorithms for fast polynomial root finding (see bch.c header for details) instead of the traditional exhaustive Chien root search; a few performance figures are provided below: Platform: arm926ejs @ 468 MHz, 32 KiB icache, 16 KiB dcache BCH ecc : 4-bit per 512 bytes Encoding average throughput: 250 Mbits/s Error correction time (compared with Chien search): average worst average (Chien) worst (Chien) ---------------------------------------------------------- 1 bit 8.5 µs 11 µs 200 µs 383 µs 2 bit 9.7 µs 12.5 µs 477 µs 728 µs 3 bit 18.1 µs 20.6 µs 758 µs 1010 µs 4 bit 19.5 µs 23 µs 1028 µs 1280 µs In the above figures, "worst" is meant in terms of error pattern, not in terms of cache miss / page faults effects (not taken into account here). The library has been extensively tested on the following platforms: x86, x86_64, arm926ejs, omap3630, qemu-ppc64, qemu-mips. Signed-off-by: Ivan Djelic <ivan.djelic@parrot.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-03-11mtd: onenand: add new option to control initial onenand unlockingRoman Tereshonkov1-0/+1
A new option ONENAND_SKIP_INITIAL_UNLOCKING is added. This allows to disable initial onenand unlocking when the driver is initialized. Signed-off-by: Roman Tereshonkov <roman.tereshonkov@nokia.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-03-11mtd: NOR flash driver for OMAP-L137/AM17xDavid Griego1-0/+29
OMAP-L137/AM17x has limited number of dedicated EMIFA address pins, enough to interface directly to an SDRAM. If a device such as an asynchronous flash needs to be attached to the EMIFA, then either GPIO pins or a chip select may be used to control the flash device's upper address lines. This patch adds support for the NOR flash on the OMAP-L137/ AM17x user interface daughter board using the latch-addr-flash MTD mapping driver which allows flashes to be partially physically addressed. The upper address lines are set by a board specific code which is a separate patch. Signed-off-by: David Griego <dgriego@mvista.com> Signed-off-by: Aleksey Makarov <amakarov@ru.mvista.com> Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Savinay Dharmappa <savinay.dharmappa@ti.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-03-11mtd_blkdevs: Add background processing supportJarkko Lavinen1-0/+2
Add a new background method into mtd_blktrans_ops, add background support into mtd_blktrans_thread(), and add mtd_blktrans_cease_background(). If the mtd blktrans dev has the background support, the thread will call background function when the request queue becomes empty. The background operation may run as long as needs to until mtd_blktrans_cease_background() tells to stop. Signed-off-by: Jarkko Lavinen <jarkko.lavinen@nokia.com> Tested-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-03-11genirq: Add desc->irq_data accessorThomas Gleixner1-0/+5
We have accessors for all fields in irq_data based on irq_desc, but not for irq_data itself. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-11net: add proper documentation for previously added net_device_ops for FCoEYi Zou1-0/+36
Add proper documentation for previously added net_device_ops ops for FCoE. Signed-off-by: Yi Zou <yi.zou@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-03-11Merge branch 'for_2.6.39/pm-misc' of ↵Tony Lindgren6-13/+10
ssh://master.kernel.org/pub/scm/linux/kernel/git/khilman/linux-omap-pm into omap-for-linus
2011-03-11ipv4: Remove redundant RCU locking in ip_check_mc().David S. Miller1-1/+1
All callers are under rcu_read_lock() protection already. Rename to ip_check_mc_rcu() to make it even more clear. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-11Merge branch 'master' of ↵David S. Miller5-12/+21
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bnx2x/bnx2x_cmn.c
2011-03-10NFSv4: remove duplicate clientid in struct nfs_clientAndy Adamson1-2/+0
Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-10SUNRPC: Close a race in __rpc_wait_for_completion_task()Trond Myklebust1-0/+1
Although they run as rpciod background tasks, under normal operation (i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck() and nfs4_do_close() want to be fully synchronous. This means that when we exit, we want all references to the rpc_task to be gone, and we want any dentry references etc. held by that task to be released. For this reason these functions call __rpc_wait_for_completion_task(), followed by rpc_put_task() in the expectation that the latter will be releasing the last reference to the rpc_task, and thus ensuring that the callback_ops->rpc_release() has been called synchronously. This patch fixes a race which exists due to the fact that rpciod calls rpc_complete_task() (in order to wake up the callers of __rpc_wait_for_completion_task()) and then subsequently calls rpc_put_task() without ensuring that these two steps are done atomically. In order to avoid adding new spin locks, the patch uses the existing waitqueue spin lock to order the rpc_task reference count releases between the waiting process and rpciod. The common case where nobody is waiting for completion is optimised for by checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task reference count is 1: in those cases we drop trying to grab the spin lock, and immediately free up the rpc_task. Those few processes that need to put the rpc_task from inside an asynchronous context and that do not care about ordering are given a new helper: rpc_put_task_async(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-10hrtimer: Update hrtimer->state documentationThomas Gleixner1-5/+11
We changed some of the state bits and combinations thereof over time, but never updated the documentation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-10tracing: Remove lock_depth from event entrySteven Rostedt1-1/+0
The lock_depth field in the event headers was added as a temporary data point for help in removing the BKL. Now that the BKL is pretty much been removed, we can remove this field. This in turn changes the header from 12 bytes to 8 bytes, removing the 4 byte buffer that gcc would insert if the first field in the data load was 8 bytes in size. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-10drbd: Remove unused function atodb_endio()Andreas Gruenbacher1-1/+1
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Provide hints with the error message when clearing the sync pause flagPhilipp Reisner1-0/+2
When the user clears the sync-pause flag, and sync stays in pause state, give hints to the user, why it still is in pause state. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Corrected off-by-one error in DRBD_MINOR_COUNT_MAXPhilipp Reisner1-1/+2
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Rename enum drbd_state_ret_codes to enum drbd_state_rvAndreas Gruenbacher1-2/+2
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Rename enum drbd_ret_codes to enum drbd_ret_codeAndreas Gruenbacher1-1/+1
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: --force option for disconnectPhilipp Reisner1-1/+3
As the network connection can be lost at any time, a --force option for disconnect is just a matter of completeness. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: add packet_type 27 (return_code_only) to netlink apiLars Ellenberg2-1/+6
In case we ever should add an other packet type, we must not reuse 27, as that currently used for "empty" return code only replies. Document it as such. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: New packet for Ahead/Behind mode: P_OUT_OF_SYNCPhilipp Reisner1-1/+1
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: Implemented two new connection states Ahead/BehindPhilipp Reisner1-0/+4
In this connection mode, the ahead node no longer replicates application IO. The behind's disk becomes out dated. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10drbd: New configuration parameters for dealing with network congestionPhilipp Reisner3-0/+19
net { on_congestion {block|pull-ahead|disconnect}; congestion-fill {sectors}; congestion-extents {al-extents}; } Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10tg3: Add code to verify RODATA checksum of VPDMatt Carlson1-0/+1
This patch adds code to verify the checksum stored in the "RV" info keyword of the RODATA VPD section. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10nilfs2: move NILFS_SUPER_MAGIC to linux/magic.hJiro SEKIBA2-1/+2
move NILFS_SUPER_MAGIC macro to linux/magic.h from linux/nilfs2_fs.h in the same manner as other filesystem magic number defined in the file. Signed-off-by: Jiro SEKIBA <jir@unicus.jp> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2011-03-10Merge branch 'for-2.6.39/stack-plug' into for-2.6.39/coreJens Axboe10-84/+72
Conflicts: block/blk-core.c block/blk-flush.c drivers/md/raid1.c drivers/md/raid10.c drivers/md/raid5.c fs/nilfs2/btnode.c fs/nilfs2/mdt.c Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10block: kill off REQ_UNPLUGJens Axboe2-21/+9
With the plugging now being explicitly controlled by the submitter, callers need not pass down unplugging hints to the block layer. If they want to unplug, it's because they manually plugged on their own - in which case, they should just unplug at will. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10block: remove per-queue pluggingJens Axboe8-66/+9
Code has been converted over to the new explicit on-stack plugging, and delay users have been converted to use the new API for that. So lets kill off the old plugging along with aops->sync_page(). Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10block: initial patch for on-stack per-task pluggingJens Axboe4-0/+51
This patch adds support for creating a queuing context outside of the queue itself. This enables us to batch up pieces of IO before grabbing the block device queue lock and submitting them to the IO scheduler. The context is created on the stack of the process and assigned in the task structure, so that we can auto-unplug it if we hit a schedule event. The current queue plugging happens implicitly if IO is submitted to an empty device, yet callers have to remember to unplug that IO when they are going to wait for it. This is an ugly API and has caused bugs in the past. Additionally, it requires hacks in the vm (->sync_page() callback) to handle that logic. By switching to an explicit plugging scheme we make the API a lot nicer and can get rid of the ->sync_page() hack in the vm. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10block: add API for delaying work/request_fn a little bitJens Axboe1-0/+6
Currently we use plugging for that, but as plugging is going away, we need an alternative mechanism. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10Merge branch 'for-linus' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: net: don't allow CAP_NET_ADMIN to load non-netdev kernel modules
2011-03-10sysctl: the include of rcupdate.h is only needed in the kernelStephen Rothwell1-1/+1
Fixes this build-check error: include/linux/sysctl.h:28: included file 'linux/rcupdate.h' is not exported Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-10net: don't allow CAP_NET_ADMIN to load non-netdev kernel modulesVasiliy Kulikov1-0/+3
Since a8f80e8ff94ecba629542d9b4b5f5a8ee3eb565c any process with CAP_NET_ADMIN may load any module from /lib/modules/. This doesn't mean that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE as modules are limited to /lib/modules/**. However, CAP_NET_ADMIN capability shouldn't allow anybody load any module not related to networking. This patch restricts an ability of autoloading modules to netdev modules with explicit aliases. This fixes CVE-2011-1019. Arnd Bergmann suggested to leave untouched the old pre-v2.6.32 behavior of loading netdev modules by name (without any prefix) for processes with CAP_SYS_MODULE to maintain the compatibility with network scripts that use autoloading netdev modules by aliases like "eth0", "wlan0". Currently there are only three users of the feature in the upstream kernel: ipip, ip_gre and sit. root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) -- root@albatros:~# grep Cap /proc/$$/status CapInh: 0000000000000000 CapPrm: fffffff800001000 CapEff: fffffff800001000 CapBnd: fffffff800001000 root@albatros:~# modprobe xfs FATAL: Error inserting xfs (/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted root@albatros:~# lsmod | grep xfs root@albatros:~# ifconfig xfs xfs: error fetching interface information: Device not found root@albatros:~# lsmod | grep xfs root@albatros:~# lsmod | grep sit root@albatros:~# ifconfig sit sit: error fetching interface information: Device not found root@albatros:~# lsmod | grep sit root@albatros:~# ifconfig sit0 sit0 Link encap:IPv6-in-IPv4 NOARP MTU:1480 Metric:1 root@albatros:~# lsmod | grep sit sit 10457 0 tunnel4 2957 1 sit For CAP_SYS_MODULE module loading is still relaxed: root@albatros:~# grep Cap /proc/$$/status CapInh: 0000000000000000 CapPrm: ffffffffffffffff CapEff: ffffffffffffffff CapBnd: ffffffffffffffff root@albatros:~# ifconfig xfs xfs: error fetching interface information: Device not found root@albatros:~# lsmod | grep xfs xfs 745319 0 Reference: https://lkml.org/lkml/2011/2/24/203 Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: James Morris <jmorris@namei.org>
2011-03-10tcp: ioctl type SIOCOUTQNSD returns amount of data not sentMario Schuknecht1-1/+3
In contrast to SIOCOUTQ which returns the amount of data sent but not yet acknowledged plus data not yet sent this patch only returns the data not sent. For various methods of live streaming bitrate control it may be helpful to know how much data are in the tcp outqueue are not sent yet. Signed-off-by: Mario Schuknecht <m.schuknecht@dresearch.de> Signed-off-by: Steffen Sledz <sledz@dresearch.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10Merge branch 'for-linus' of ↵Linus Torvalds1-4/+10
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: nd->inode is not set on the second attempt in path_walk() unfuck proc_sysctl ->d_compare() minimal fix for do_filp_open() race
2011-03-10ieee80211: add IEEE80211_COUNTRY_STRING_LEN definitionBing Zhao1-0/+3
and make use of it in wireless drivers Signed-off-by: Bing Zhao <bzhao@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>