summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2011-11-11ceph: initialize root dentrySage Weil2-3/+5
Set up d_fsdata on the root dentry. This fixes a NULL pointer dereference in ceph_d_prune on umount. It also means we can eventually strip out all of the conditional checks on d_fsdata because it is now set unconditionally (prior to setting up the d_ops). Fix the ceph_d_prune debug print while we're here. Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-06ceph: fix iput race when queueing inode workSage Weil1-3/+6
If we queue a work item that calls iput(), make sure we ihold() before attempting to queue work. Otherwise our queued work might miraculously run before we notice the queue_work() succeeded and call ihold(), allowing the inode to be destroyed. That is, instead of if (queue_work(...)) ihold(); we need to do ihold(); if (!queue_work(...)) iput(); Reported-by: Amon Ott <a.ott@m-privacy.de> Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-06ceph/super.c: quiet sparse noiseH Hartley Sweeten1-2/+2
Quiet the sparse noise: warning: symbol 'create_fs_client' was not declared. Should it be static? warning: symbol 'destroy_fs_client' was not declared. Should it be static? Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Sage Weil <sage@newdream.net> ceph-devel@vger.kernel.org Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-06ceph/mds_client.c: quiet sparse noiseH Hartley Sweeten1-2/+2
Quiet the following sparse noise: warning: symbol 'get_nonsnap_parent' was not declared. Should it be static? warning: symbol 'done_closing_sessions' was not declared. Should it be static? Local functions don't need external visability. Make them static. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Sage Weil <sage@newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-06ceph: use new D_COMPLETE dentry flagSage Weil5-23/+68
We used to use a flag on the directory inode to track whether the dcache contents for a directory were a complete cached copy. Switch to a dentry flag CEPH_D_COMPLETE that is safely updated by ->d_prune(). Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-03ceph: clear parent D_COMPLETE flag when on dentry pruneSage Weil2-0/+41
When the VFS prunes a dentry from the cache, clear the D_COMPLETE flag on the parent dentry. Do this for the live and snapshotted namespaces. Do not bother for the .snap dir contents, since we do not cache that. Signed-off-by: Sage Weil <sage@newdream.net>
2011-11-03Merge branch 'for-linus' of ↵Linus Torvalds11-226/+516
git://git.kernel.org/pub/scm/linux/kernel/git/ohad/hwspinlock * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/hwspinlock: hwspinlock: add MAINTAINERS entries hwspinlock/omap: omap_hwspinlock_remove should be __devexit hwspinlock/u8500: add hwspinlock driver hwspinlock/core: register a bank of hwspinlocks in a single API call hwspinlock/core: remove stubs for register/unregister hwspinlock/core: use a mutex to protect the radix tree hwspinlock/core/omap: fix id issues on multiple hwspinlock devices hwspinlock/omap: simplify allocation scheme hwspinlock/core: simplify 'owner' handling hwspinlock/core: simplify Kconfig Fix up trivial conflicts (addition of omap_hwspinlock_pdata, removal of omap_spinlock_latency) in arch/arm/mach-omap2/hwspinlock.c Also, do an "evil merge" to fix a compile error in omap_hsmmc.c which for some reason was reported in the same email thread as the "please pull hwspinlock changes".
2011-11-03Merge branch 'for-linus' of ↵Linus Torvalds5-52/+33
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: Revert "HID: multitouch: decide if hid-multitouch needs to handle mt devices" HID: drivers/hid/hid-roccat.c: eliminate a null pointer dereference HID: hid-apple: add device ID of another wireless aluminium HID: Add device IDs for Macbook Pro 8 keyboards
2011-11-03Revert "perf: Add PM notifiers to fix CPU hotplug races"Linus Torvalds1-95/+2
This reverts commit 144060fee07e9c22e179d00819c83c86fbcbf82c. It causes a resume regression for Andi on his Acer Aspire 1830T post 3.1. The screen just stays black after wakeup. Also, it really looks like the wrong way to suspend and resume perf events: I think they should be done as part of the CPU suspend and resume, rather than as a notifier that does smp_call_function(). Reported-by: Andi Kleen <andi@firstfloor.org> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/linux-dmLinus Torvalds42-37/+12079
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/linux-dm: dm: raid fix device status indicator when array initializing dm log userspace: add log device dependency dm log userspace: fix comment hyphens dm: add thin provisioning target dm: add persistent data library dm: add bufio dm: export dm get md dm table: add immutable feature dm table: add always writeable feature dm table: add singleton feature dm kcopyd: add dm_kcopyd_zero to zero an area dm: remove superfluous smp_mb dm: use local printk ratelimit dm table: propagate non rotational flag
2011-11-03Merge branch 'for-linus' of git://git.selinuxproject.org/~jmorris/linux-securityLinus Torvalds1-0/+30
* 'for-linus' of git://git.selinuxproject.org/~jmorris/linux-security: TOMOYO: Fix interactive judgment functionality.
2011-11-03Merge branch 'linux_next' of ↵Linus Torvalds11-525/+2671
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac * 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: (21 commits) MAINTAINERS: add an entry for Edac Sandy Bridge driver edac: tag sb_edac as EXPERIMENTAL, as it requires more testing EDAC: Fix incorrect edac mode reporting in sb_edac edac: sb_edac: Add it to the building system edac: Add an experimental new driver to support Sandy Bridge CPU's i7300_edac: Fix error cleanup logic i7core_edac: Initialize memory name with cpu, channel, bank i7core_edac: Fix compilation on 32 bits arch i7core_edac: scrubbing fixups EDAC: Correct Kconfig dependencies i7core_edac: return -ENODEV if no MC is found i7core_edac: use edac's own way to print errors MAINTAINERS: remove dropped edac_mce.* from the file i7core_edac: Drop the edac_mce facility x86, MCE: Use notifier chain only for MCE decoding EDAC i7core: Use mce socketid for better compatibility i7core_edac: Don't enable memory scrubbing for Xeon 35xx i7core_edac: Add scrubbing support edac: Move edac main structs to include/linux/edac.h i7core_edac: Fix oops when trying to inject errors ...
2011-11-03Merge branch 'for-3.2' of git://linux-nfs.org/~bfields/linuxLinus Torvalds1-1/+1
* 'for-3.2' of git://linux-nfs.org/~bfields/linux: nfsd4: typo logical vs bitwise negate in nfsd4_decode_share_access
2011-11-03Merge branch 'misc-3.2' of ↵Linus Torvalds9-9/+19
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux * 'misc-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: MAINTAINERS: Update entry for IA64 [IA64] gpio: GENERIC_GPIO default must be n [IA64[ add CONFIG_NET_VENDOR_INTEL=y to default config files where needed [IA64] agp/hp-agp: Allow binding user memory to the AGP GART [IA64] sn2: add missing put_cpu()
2011-11-03Merge branch 'akpm' (Andrew's incoming - part two)Linus Torvalds70-1035/+4909
Says Andrew: "60 patches. That's good enough for -rc1 I guess. I have quite a lot of detritus to be rechecked, work through maintainers, etc. - most of the remains of MM - rtc - various misc - cgroups - memcg - cpusets - procfs - ipc - rapidio - sysctl - pps - w1 - drivers/misc - aio" * akpm: (60 commits) memcg: replace ss->id_lock with a rwlock aio: allocate kiocbs in batches drivers/misc/vmw_balloon.c: fix typo in code comment drivers/misc/vmw_balloon.c: determine page allocation flag can_sleep outside loop w1: disable irqs in critical section drivers/w1/w1_int.c: multiple masters used same init_name drivers/power/ds2780_battery.c: fix deadlock upon insertion and removal drivers/power/ds2780_battery.c: add a nolock function to w1 interface drivers/power/ds2780_battery.c: create central point for calling w1 interface w1: ds2760 and ds2780, use ida for id and ida_simple_get() to get it pps gpio client: add missing dependency pps: new client driver using GPIO pps: default echo function include/linux/dma-mapping.h: add dma_zalloc_coherent() sysctl: make CONFIG_SYSCTL_SYSCALL default to n sysctl: add support for poll() RapidIO: documentation update drivers/net/rionet.c: fix ethernet address macros for LE platforms RapidIO: fix potential null deref in rio_setup_device() RapidIO: add mport driver for Tsi721 bridge ...
2011-11-03memcg: replace ss->id_lock with a rwlockAndrew Bresticker2-10/+10
While back-porting Johannes Weiner's patch "mm: memcg-aware global reclaim" for an internal effort, we noticed a significant performance regression during page-reclaim heavy workloads due to high contention of the ss->id_lock. This lock protects idr map, and serializes calls to idr_get_next() in css_get_next() (which is used during the memcg hierarchy walk). Since idr_get_next() is just doing a look up, we need only serialize it with respect to idr_remove()/idr_get_new(). By making the ss->id_lock a rwlock, contention is greatly reduced and performance improves. Tested: cat a 256m file from a ramdisk in a 128m container 50 times on each core (one file + container per core) in parallel on a NUMA machine. Result is the time for the test to complete in 1 of the containers. Both kernels included Johannes' memcg-aware global reclaim patches. Before rwlock patch: 1710.778s After rwlock patch: 152.227s Signed-off-by: Andrew Bresticker <abrestic@google.com> Cc: Paul Menage <menage@gmail.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Ying Han <yinghan@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03aio: allocate kiocbs in batchesJeff Moyer2-29/+108
In testing aio on a fast storage device, I found that the context lock takes up a fair amount of cpu time in the I/O submission path. The reason is that we take it for every I/O submitted (see __aio_get_req). Since we know how many I/Os are passed to io_submit, we can preallocate the kiocbs in batches, reducing the number of times we take and release the lock. In my testing, I was able to reduce the amount of time spent in _raw_spin_lock_irq by .56% (average of 3 runs). The command I used to test this was: aio-stress -O -o 2 -o 3 -r 8 -d 128 -b 32 -i 32 -s 16384 <dev> I also tested the patch with various numbers of events passed to io_submit, and I ran the xfstests aio group of tests to ensure I didn't break anything. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Cc: Daniel Ehrenberg <dehrenberg@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03drivers/misc/vmw_balloon.c: fix typo in code commentRakib Mullick1-1/+1
Fix typo in code comment. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Acked-by: Dmitry Torokhov <dtor@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03drivers/misc/vmw_balloon.c: determine page allocation flag can_sleep outside ↵Rakib Mullick1-1/+1
loop In vmballoon_reserve_page(), flags has been passed from the callee function (vmballoon_inflate here). So, we can determine can_sleep outside the loop. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Acked-by: Dmitry Torokhov <dtor@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03w1: disable irqs in critical sectionJan Weitzel1-0/+5
Interrupting w1_delay() in w1_read_bit() results in missing the low level on the w1 line and receiving "1" instead of "0". Add local_irq_save()/local_irq_restore() around the critical section Signed-off-by: Jan Weitzel <j.weitzel@phytec.de> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03drivers/w1/w1_int.c: multiple masters used same init_nameFlorian Faber1-0/+1
When using multiple masters, w1_int.c would use the .init_name from w1.c for all entities, which will fail when creating a corresponding sysfs entry. This patch uses the unique name previously generated. WARNING: at fs/sysfs/dir.c:451 sysfs_add_one+0x48/0x64() sysfs: cannot create duplicate filename '/devices/w1 bus master' Modules linked in: Call trace: [<9001a604>] warn_slowpath_common+0x34/0x44 [<9001a64c>] warn_slowpath_fmt+0x14/0x18 [<90078020>] sysfs_add_one+0x48/0x64 [<900784ec>] create_dir+0x40/0x68 [<9007857a>] sysfs_create_dir+0x66/0x78 [<900c1a8a>] kobject_add_internal+0x6e/0x104 [<900c1bc0>] kobject_add_varg+0x20/0x2c [<900c1c1c>] kobject_add+0x30/0x3c [<900dbd66>] device_add+0x6a/0x378 [<900dbb4a>] device_initialize+0x12/0x48 [<900dc080>] device_register+0xc/0x10 [<900f99be>] w1_add_master_device+0x162/0x274 [<90008e7a>] w1_gpio_probe+0x66/0xb4 [<9000030c>] kernel_init+0x0/0xe8 [<900dde54>] platform_drv_probe+0xc/0xe [<9000030c>] kernel_init+0x0/0xe8 [<900dd4f8>] driver_probe_device+0x6c/0xdc [<900dd5fc>] __driver_attach+0x34/0x48 [<900dcce8>] bus_for_each_dev+0x2c/0x48 [<900dd5c8>] __driver_attach+0x0/0x48 [<900dd38c>] driver_attach+0x10/0x14 [<900dd16a>] bus_add_driver+0x6a/0x18c [<900dd768>] driver_register+0x60/0xb8 [<90011594>] __initcall_w1_therm_init6+0x0/0x4 [<90008e00>] w1_gpio_init+0x0/0x14 [<9000030c>] kernel_init+0x0/0xe8 [<900ddf48>] platform_driver_register+0x30/0x38 [<90011594>] __initcall_w1_therm_init6+0x0/0x4 [<90008e00>] w1_gpio_init+0x0/0x14 [<9000030c>] kernel_init+0x0/0xe8 [<900ddf5e>] platform_driver_probe+0xe/0x3c [<90008e0c>] w1_gpio_init+0xc/0x14 [<90011594>] __initcall_w1_therm_init6+0x0/0x4 [<90008e00>] w1_gpio_init+0x0/0x14 [<900126d4>] do_one_initcall+0x34/0x130 [<90000372>] kernel_init+0x66/0xe8 [<90011594>] __initcall_w1_therm_init6+0x0/0x4 [<9001ca3e>] do_exit+0x0/0x3a6 [<9000030c>] kernel_init+0x0/0xe8 [<9001ca3e>] do_exit+0x0/0x3a6 ---[ end trace 5a9233884fead918 ]--- kobject_add_internal failed for w1 bus master with -EEXIST, don't try to register things with the same name in the same directory. Signed-off-by: Florian Faber <faber@faberman.de> Cc: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03drivers/power/ds2780_battery.c: fix deadlock upon insertion and removalClifton Barnes1-0/+9
Fixes the deadlock when inserting and removing the ds2780. Signed-off-by: Clifton Barnes <cabarnes@indesign-llc.com> Cc: Evgeniy Polyakov <zbr@ioremap.net> Cc: <stable@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03drivers/power/ds2780_battery.c: add a nolock function to w1 interfaceClifton Barnes2-13/+37
Adds a nolock function to the w1 interface to avoid locking the mutex if needed. Signed-off-by: Clifton Barnes <cabarnes@indesign-llc.com> Cc: Evgeniy Polyakov <zbr@ioremap.net> Cc: <stable@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03drivers/power/ds2780_battery.c: create central point for calling w1 interfaceClifton Barnes1-35/+42
Simply creates one point to call the w1 interface. Signed-off-by: Clifton Barnes <cabarnes@indesign-llc.com> Cc: Evgeniy Polyakov <zbr@ioremap.net> Cc: <stable@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03w1: ds2760 and ds2780, use ida for id and ida_simple_get() to get itJonathan Cameron2-84/+12
Straightforward. As an aside, the ida_init calls are not needed as far as I can see needed. (DEFINE_IDA does the same already). Signed-off-by: Jonathan Cameron <jic23@cam.ac.uk> Cc: Evgeniy Polyakov <zbr@ioremap.net> Acked-by: Clifton Barnes <cabarnes@indesign-llc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03pps gpio client: add missing dependencyHeiko Carstens1-1/+1
Add "depends on GENERIC_HARDIRQS" to avoid compile breakage on s390: drivers/built-in.o: In function `pps_gpio_remove': linux-next/drivers/pps/clients/pps-gpio.c:189: undefined reference to `free_irq' Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Nuss <jamesnuss@nanometrics.ca> Cc: Rodolfo Giometti <giometti@enneenne.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03pps: new client driver using GPIOJames Nuss4-0/+269
This client driver allows you to use a GPIO pin as a source for PPS signals. Platform data [1] are used to specify the GPIO pin number, label, assert event edge type, and whether clear events are captured. This driver is based on the work by Ricardo Martins who submitted an initial implementation [2] of a PPS IRQ client driver to the linuxpps mailing-list on Dec 3 2010. [1] include/linux/pps-gpio.h [2] http://ml.enneenne.com/pipermail/linuxpps/2010-December/004155.html [akpm@linux-foundation.org: remove unneeded cast of void*] Signed-off-by: James Nuss <jamesnuss@nanometrics.ca> Cc: Ricardo Martins <rasm@fe.up.pt> Acked-by: Rodolfo Giometti <giometti@linux.it> Signed-off-by: Ricardo Martins <rasm@fe.up.pt> Cc: Alexander Gordeev <lasaine@lvk.cs.msu.su> Cc: Igor Plyatov <plyatov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03pps: default echo functionJames Nuss3-28/+13
A default echo function has been provided so it is no longer an error when you specify PPS_ECHOASSERT or PPS_ECHOCLEAR without an explicit echo function. This allows some code re-use and also makes it easier to write client drivers since the default echo function does not normally need to change. Signed-off-by: James Nuss <jamesnuss@nanometrics.ca> Reviewed-by: Ben Gardiner <bengardiner@nanometrics.ca> Acked-by: Rodolfo Giometti <giometti@linux.it> Cc: Ricardo Martins <rasm@fe.up.pt> Cc: Alexander Gordeev <lasaine@lvk.cs.msu.su> Cc: Igor Plyatov <plyatov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03include/linux/dma-mapping.h: add dma_zalloc_coherent()Andrew Morton2-0/+17
Lots of driver code does a dma_alloc_coherent() and then zeroes out the memory with a memset. Make it easy for them. Cc: Alexandre Bounine <alexandre.bounine@idt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03sysctl: make CONFIG_SYSCTL_SYSCALL default to nWANG Cong2-37/+2
When I tried to send a patch to remove it, Andi told me we still need to keep compabitlies for old libc, so we can't remove this completely. Then just make it default to n and remove the doc from feature-removal-schedule.txt. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03sysctl: add support for poll()Lucas De Marchi5-0/+108
Adding support for poll() in sysctl fs allows userspace to receive notifications of changes in sysctl entries. This adds a infrastructure to allow files in sysctl fs to be pollable and implements it for hostname and domainname. [akpm@linux-foundation.org: s/declare/define/ for definitions] Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: Greg KH <gregkh@suse.de> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03RapidIO: documentation updateAlexandre Bounine1-1/+1
Update rapidio.txt to reflect changes from recent patch. See http://marc.info/?l=linux-kernel&m=131285620113589&w=2 for details. Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Liu Gang <Gang.Liu@freescale.com> Cc: Micha Nelissen <micha@neli.hopto.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03drivers/net/rionet.c: fix ethernet address macros for LE platformsAlexandre Bounine1-2/+2
Modify Ethernet addess macros to be compatible with BE/LE platforms Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Chul Kim <chul.kim@idt.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Li Yang <leoli@freescale.com> Cc: <stable@kernel.org> [2.6.39+] Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03RapidIO: fix potential null deref in rio_setup_device()Alexandre Bounine1-1/+1
The "goto cleanup" path can deference "rswitch" when it is NULL. Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Dan Carpenter <error27@gmail.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Chul Kim <chul.kim@idt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03RapidIO: add mport driver for Tsi721 bridgeAlexandre Bounine8-2/+3196
Add RapidIO mport driver for IDT TSI721 PCI Express-to-SRIO bridge device. The driver provides full set of callback functions defined for mport devices in RapidIO subsystem. It also is compatible with current version of RIONET driver (Ethernet over RapidIO messaging services). This patch is applicable to kernel versions starting from 2.6.39. Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Signed-off-by: Chul Kim <chul.kim@idt.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Li Yang <leoli@freescale.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03arch/powerpc/sysdev/fsl_rio.c: release rapidio port I/O region resource if ↵Liu Gang1-0/+1
port failed to initialize The "struct rio_mport" contains a member of master port I/O memory resource structure "struct resource iores". This resource will be read from device tree and be used for rapidio R/W transaction memory space. Rapidio requests the port I/O memory resource under the root resource "iomem_resource". struct rio_mport *port; port = kzalloc(sizeof(struct rio_mport), GFP_KERNEL); request_resource(&iomem_resource, &port->iores); When port failed to initialize, allocated "rio_mport" structure memory will be freed, and the port I/O memory resource structure pointer "&port->iores" will be invalid. If other requests resource under "iomem_resource", "&port->iores" node may be operated in the child resources list and this will cause the system to crash. So the requested port I/O memory resource should be released before freeing allocated "rio_mport" structure. Signed-off-by: Liu Gang <Gang.Liu@freescale.com> Acked-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03drivers/rapidio/rio-scan.c: use discovered bit to test if enumeration is ↵Liu Gang1-2/+2
complete The discovered bit in PGCCSR register indicates if the device has been discovered by system host. In Rapidio systems, some agent devices can also be master devices. They can issue requests into the system. Signed-off-by: Liu Gang <Gang.Liu@freescale.com> Acked-by: Alexandre Bounine <alexandre.bounine@idt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03init: add root=PARTUUID=UUID/PARTNROFF=%d supportWill Drewry1-5/+43
Expand root=PARTUUID=UUID syntax to support selecting a root partition by integer offset from a known, unique partition. This approach provides similar properties to specifying a device and partition number, but using the UUID as the unique path prior to evaluating the offset. For example, root=PARTUUID=99DE9194-FC15-4223-9192-FC243948F88B/PARTNROFF=1 selects the partition with UUID 99DE.. then select the next partition. This change is motivated by a particular usecase in Chromium OS where the bootloader can easily determine what partition it is on (by UUID) but doesn't perform general partition table walking. That said, support for this model provides a direct mechanism for the user to modify the root partition to boot without specifically needing to extract each UUID or update the bootloader explicitly when the root partition UUID is changed (if it is recreated to be larger, for instance). Pinning to a /boot-style partition UUID allows the arbitrary root partition reconfiguration/modifications with slightly less ambiguity than just [dev][partition] and less stringency than the specific root partition UUID. [sfr@canb.auug.org.au: fix init sections warning] Signed-off-by: Will Drewry <wad@chromium.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03include/linux/sem.h: make sysv_sem empty if SYSVIPC is disabledManfred Spraul1-2/+7
For the sysvsem undo, each task struct contains a sysv_sem structure with a pointer to the undo information. This pointer is only necessary if sysvipc is enabled - thus the pointer can be made conditional on CONFIG_SYSVIPC. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03ipc/sem.c: remove private structures from public header fileManfred Spraul2-42/+46
include/linux/sem.h contains several structures that are only used within ipc/sem.c. The patch moves them into ipc/sem.c - there is no need to expose the structures to the whole kernel. No functional changes, only whitespace cleanups and 80-char per line fixes. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03ipc/sem.c: handle spurious wakeupsManfred Spraul1-0/+9
semtimedop() does not handle spurious wakeups, it returns -EINTR to user space. Most other schedule() users would just loop and not return to user space. The patch adds such a loop to semtimedop() Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03ipc/sem.c: fix return code race with semop vs. semop +semctl(IPC_RMID)Manfred Spraul1-1/+0
sys_semtimedop() may return -EIDRM although the semaphore operation completed successfully: thread 1: thread 2: semtimedop(), sleeps semop(): * acquires sem_lock() semtimedop() woken up due to timeout sem_lock() loops * notices that thread 2 could be completed. * performs the operations that thread 2 is sleeping on. * marks the semaphore operation as IN_WAKEUP * drops sem_lock(), does wakeup, sets return code to 0 * thread delayed due to interrupt, whatever * returns to user space * thread still delayed semctl(IPC_RMID) * acquires sem_lock() * ipc_rmid(), ipcp->deleted=1 * drops sem_lock() * thread finally continues - but seem_lock() now fails due to ipcp->deleted == 1 * returns -EIDRM instead of 0 The fix is trivial: Always use the return code in queue.status. In real world, the race probably doesn't matter: If the semaphore array is destroyed, the app is probably not interested if the last operation succeeded or was already cancelled. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <efault@gmx.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03ida: make ida_simple_get/put() IRQ safeTejun Heo1-4/+7
It's often convenient to be able to release resource from IRQ context. Make ida_simple_*() use irqsave/restore spin ops so that they are IRQ safe. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03proc: fix races against execve() of /proc/PID/fd**Vasiliy Kulikov1-43/+103
fd* files are restricted to the task's owner, and other users may not get direct access to them. But one may open any of these files and run any setuid program, keeping opened file descriptors. As there are permission checks on open(), but not on readdir() and read(), operations on the kept file descriptors will not be checked. It makes it possible to violate procfs permission model. Reading fdinfo/* may disclosure current fds' position and flags, reading directory contents of fdinfo/ and fd/ may disclosure the number of opened files by the target task. This information is not sensible per se, but it can reveal some private information (like length of a password stored in a file) under certain conditions. Used existing (un)lock_trace functions to check for ptrace_may_access(), but instead of using EPERM return code from it use EACCES to be consistent with existing proc_pid_follow_link()/proc_pid_readlink() return code. If they differ, attacker can guess what fds exist by analyzing stat() return code. Patched handlers: stat() for fd/*, stat() and read() for fdindo/*, readdir() and lookup() for fd/ and fdinfo/. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: <stable@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03procfs: report EISDIR when reading sysctl dirs in procPavel Emelyanov1-0/+1
On reading sysctl dirs we should return -EISDIR instead of -EINVAL. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03cpusets: avoid looping when storing to mems_allowed if one node remains setDavid Rientjes1-3/+6
{get,put}_mems_allowed() exist so that general kernel code may locklessly access a task's set of allowable nodes without having the chance that a concurrent write will cause the nodemask to be empty on configurations where MAX_NUMNODES > BITS_PER_LONG. This could incur a significant delay, however, especially in low memory conditions because the page allocator is blocking and reclaim requires get_mems_allowed() itself. It is not atypical to see writes to cpuset.mems take over 2 seconds to complete, for example. In low memory conditions, this is problematic because it's one of the most imporant times to change cpuset.mems in the first place! The only way a task's set of allowable nodes may change is through cpusets by writing to cpuset.mems and when attaching a task to a generic code is not reading the nodemask with get_mems_allowed() at the same time, and then clearing all the old nodes. This prevents the possibility that a reader will see an empty nodemask at the same time the writer is storing a new nodemask. If at least one node remains unchanged, though, it's possible to simply set all new nodes and then clear all the old nodes. Changing a task's nodemask is protected by cgroup_mutex so it's guaranteed that two threads are not changing the same task's nodemask at the same time, so the nodemask is guaranteed to be stored before another thread changes it and determines whether a node remains set or not. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Miao Xie <miaox@cn.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Paul Menage <paul@paulmenage.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03mm/page_cgroup.c: quiet sparse noiseH Hartley Sweeten1-1/+1
warning: symbol 'swap_cgroup_ctrl' was not declared. Should it be static? Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Paul Menage <paul@paulmenage.org> Cc: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03memcg: Fix race condition in memcg_check_events() with this_cpu usageSteven Rostedt1-4/+6
Various code in memcontrol.c () calls this_cpu_read() on the calculations to be done from two different percpu variables, or does an open-coded read-modify-write on a single percpu variable. Disable preemption throughout these operations so that the writes go to the correct palces. [hannes@cmpxchg.org: added this_cpu to __this_cpu conversion] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Greg Thelen <gthelen@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03memcg: close race between charge and putbackJohannes Weiner1-1/+20
There is a potential race between a thread charging a page and another thread putting it back to the LRU list: charge: putback: SetPageCgroupUsed SetPageLRU PageLRU && add to memcg LRU PageCgroupUsed && add to memcg LRU The order of setting one flag and checking the other is crucial, otherwise the charge may observe !PageLRU while the putback observes !PageCgroupUsed and the page is not linked to the memcg LRU at all. Global memory pressure may fix this by trying to isolate and putback the page for reclaim, where that putback would link it to the memcg LRU again. Without that, the memory cgroup is undeletable due to a charge whose physical page can not be found and moved out. Signed-off-by: Johannes Weiner <jweiner@redhat.com> Cc: Ying Han <yinghan@google.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03memcg: skip scanning active lists based on individual sizeJohannes Weiner4-41/+25
Reclaim decides to skip scanning an active list when the corresponding inactive list is above a certain size in comparison to leave the assumed working set alone while there are still enough reclaim candidates around. The memcg implementation of comparing those lists instead reports whether the whole memcg is low on the requested type of inactive pages, considering all nodes and zones. This can lead to an oversized active list not being scanned because of the state of the other lists in the memcg, as well as an active list being scanned while its corresponding inactive list has enough pages. Not only is this wrong, it's also a scalability hazard, because the global memory state over all nodes and zones has to be gathered for each memcg and zone scanned. Make these calculations purely based on the size of the two LRU lists that are actually affected by the outcome of the decision. Signed-off-by: Johannes Weiner <jweiner@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Reviewed-by: Ying Han <yinghan@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>