Age  Commit message  Author  Files  Lines
2011-05-01  [SCSI] esp, scsi_tgt_lib, fcoe: use list_move() instead of list_del()/list_add() combination  Kirill A. Shutemov  3  -12/+6
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
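The equivalence this cleanup relies on can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel's `<linux/list.h>`: the type `list_node` and the helpers `list_init()`, `node_add()`, `node_del()`, `node_move()` and `list_length()` are hypothetical names chosen for illustration. It shows that unlinking an entry and re-adding it to another list leaves the lists in the same state as a single move operation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a circular doubly linked list. */
struct list_node {
    struct list_node *next, *prev;
};

/* An empty list is a head that points at itself. */
void list_init(struct list_node *head)
{
    head->next = head->prev = head;
}

/* Insert entry immediately after head. */
void node_add(struct list_node *entry, struct list_node *head)
{
    assert(entry != NULL && head != NULL);
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry from whatever list it is on and clear its pointers. */
void node_del(struct list_node *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = entry->prev = NULL;
}

/* Unlink entry and insert it after head in one step. */
void node_move(struct list_node *entry, struct list_node *head)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    node_add(entry, head);
}

/* Count the entries on a list, excluding the head itself. */
int list_length(const struct list_node *head)
{
    int n = 0;
    const struct list_node *pos;

    for (pos = head->next; pos != head; pos = pos->next)
        n++;
    return n;
}
```

In this sketch, `node_del(&x); node_add(&x, &b);` and `node_move(&x, &b);` leave both lists identical; the single call is shorter and never exposes the intermediate, unlinked state of the entry.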
2011-05-01  [SCSI] fcoe: remove unnecessary module state check  Yi Zou  1  -47/+0
The check of module state being MODULE_STATE_LIVE is no longer needed for the individual fcoe transport driver, e.g., fcoe.ko, because the sysfs entries now go to libfcoe: if a request reaches fcoe.ko, the driver must already be registered. The module state check in libfcoe still guards against the possible race condition of sysfs being writable before the module_init function is called and after module_exit. Signed-off-by: Yi Zou <yi.zou@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] fcoe: Remove mutex_trylock/restart_syscall checks  Robert Love  1  -20/+4
These checks were initially added to avoid a lockdep false positive when dealing with the s_active, rtnl and fcoe_config_mutex mutexes. Recently the create, destroy, enable and disable sysfs entries were moved from fcoe.ko to libfcoe.ko. With this change the mutex usage was shuffled around and the lockdep false positive stopped happening. We can now remove these checks. Signed-off-by: Robert Love <robert.w.love@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] libfcoe: Remove mutex_trylock/restart_syscall checks  Robert Love  1  -25/+11
This code was incorrectly ported from fcoe.c when the fcoe transport infrastructure was put into place. It was originally needed in fcoe.c when dealing with the rtnl mutex. In that code it was only needed to avoid a lockdep false positive. In libfcoe we don't deal with the rtnl mutex, we don't get the lockdep false positive and therefore we don't need these checks. Signed-off-by: Robert Love <robert.w.love@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] bnx2fc: introduce missing kfree  Julia Lawall  1  -0/+3
Error handling code following a kmalloc should free the allocated data. The semantic match that finds the problem is as follows: (http://www.emn.fr/x-info/coccinelle/)
    // <smpl>
    @r exists@
    local idexpression x;
    statement S;
    expression E;
    identifier f,f1,l;
    position p1,p2;
    expression *ptr != NULL;
    @@

    x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...);
    ...
    if (x == NULL) S
    <... when != x
         when != if (...) { <+...x...+> }
    (
    x->f1 = E
    |
     (x->f1 == NULL || ...)
    |
     f(...,x->f1,...)
    )
    ...>
    (
     return \(0\|<+...x...+>\|ptr\);
    |
     return@p2 ...;
    )

    @script:python@
    p1 << r.p1;
    p2 << r.p2;
    @@

    print "* file: %s kmalloc %s return %s" % (p1[0].file,p1[0].line,p2[0].line)
    // </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
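The bug class the semantic match looks for (an early return that forgets to free a successful allocation) can be sketched in userspace C. This is an illustration under stated assumptions, not the bnx2fc code: `struct request`, `request_create()`, `request_destroy()` and the counting wrappers `counted_alloc()`/`counted_free()` are hypothetical, with `malloc()`/`free()` standing in for `kmalloc()`/`kfree()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Number of live allocations; only here so the sketch can be checked. */
int outstanding;

static void *counted_alloc(size_t n)
{
    void *p = malloc(n);

    if (p != NULL)
        outstanding++;
    return p;
}

static void counted_free(void *p)
{
    if (p != NULL) {
        outstanding--;
        free(p);
    }
}

/* Hypothetical object with a separately allocated payload. */
struct request {
    int id;
    char *payload;
};

struct request *request_create(int id, size_t payload_len)
{
    struct request *req = counted_alloc(sizeof(*req));

    if (req == NULL)
        return NULL;
    req->id = id;
    req->payload = NULL;

    if (payload_len == 0) {
        /* Error path after a successful allocation: free before returning. */
        counted_free(req);
        return NULL;
    }

    req->payload = counted_alloc(payload_len);
    if (req->payload == NULL) {
        /* Same rule for the second failure point. */
        counted_free(req);
        return NULL;
    }
    memset(req->payload, 0, payload_len);
    return req;
}

void request_destroy(struct request *req)
{
    assert(req != NULL);
    counted_free(req->payload);
    counted_free(req);
}
```

Dropping either `counted_free(req)` call on the error paths reproduces the leak: the function returns NULL while `outstanding` stays above zero.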
2011-05-01  [SCSI] ipr: fix synchronous request flags for better performance  Wayne Boyer  2  -2/+3
In testing it was noticed that Extended Delay after Reset flag was being set for gscsi and volume set devices. This had a negative effect on performance for volume sets. The fix is to only set the flag for gscsi devices. Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com> Acked-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Update version number to 8.03.07.03-k.  Madhuranath Iyengar  1  -2/+2
A minor change in the versioning. We'll be attaching the "-k" in the end. Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Log fcport state transitions when debug messages are enabled.  Chad Dupuis  6  -8/+35
Add the inline function qla2x00_set_port_state() so that when a fcport state transition happens we can log the state transition if debug messages are enabled. Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Verify login-state has transitioned to PRLI-completed.  Andrew Vasquez  2  -5/+23
Before driver's own internal state is marked as PLOGI/PRLI complete. This additional check closes a window seen with dual-personality initiator/target devices where a driver's PLOGI/PRLI request occurs within the window after the target's PLOGI request has completed, but prior to the target's PRLI arriving and processed by the firmware. Without this additional check, the firmware will return port-information stating that the port neither supports target nor initiator functions, causing the driver to register the rport prematurely to the FC-transport without the proper 'roles' being set. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Limit the logs in case device state does not change for ISP82xx.  Giridhar Malavali  1  -4/+14
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Add the ql2xdontresethba module_param.  Giridhar Malavali  3  -0/+10
Also, change the ISP82xx code to only reset if this module_param is set and reset is intended via the QLA82XX_DEV_NEED_RESET case. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Display hardware/firmware registers to get more information about the error for ISP82xx.  Giridhar Malavali  1  -0/+18
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Add test for valid loop id to qla2x00_relogin().  Joe Carnuccio  3  -3/+14
If fabric device has invalid loop id (FC_NO_LOOP_ID) then call qla2x00_find_new_loop_id() to attempt to obtain valid loop id. Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Perform FCoE context reset before trying adapter reset for ISP82xx.  Giridhar Malavali  2  -11/+37
For certain failures, try to recover first by doing an FCoE context reset before attempting the big hammer approach (adapter reset). Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Remove extra call to qla82xx_check_fw_alive().  Giridhar Malavali  1  -1/+0
The standalone call to qla82xx_check_fw_alive() in qla82xx_watchdog() is a typo, so remove it. Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Updated the reset sequence for ISP82xx.  Giridhar Malavali  1  -2/+17
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Update copyright banner.  Andrew Vasquez  22  -22/+22
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Update License file.  Madhuranath Iyengar  1  -5/+287
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Free firmware PCB on logout request.  Andrew Vasquez  1  -1/+2
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Include request queue ID in the upper 16-bits of the I/O handle for Abort I/O IOCBs.  Mike Hernandez  1  -1/+1
The upper 16-bits of the handle for all I/O in multi-queue supported drivers carries the ID of the request queue it was submitted on. When using Abort I/O IOCB, the driver needs to also populate the upper 16-bits in the handle_to_abort field so the fw can correlate with the actual I/O. Signed-off-by: Mike Hernandez <michael.hernandez@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
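The handle layout described above (request queue ID in the upper 16 bits, per-queue handle in the lower 16 bits) can be sketched as plain bit packing. The function names `make_handle()`, `handle_queue_id()` and `handle_index()` are hypothetical and are not taken from the qla2xxx driver; this only illustrates why an abort request has to carry the full 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a request queue ID and a per-queue index into one 32-bit handle. */
uint32_t make_handle(uint32_t req_queue_id, uint32_t index)
{
    /* Both fields must fit in 16 bits for the layout to be reversible. */
    assert(req_queue_id <= 0xffffu && index <= 0xffffu);
    return (req_queue_id << 16) | index;
}

/* Recover the request queue ID from the upper 16 bits. */
uint16_t handle_queue_id(uint32_t handle)
{
    return (uint16_t)(handle >> 16);
}

/* Recover the per-queue index from the lower 16 bits. */
uint16_t handle_index(uint32_t handle)
{
    return (uint16_t)(handle & 0xffffu);
}
```

If an abort passes only the lower 16 bits, two commands with the same index on different queues become indistinguishable: `make_handle(0, 0x2a)` and `make_handle(3, 0x2a)` differ only in the upper half.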
2011-05-01  [SCSI] qla2xxx: Remove extraneous setting of FCF_ASYNC_SENT during login-done completion.  Andrew Vasquez  1  -1/+0
The bit is already set upon entry. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Check for a match before attempting to set FCP-priority information.  Andrew Vasquez  1  -6/+9
Modifying qla24xx_get_fcp_prio() to return a 'found' status allows the driver to short circuit the 'set FCP-priority' call and reduce the amount of noise generated in the messages file:
    scsi(5): Unable to activate fcp priority, ret=0x102
    scsi(5): Unable to activate fcp priority, ret=0x102
Also make qla24xx_get_fcp_prio() static. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Correct calling contexts of qla2x00_mark_device_lost() in async paths.  Andrew Vasquez  1  -3/+3
The respective done() functions are called from process context, so there's no reason to 'defer' the request. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-05-01  [SCSI] qla2xxx: Display PortID information during FCP command-status handling.  Andrew Vasquez  1  -4/+5
To provide a clearer translation of the command-status origin in relation to the midlayer's standard SCSI nexus. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-04-16  [SCSI] be2iscsi: Fix for proper setting of FW  Jayamohan Kallickal  1  -4/+9
There was a bug in setting up type and dmsg for FW Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-04-16  [SCSI] be2iscsi: check boot_kset is created before destroying it  Jayamohan Kallickal  1  -2/+4
Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-04-16  [SCSI] be2iscsi: Set a timeout to FW  Jayamohan Kallickal  1  -0/+1
Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-04-16  [SCSI] be2iscsi: Modifying Maintainer's emailid  Jayamohan Kallickal  1  -2/+2
- Modifying Maintainer's emailid to emulex as Emulex has acquired Serverengines Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-04-16  [SCSI] be2iscsi: change in copyright notice  Jayamohan Kallickal  9  -57/+52
- Modifying copyright year to 2011 - Replacing Serverengines with Emulex as Serverengines Corp has been acquired by Emulex Corp Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-04-16  [SCSI] Log thin provisioning threshold event  Shyam Iyer  2  -0/+7
At least log the message that we received a THIN PROVISIONING SOFT THRESHOLD REACHED Unit Attention. Also added it to unit attention decodes. Signed-off-by: Shyam Iyer <shyam_iyer@dell.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-04-15  Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block  Linus Torvalds  6  -84/+78
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
    block: only force kblockd unplugging from the schedule() path
    block: cleanup the block plug helper functions
    block, blk-sysfs: Use the variable directly instead of a function call
    block: move queue run on unplug to kblockd
    block: kill queue_sync_plugs()
    block: readd plug trace event
    block: add callback function for unplug notification
    block: add comment on why we save and disable interrupts in flush_plug_list()
    block: fixup block IO unplug trace call
    block: remove block_unplug_timer() trace point
    block: splice plug list to local context
2011-04-15  Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6  Linus Torvalds  2  -58/+97
* 'linux-next' of git://git.infradead.org/ubifs-2.6:
    UBIFS: fix compilation warnings when compiling with gcc 4.5
    UBIFS: fix oops when R/O file-system is fsync'ed
2011-04-15  vfs: fix incorrect dentry_update_name_case() BUG_ON() test  Linus Torvalds  1  -1/+1
The case we should be verifying when updating the dentry name is that the _parent_ inode (the directory) semaphore is held, not the semaphore for the dentry itself. It's the directory locking that rename and readdir() etc all care about. The comment just above even says so - but then the BUG_ON() still checked the dentry inode itself. Very few people noticed, because this helper function really isn't used for very much, so you had to be using ncpfs to ever hit it. I think I should just remove the BUG_ON (the function really has just one user), but let's run with it fixed for a while before getting rid of it entirely. Reported-and-tested-by: Bongani Hlope <bonganih@bankservafrica.com> Reported-and-tested-by: Bernd Feige <bernd.feige@uniklinik-freiburg.de> Cc: Petr Vandrovec <petr@vandrovec.name>, Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-15  block: only force kblockd unplugging from the schedule() path  Jens Axboe  2  -8/+9
For the explicit unplugging, we'd prefer to kick things off immediately and not pay the penalty of the latency to switch to kblockd. So let blk_finish_plug() do the run inline, while the implicit-on-schedule-out unplug will punt to kblockd. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-04-15  block: cleanup the block plug helper functions  Christoph Hellwig  2  -21/+9
It's a bit of a mess currently. task->plug is being cleared and reset in __blk_finish_plug(), and blk_finish_plug() is testing for a NULL plug which cannot happen even from schedule() anymore since it uses blk_needs_flush_plug() to determine whether to call into this function at all. So get rid of some of the cruft. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-04-15  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin  Linus Torvalds  4  -23/+42
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin:
    Blackfin: SMP: fix cache flush loop
    Blackfin: time-ts: ack gptimer sooner to avoid missing short ints
    Blackfin: gptimers: fix thinko when disabling timers
    Blackfin: SMP: make all barriers handle cache issues
2011-04-15  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client  Linus Torvalds  1  -3/+9
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    libceph: fix linger request requeueing
2011-04-15  mm/thp: use conventional format for boolean attributes  Ben Hutchings  1  -10/+14
The conventional format for boolean attributes in sysfs is numeric ("0" or "1" followed by new-line). Any boolean attribute can then be read and written using a generic function. Using the strings "yes [no]", "[yes] no" (read), "yes" and "no" (write) will frustrate this. [akpm@linux-foundation.org: use kstrtoul()] [akpm@linux-foundation.org: test_bit() doesn't return 1/0, per Neil] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hughd@google.com> Tested-by: David Rientjes <rientjes@google.com> Cc: NeilBrown <neilb@suse.de> Cc: <stable@kernel.org> [2.6.38.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
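The numeric convention described above can be sketched with a generic pair of helpers in userspace C. This is an illustration, not the mm/thp code: `parse_bool_attr()` and `format_bool_attr()` are hypothetical names, and `strtoul()` stands in for the kernel's `kstrtoul()` mentioned in the message. The `!!flag` normalisation reflects the note that a bit test does not necessarily return exactly 1 or 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse "0" or "1", optionally followed by one new-line. Returns 0 or -EINVAL. */
int parse_bool_attr(const char *buf, int *value)
{
    char *end;
    unsigned long v;

    assert(buf != NULL && value != NULL);
    errno = 0;
    v = strtoul(buf, &end, 10);
    if (errno != 0 || end == buf)
        return -EINVAL;          /* no digits, or out of range */
    if (*end == '\n')
        end++;                   /* accept a single trailing new-line */
    if (*end != '\0' || v > 1)
        return -EINVAL;          /* trailing junk, or not a boolean */
    *value = (int)v;
    return 0;
}

/* Format any non-zero flag as "1\n" and zero as "0\n". */
int format_bool_attr(char *buf, size_t len, int flag)
{
    return snprintf(buf, len, "%d\n", !!flag);
}
```

With this shape, every boolean attribute can share one reader and one writer, which is the point the commit makes against attribute-specific strings such as "yes" and "no".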
2011-04-15  ramfs: fix memleak on no-mmu arch  Bob Liu  1  -0/+1
On no-mmu arch, there is a memleak during the shmem test. The cause of this memleak is that ramfs_nommu_expand_for_mapping() raises the page refcount to 2, which prevents iput() from freeing those pages. The simple test file is like this:
    #include <stdio.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(void)
    {
        int i;
        key_t k = ftok("/etc", 42);

        /* Create and immediately remove a shared memory segment 100 times. */
        for (i = 0; i < 100; ++i) {
            int id = shmget(k, 10000, 0644 | IPC_CREAT);
            if (id == -1) {
                printf("shmget error\n");
            }
            if (shmctl(id, IPC_RMID, NULL) == -1) {
                printf("shm rm error\n");
                return -1;
            }
        }
        printf("run ok...\n");
        return 0;
    }
And the result:
    root:/> free
                 total   used   free  shared  buffers
    Mem:         60320  17912  42408       0        0
    -/+ buffers:        17912  42408
    root:/> shmem
    run ok...
    root:/> free
                 total   used   free  shared  buffers
    Mem:         60320  19096  41224       0        0
    -/+ buffers:        19096  41224
    root:/> shmem
    run ok...
    root:/> free
                 total   used   free  shared  buffers
    Mem:         60320  20296  40024       0        0
    -/+ buffers:        20296  40024
    ...
After this patch the test result is (no memleak anymore):
    root:/> free
                 total   used   free  shared  buffers
    Mem:         60320  16668  43652       0        0
    -/+ buffers:        16668  43652
    root:/> shmem
    run ok...
    root:/> free
                 total   used   free  shared  buffers
    Mem:         60320  16668  43652       0        0
    -/+ buffers:        16668  43652
Signed-off-by: Bob Liu <lliubbo@gmail.com> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-15  um: disable CONFIG_CMPXCHG_LOCAL  Richard Weinberger  1  -0/+4
Commit 8a5ec0ba "Lockless (and preemptless) fastpaths for slub" makes use of this_cpu_cmpxchg_double() which needs this_cpu_cmpxchg16b_emu() on x86_64. Implementing cmpxchg16b emulation for UML would introduce too much complexity. So just disable it. Signed-off-by: Richard Weinberger <richard@nod.at> Reported-by: Sergei Trofimovich <slyich@gmail.com> Acked-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-15  um: fix call tracer and bug handler  Richard Weinberger  1  -0/+6
Commit 1de1502c ("x86, um: now we can get rid of trivial uml headers") accidentally removed bug.h, which broke UML's call tracer and bug handler. Without asm-generic/bug.h, UML uses BUG() from arch/x86/, which makes use of ud2. UML cannot use ud2; it raises SIGILL in user mode. As UML has a different stack for handling signals, the call trace will be cut off. Signed-off-by: Richard Weinberger <richard@nod.at> Reported-by: Sergei Trofimovich <slyich@gmail.com> Tested-by: Sergei Trofimovich <slyich@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-15  fs/fhandle.c: add <linux/personality.h> for ia64  Jeff Mahoney  1  -0/+1
force_o_largefile() on ia64 is defined in <asm/fcntl.h> and requires <linux/personality.h>. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-15  MAINTAINERS: change mail adress of Hans J. Koch  Hans J. Koch  1  -3/+3
My old mail address doesn't exist anymore. This patch changes all occurrences in MAINTAINERS to my new address. Signed-off-by: Hans J. Koch <hjk@hansjkoch.de> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-15  RapidIO/mpc85xx: fix possible mport registration problems  Alexandre Bounine  3  -4/+7
Fix a possible problem with mport registration left non-cleared after fsl_rio_setup() exits on link error. Abort mport initialization if registration failed. This patch is applicable to 2.6.39-rc1 only. The problem does not exist for earlier versions. Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Li Yang <leoli@freescale.com> Cc: Thomas Moll <thomas.moll@sysgo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-15  RapidIO: add IDT CPS-1432 switch definitions  Alexandre Bounine  2  -0/+2
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-15  oom-kill: remove boost_dying_task_prio()  KOSAKI Motohiro  1  -28/+0
This is an almost-revert of commit 93b43fa ("oom: give the dying task a higher priority"). That commit dramatically improved oom killer logic when a fork-bomb occurs. But I've found that it has a nasty corner case. The cpu cgroup now has a strange default RT runtime: it's 0! That means that if a process under a cpu cgroup is promoted to the RT scheduling class, the process never runs at all. If an admin inserts a !RT process into a cpu cgroup with rtruntime=0, it usually runs perfectly because a !RT task isn't affected by the rtruntime knob. But if it is promoted to an RT task via an explicit setscheduler() syscall or by an OOM, the task can't run at all. In short, the oom killer doesn't work at all if admins are using cpu cgroup and don't touch the rtruntime knob. Eventually, the kernel may hang when an oom kill occurs. The original author Luis and I agreed to disable this logic. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Luis Claudio R. Goncalves <lclaudio@uudg.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Acked-by: David Rientjes <rientjes@google.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-15  vmscan: all_unreclaimable() use zone->all_unreclaimable as a name  KOSAKI Motohiro  1  -11/+13
The all_unreclaimable check in direct reclaim was introduced in 2.6.19 by the following commit. 2006 Sep 25; commit 408d8544; oom: use unreclaimable info. It then went through a strange history. First, the following commit broke the logic unintentionally. 2008 Apr 29; commit a41f24ea; page allocator: smarter retry of costly-order allocations. Two years later, I found an obviously meaningless code fragment and restored the original intention with the following commit. 2010 Jun 04; commit bb21c7ce; vmscan: fix do_try_to_free_pages() return value when priority==0. But the logic didn't work when a 32bit highmem system went into hibernation, so Minchan slightly changed the algorithm and fixed it. 2010 Sep 22; commit d1908362; vmscan: check all_unreclaimable in direct reclaim path.
Recently, however, Andrey Vagin found a new corner case. Look: struct zone { .. int all_unreclaimable; .. unsigned long pages_scanned; .. } zone->all_unreclaimable and zone->pages_scanned are neither atomic variables nor protected by a lock. Therefore a zone can reach a state of zone->pages_scanned=0 and zone->all_unreclaimable=1. In this case, the current all_unreclaimable() returns false even though zone->all_unreclaimable=1. This resulted in the kernel hanging up when executing a loop of the form 1. fork 2. mmap 3. touch memory 4. read memory 5. munmap as described in http://www.gossamer-threads.com/lists/linux/kernel/1348725#1348725
Is this an ignorable minor issue? No. Unfortunately, x86 has a very small dma zone, and it easily becomes zone->all_unreclaimable=1. And once it becomes all_unreclaimable=1, it is never restored to all_unreclaimable=0. Why? If all_unreclaimable=1, vmscan only tries DEF_PRIORITY reclaim, and a-few-lru-pages>>DEF_PRIORITY always yields 0. That means no page scan at all! Eventually, the oom-killer never works on such systems. That said, we can't use zone->pages_scanned for this purpose. This patch restores all_unreclaimable() to use zone->all_unreclaimable as before, and in addition adds an oom_killer_disabled check to avoid reintroducing the issue of commit d1908362 ("vmscan: check all_unreclaimable in direct reclaim path"). Reported-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@kernel.dk> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: David Rientjes <rientjes@google.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
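The arithmetic behind the stuck state can be worked through in a small userspace sketch. It assumes `DEF_PRIORITY` is 12, as in the kernel's `include/linux/mmzone.h`; `struct zone_sketch`, `scan_target()`, `unreclaimable_by_scan_count()` and `unreclaimable_by_flag()` are hypothetical names, and the factor 6 in the scan-count heuristic is an assumption made for illustration rather than something stated in the message.

```c
#include <assert.h>

#define DEF_PRIORITY 12   /* assumed value of the kernel constant */

/* Hypothetical reduction of struct zone to the fields discussed above. */
struct zone_sketch {
    int all_unreclaimable;
    unsigned long pages_scanned;
    unsigned long lru_pages;
};

/* Pages to scan in one pass at a given priority: lru_pages >> priority. */
unsigned long scan_target(unsigned long lru_pages, int priority)
{
    assert(priority >= 0 && priority <= DEF_PRIORITY);
    return lru_pages >> priority;
}

/* Check derived from the scan counter (the factor 6 is an assumption). */
int unreclaimable_by_scan_count(const struct zone_sketch *z)
{
    return z->pages_scanned >= z->lru_pages * 6;
}

/* Check that reads the flag directly. */
int unreclaimable_by_flag(const struct zone_sketch *z)
{
    return z->all_unreclaimable != 0;
}
```

For a small zone with 100 LRU pages, `scan_target(100, DEF_PRIORITY)` is `100 >> 12 == 0`, so nothing is scanned and `pages_scanned` stays at 0. A check based on the counter then keeps reporting the zone as reclaimable even though `all_unreclaimable` is 1, while the flag-based check reports the state that is actually recorded.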
2011-04-15  mm: check that we have the right vma in __access_remote_vm()  Michael Ellerman  1  -1/+1
In __access_remote_vm() we need to check that we have found the right vma, not the following vma before we try to access it. Otherwise we might call the vma's access routine with an address which does not fall inside the vma. It was discovered on a current kernel but with an unreleased driver, from memory it was strace leading to a kernel bad access, but it obviously depends on what the access implementation does. Looking at other access implementations I only see: $ git grep -A 5 vm_operations|grep access arch/powerpc/platforms/cell/spufs/file.c- .access = spufs_mem_mmap_access, arch/x86/pci/i386.c- .access = generic_access_phys, drivers/char/mem.c- .access = generic_access_phys fs/sysfs/bin.c- .access = bin_access, The spufs one looks like it might behave badly given the wrong vma, it assumes vma->vm_file->private_data is a spu_context, and looks like it would probably blow up pretty quickly if it wasn't. generic_access_phys() only uses the vma to check vm_flags and get the mm, and then walks page tables using the address. So it should bail on the vm_flags check, or at worst let you access some other VM_IO mapping. And bin_access() just proxies to another access implementation. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
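The kind of lookup mistake described above can be sketched with a sorted array of address ranges. This is a userspace illustration under stated assumptions, not the kernel code: `struct region`, `find_region()` and `find_containing_region()` are hypothetical, and `find_region()` is assumed to behave like a lookup that returns the first range ending above the address, which may be a range that starts after it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical half-open address range [start, end). */
struct region {
    unsigned long start;
    unsigned long end;
};

/*
 * Return the first region whose end lies above addr, or NULL.
 * The regions must be sorted and non-overlapping. The result may
 * start above addr, i.e. addr can fall in the gap before it.
 */
const struct region *find_region(const struct region *regions, size_t n,
                                 unsigned long addr)
{
    size_t i;

    assert(regions != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (addr < regions[i].end)
            return &regions[i];
    return NULL;
}

/* Return the region that actually contains addr, or NULL. */
const struct region *find_containing_region(const struct region *regions,
                                            size_t n, unsigned long addr)
{
    const struct region *r = find_region(regions, n, addr);

    /* The extra start check rejects the "following region" case. */
    if (r == NULL || r->start > addr)
        return NULL;
    return r;
}
```

With regions `[0x1000, 0x2000)` and `[0x4000, 0x5000)`, looking up `0x3000` with `find_region()` yields the second region even though the address is in the gap; only the additional start check prevents a caller from handing that address to the wrong region's access routine.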
2011-04-15  brk: COMPAT_BRK: fix detection of randomized brk  Jiri Kosina  3  -2/+9
5520e89 ("brk: fix min_brk lower bound computation for COMPAT_BRK") tried to get the whole logic of brk randomization for legacy (libc5-based) applications finally right. It turns out that the way to detect whether brk has actually been randomized in the end or not introduced by that patch still doesn't work for those binaries, as reported by Geert: : /sbin/init from my old m68k ramdisk exists prematurely. : : Before the patch: : : | brk(0x80005c8e) = 0x80006000 : : After the patch: : : | brk(0x80005c8e) = 0x80005c8e : : Old libc5 considers brk() to have failed if the return value is not : identical to the requested value. I don't like it, but currently see no better option than a bit flag in task_struct to catch the CONFIG_COMPAT_BRK && randomize_va_space == 2 case. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-15  drivers/misc/sgi-gru/grufile.c: fix the wrong members of gru_chip  Wanlong Gao  1  -4/+4
Fix the wrong members and the wrong function's definition, since the irq_chip had changed. Signed-off-by: Wanlong Gao <wanlong.gao@gmail.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>