path: root/block
Age | Commit message | Author | Files | Lines
2006-03-28 | Merge branch 'cfq-merge' of git://brick.kernel.dk/data/git/linux-2.6-block | Linus Torvalds | 2 | -171/+207
* 'cfq-merge' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [BLOCK] cfq-iosched: seek and async performance fixes
  [PATCH] ll_rw_blk: fix 80-col offender in put_io_context()
  [PATCH] cfq-iosched: small cfq_choose_req() optimization
  [PATCH] [BLOCK] cfq-iosched: change cfq io context linking from list to tree
2006-03-28 | [PATCH] for_each_possible_cpu: fixes for generic part | KAMEZAWA Hiroyuki | 1 | -1/+1
Replaces for_each_cpu with for_each_possible_cpu().
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 | [BLOCK] cfq-iosched: seek and async performance fixes | Jens Axboe | 1 | -37/+65
Detect whether a given process is seeky and, if so, (mostly) disable the idle window. We still allow a little idle time, just enough for that process to submit a new request. That is needed to maintain fairness across priority groups.
In some cases, we could set up several async queues. This is not optimal from a performance point of view, since we want all async io in one queue so that it can be sorted well. It also impacted sync queues, as async io got too much slice time.
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-03-28 | [PATCH] ll_rw_blk: fix 80-col offender in put_io_context() | Jens Axboe | 1 | -1/+3
This makes akpm more happy.
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-03-28 | [PATCH] cfq-iosched: small cfq_choose_req() optimization | Andreas Mohr | 1 | -18/+32
This is a small optimization to cfq_choose_req() in the CFQ I/O scheduler (the function shows up fairly often in an oprofile log): by using a bit mask variable, we can use a simple switch() to check the various cases instead of having to query two variables for each check.
Benefit: 251 vs. 285 bytes footprint of cfq_choose_req(). Also, common case 0 (no request wrapping) is now checked first in code.
Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Jens Axboe <axboe@suse.de>
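The bit-mask-plus-switch pattern described in this commit can be sketched as follows. This is an illustration only, not the kernel's cfq_choose_req(): the names choose_req, R1_WRAP and R2_WRAP are hypothetical, and the selection policy (nearest request ahead of the head wins, otherwise nearest behind) is a simplifying assumption.

```c
#include <assert.h>

/* Hypothetical flag bits: set when a request lies behind the head position. */
#define R1_WRAP 0x01u
#define R2_WRAP 0x02u

/*
 * Choose between two candidate sector positions relative to the head.
 * Returns 1 to pick the first candidate, 2 to pick the second.
 * Both "wrap" conditions are folded into one mask, so a single switch()
 * replaces repeated tests of two separate variables.
 */
static int choose_req(unsigned long head, unsigned long s1, unsigned long s2)
{
	unsigned int wrap = 0;

	if (s1 < head)
		wrap |= R1_WRAP;
	if (s2 < head)
		wrap |= R2_WRAP;

	switch (wrap) {
	case 0:
		/* Common case first: neither wraps, the nearer one wins. */
		return (s1 - head <= s2 - head) ? 1 : 2;
	case R2_WRAP:
		/* Only the second candidate is behind the head. */
		return 1;
	case R1_WRAP:
		/* Only the first candidate is behind the head. */
		return 2;
	case R1_WRAP | R2_WRAP:
		/* Both are behind: pick the one closest to the head. */
		return (head - s1 <= head - s2) ? 1 : 2;
	}

	/* Not reachable: wrap only ever holds the two bits above. */
	assert(0 && "wrap can only hold two bits");
	return 1;
}
```

With the head at sector 100, candidates at 110 and 150 take the case 0 path, while candidates at 90 and 150 take the R1_WRAP path.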
2006-03-28 | [PATCH] [BLOCK] cfq-iosched: change cfq io context linking from list to tree | Jens Axboe | 2 | -116/+108
On setups with many disks, we spend a considerable amount of time looking up the process-disk mapping on each queue of io. Testing with a NULL based block driver, this lookup costs a 40-50% reduction in throughput for 1000 disks.
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-03-27 | Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block | Linus Torvalds | 3 | -10/+11
* 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] Don't make debugfs depend on DEBUG_KERNEL
  [PATCH] Fix blktrace compile with sysfs not defined
  [PATCH] unused label in drivers/block/cciss.
  [BLOCK] increase size of disk stat counters
  [PATCH] blk_execute_rq_nowait-speedup
  [PATCH] ide-cd: quiet down GPCMD_READ_CDVD_CAPACITY failure
  [BLOCK] ll_rw_blk: kmalloc -> kzalloc conversion
  [PATCH] kzalloc() conversion in drivers/block
  [PATCH] update max_sectors documentation
2006-03-27 | [PATCH] md: Make sure QUEUE_FLAG_CLUSTER is set properly for md. | NeilBrown | 1 | -0/+2
This flag should be set for a virtual device iff it is set for all underlying devices.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
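The "iff it is set for all underlying devices" rule amounts to a logical AND over the member devices. A minimal sketch, assuming each member's flag has already been extracted into a bool array; virtual_cluster_flag is a hypothetical helper, not an md or block-layer function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true only when every underlying device has the flag set.
 * member_cluster[i] holds the flag state of the i-th underlying device.
 */
static bool virtual_cluster_flag(const bool *member_cluster, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!member_cluster[i])
			return false;	/* one member without the flag clears it */
	}
	return true;
}
```

A single member without the flag is enough to clear it on the virtual device.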
2006-03-27 | [PATCH] Fix blktrace compile with sysfs not defined | Jens Axboe | 1 | -0/+1
debugfs depends on sysfs, so make blktrace kconfig option depend on that. Reported by Adrian Bunk.
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-03-27 | [BLOCK] increase size of disk stat counters | Ben Woodard | 1 | -3/+3
The kernel's representation of the disk statistics uses the type unsigned, which is 32b on both 32b and 64b platforms. Unfortunately, most system tools that work with these numbers as exported in /proc/diskstats, including iostat, read them into unsigned longs. This works fine on 32b platforms, and on 64b platforms as long as the number of IO transactions is small. However, when the numbers wrap on 64b platforms and you read them into unsigned longs and compare them to previous readings, you get an unsigned representation of a negative number. This looks like a very large 64b number and gives you bizarre readouts in iostat:
ilc4: Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
ilc4: sda 5.50 0.00 143.96 0.00 307496983987862656.00 0.00 153748491993931328.00 0.00 2136028725038430.00 7.94 55.12 5.59 80.42
Fixing iostat in user space is possible, but a quick survey indicates that several other similar tools also use unsigned longs when processing /proc/diskstats. Therefore, it seems like a better approach would be to extend the length of the disk_stats structure on 64b architectures to 64b. The following patch does that. It should not affect the operation on 32b platforms.
Signed-off-by: Ben Woodard <woodard@redhat.com>
Cc: Rick Lindsley <ricklind@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jens Axboe <axboe@suse.de>
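The wraparound effect described above can be reproduced in user space. The sketch below is an illustration, not code from the patch or from iostat: delta_as_u64 and delta_as_u32 are hypothetical helpers, and uint32_t/uint64_t stand in for the kernel's 32-bit counter and the tool's 64-bit unsigned long.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Delta as a tool computes it after parsing both samples into 64-bit
 * variables: the widening happens before the subtraction, so a wrapped
 * 32-bit counter produces a huge "negative" result.
 */
static uint64_t delta_as_u64(uint32_t prev, uint32_t cur)
{
	uint64_t p = prev;
	uint64_t c = cur;

	return c - p;
}

/*
 * Delta computed in the counter's own 32-bit width: modular arithmetic
 * makes a single wraparound cancel out.
 */
static uint32_t delta_as_u32(uint32_t prev, uint32_t cur)
{
	return (uint32_t)(cur - prev);
}
```

For a counter that went from 4294967000 to 200 across a wrap, the 32-bit delta is 496, while the 64-bit delta is above 2^63, which is the kind of value seen in the rsec/s column of the iostat output.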
2006-03-27 | [PATCH] blk_execute_rq_nowait-speedup | Andrew Morton | 1 | -3/+5
Both elv_add_request() and generic_unplug_device() grab the queue lock and disable interrupts. Do that locally and use the __ variants.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-03-27 | [BLOCK] ll_rw_blk: kmalloc -> kzalloc conversion | Jens Axboe | 1 | -4/+2
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-03-26 | [PATCH] 2TB files: add blkcnt_t | Takashi Sato | 1 | -0/+9
Add blkcnt_t as the type of inode.i_blocks. This enables you to make the size of blkcnt_t either 4 bytes or 8 bytes on 32-bit architectures with CONFIG_LSF.
- CONFIG_LSF
  Add new configuration parameter.
- blkcnt_t
  On h8300, i386, mips, powerpc, s390 and sh that define sector_t, blkcnt_t is defined as u64 if CONFIG_LSF is enabled; otherwise it is defined as unsigned long. On other architectures, it is defined as unsigned long.
- inode.i_blocks
  Change the type from sector_t to blkcnt_t.
Signed-off-by: Takashi Sato <sho@tnes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 | [PATCH] mempool: use mempool_create_slab_pool() | Matthew Dobson | 1 | -1/+1
Modify well over a dozen mempool users to call mempool_create_slab_pool() rather than calling mempool_create() with extra arguments, saving about 30 lines of code and increasing readability.
Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 | BUG_ON() Conversion in block/elevator.c | Eric Sesterhenn | 1 | -2/+1
This changes if() BUG(); constructs to BUG_ON(), which is cleaner, contains unlikely() and can be better optimized away.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
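The shape of this conversion can be shown with user-space stand-ins. The macros below are simplified assumptions, not the kernel's definitions: BUG() is modelled with abort(), unlikely() uses the GCC/Clang __builtin_expect builtin, and the check_old/check_new functions are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* User-space stand-ins for the kernel macros (assumed, simplified). */
#define unlikely(x)	__builtin_expect(!!(x), 0)
#define BUG() \
	do { \
		fprintf(stderr, "BUG at %s:%d\n", __FILE__, __LINE__); \
		abort(); \
	} while (0)
#define BUG_ON(cond) \
	do { \
		if (unlikely(cond)) \
			BUG(); \
	} while (0)

/* Before: open-coded check, no branch hint. */
static int check_old(int depth)
{
	if (depth < 0)
		BUG();
	return depth;
}

/* After: one line, and the unlikely() hint comes with the macro. */
static int check_new(int depth)
{
	BUG_ON(depth < 0);
	return depth;
}
```

Both functions behave identically for valid input; the second form is shorter and carries the branch hint.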
2006-03-23 | [PATCH] Block queue IO tracing support (blktrace) as of 2006-03-23 | Jens Axboe | 6 | -2/+604
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-03-23 | [PATCH] sem2mutex: blockdev #2 | Arjan van de Ven | 1 | -11/+11
Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well.
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-21 | [PATCH] kobj_map semaphore to mutex conversion | Jes Sorensen | 1 | -15/+16
Convert the kobj_map code to use a mutex instead of a semaphore. It converts its only two users as well, genhd.c and char_dev.c.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-19 | [PATCH] fix rmmod problems with elevator attributes, clean them up | Al Viro | 4 | -168/+67
2006-03-19 | [PATCH] elevator_t lifetime rules and sysfs fixes | Al Viro | 4 | -201/+152
2006-03-19 | [PATCH] noise removal: cfq-iosched.c | Al Viro | 1 | -14/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-19 | [PATCH] don't bother with refcounting for cfq_data | Al Viro | 1 | -21/+7
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-19 | [PATCH] fix sysfs interaction and lifetime rules handling for queues | Al Viro | 1 | -25/+58
2006-03-19 | [PATCH] fix cfq_get_queue()/ioprio_set(2) races | Al Viro | 1 | -4/+12
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-19 | [PATCH] deal with rmmod/put_io_context() races | Al Viro | 3 | -1/+32
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-19 | [PATCH] stop elv_unregister() from rogering other iosched's data, fix locking | Al Viro | 3 | -15/+24
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-19 | [PATCH] stop cfq from pinning queue down | Al Viro | 1 | -4/+0
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-19 | [PATCH] make cfq_exit_queue() prune the cfq_io_context for that queue | Al Viro | 1 | -1/+58
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-19 | [PATCH] fix the exclusion for ioprio_set() | Al Viro | 1 | -1/+13
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-19 | [PATCH] keep sync and async cfq_queue separate | Al Viro | 1 | -10/+26
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-19 | [PATCH] switch to use of ->key to get cfq_data by cfq_io_context | Al Viro | 1 | -11/+15
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-19 | [PATCH] stop leaking cfq_data in cfq_set_request() | Al Viro | 1 | -2/+0
We don't need to pin ->key down; ->cfqq->cfqd will do that for us. Incidentally, that stops the leak we had - that reference was never dropped.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-19 | [PATCH] fix cfq hash lookups | Al Viro | 1 | -1/+1
If somebody does a hash lookup for cfq_queue while ioprio of an async queue is elevated, they shouldn't end up stuck with lowered ioprio when we go back. Fix is to use ->org_ioprio{,class} in hash lookups.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-19 | [PATCH] fix locking in queue_requests_store() | Al Viro | 1 | -3/+7
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-19 | [PATCH] fix double-free in blk_init_queue_node() | Al Viro | 1 | -5/+5
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-09 | [PATCH] block: disable block layer bouncing for most memory on 64bit systems | Andi Kleen | 1 | -14/+19
The low level PCI DMA mapping functions should handle it in most cases. This should fix problems with depleting the DMA zone early. The old code used precious GFP_DMA memory in many cases where it was not needed.
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-28 | [PATCH] cfq-iosched: slice expiry fixups | Jens Axboe | 1 | -91/+60
During testing of SLES10, we encountered a hang in the CFQ io scheduler. Turns out the deferred slice expiry logic is buggy, so remove that for now. We could be left with an idle queue that would never wake up. So kill that logic, always expire immediately. Also fix a potential timer race condition.
Patch looks bigger than it is, because it moves a function.
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-08 | Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block | Linus Torvalds | 1 | -0/+3
2006-02-08 | [PATCH] block: implement elv_insert and use it (fix ordcolor flipping bug) | Tejun Heo | 2 | -34/+40
q->ordcolor must only be flipped on initial queueing of a hardbarrier request.
Constructing ordered sequence and requeueing used to pass through __elv_add_request() which flips q->ordcolor when it sees a barrier request.
This patch separates out elv_insert() from __elv_add_request() and uses elv_insert() when constructing ordered sequence and requeueing. elv_insert() inserts the given request at the specified position and does nothing else.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-08 | [PATCH] blk: Fix SG_IO ioctl failure retry looping | Jens Axboe | 1 | -0/+3
When issuing an SG_IO ioctl through sd that resulted in an unrecoverable error, a nearly infinite retry loop was discovered. This is due to the fact that the block layer SG_IO code is not setting up rq->retries. This patch also fixes up the sg_scsi_ioctl path.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-02-05 | [PATCH] block: request_queue->ordcolor must not be flipped on SOFTBARRIER | Tejun Heo | 1 | -1/+2
q->ordcolor must not be flipped on SOFTBARRIER.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 | [PATCH] fix ordering on requeued request drainage | Jens Axboe | 1 | -22/+16
Previously, if a fs request which was being drained failed and got requeued, blk_do_ordered() didn't allow it to be reissued, which caused the queue to stall. This patch makes blk_do_ordered() use the sequence of each request to determine whether a request can be issued or not. This fixes the bug and simplifies the code.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 | [PATCH] percpu data: only iterate over possible CPUs | Eric Dumazet | 1 | -1/+1
percpu_data blindly allocates bootmem memory to store NR_CPUS instances of cpudata, instead of allocating memory only for possible cpus.
As a preparation for changing that, we need to convert various 0 -> NR_CPUS loops to use for_each_cpu().
(The above only applies to users of asm-generic/percpu.h. powerpc has gone it alone and is presently only allocating memory for present CPUs, so it's currently corrupting memory).
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Jens Axboe <axboe@suse.de>
Cc: Anton Blanchard <anton@samba.org>
Acked-by: William Irwin <wli@holomorphy.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 | [PATCH] device-mapper disk statistics: timing | Jun'ichi "Nick" Nomura | 1 | -0/+2
Record I/O timing statistics.
The start time is added to struct dm_io, an existing structure allocated privately within dm and attached to each incoming bio.
We export disk_round_stats() from block/ll_rw_blk.c instead of creating a private clone.
Signed-off-by: Jun'ichi "Nick" Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-31 | [BLOCK] A few kerneldoc fixups | Jens Axboe | 1 | -0/+2
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-24 | [BLOCK] ll_rw_blk: fix setting of ->ordered on init | Tetsuo Takata | 1 | -0/+1
This makes XFS barrier mounts succeed on my SCSI system.
Signed-off-by: Tetsuo Takata <takatatt@intellilink.co.jp>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-24 | [BLOCK] elevator: allow default scheduler to potentially be modular | Nate Diller | 1 | -4/+6
Jens has decided that allowing the default scheduler to be a module is a bug, and should not be allowed under kconfig. However, I find that scenario useful for debugging, and wish for the kernel to be able to handle this situation without OOPSing, if I enable such an option in the .config directly.
This patch dynamically checks for the presence of the compiled-in default, and falls back to no-op, emitting a suitable error message, when the default is not available.
Tested for a range of boot options on 2.6.16-rc1-mm2.
Signed-off-by: Nate Diller <nate.diller@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-24 | [BLOCK] elevator: default choice selection | Nate Diller | 1 | -31/+14
My previous default iosched patch did a poor job dealing with the 'elevator=' boot-time option. The old behavior falls back to the compiled-in default if the requested one is not registered at boot time. This patch dynamically evaluates which default to use, and emits a suitable error message when the requested scheduler is not available. It also does the 'as' -> 'anticipatory' conversion before elevator registration, which along with a modified registration function, allows it to correctly indicate which default scheduler is in use.
Tested for a range of boot options on 2.6.16-rc1-mm2.
Signed-off-by: Nate Diller <nate.diller@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-24 | [BLOCK] ll_rw_blk: use preempt-disabling disk_stat_add() in completion | Jens Axboe | 1 | -1/+1
It can legally be called with interrupts/preemption enabled.
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-24 | [BLOCK] ll_rw_blk: make max_sectors and max_hw_sectors unsigned ints | Jens Axboe | 1 | -1/+1
IDE lba48 can support full 64k request size, which overflows the max_hw_sectors variable.
Signed-off-by: Jens Axboe <axboe@suse.de>
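The overflow mentioned here can be illustrated in isolation. This sketch assumes the previous field was 16 bits wide, which the commit title implies but the message does not state; uint16_t stands in for it, and store_narrow/store_wide are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Storing a sector count in a 16-bit field: values above 65535 wrap. */
static uint16_t store_narrow(unsigned int sectors)
{
	return (uint16_t)sectors;	/* conversion is modulo 65536 */
}

/* Storing the same count in an unsigned int keeps the full value. */
static unsigned int store_wide(unsigned int sectors)
{
	return sectors;
}
```

A 64k-sector limit (65536) becomes 0 in the narrow field and is preserved in the wide one; 65535 fits in both.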