summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2006-01-09[PATCH] Fix and add EXPORT_SYMBOL(filemap_write_and_wait)OGAWA Hirofumi15-48/+21
This patch add EXPORT_SYMBOL(filemap_write_and_wait) and use it. See mm/filemap.c: And changes the filemap_write_and_wait() and filemap_write_and_wait_range(). Current filemap_write_and_wait() doesn't wait if filemap_fdatawrite() returns error. However, even if filemap_fdatawrite() returned an error, it may have submitted the partially data pages to the device. (e.g. in the case of -ENOSPC) <quotation> Andrew Morton writes, If filemap_fdatawrite() returns an error, this might be due to some I/O problem: dead disk, unplugged cable, etc. Given the generally crappy quality of the kernel's handling of such exceptions, there's a good chance that the filemap_fdatawait() will get stuck in D state forever. </quotation> So, this patch doesn't wait if filemap_fdatawrite() returns the -EIO. Trond, could you please review the nfs part? Especially I'm not sure, nfs must use the "filemap_fdatawrite(inode->i_mapping) == 0", or not. Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-09[PATCH] fat: support a truncate() for expanding size (generic_cont_expand)OGAWA Hirofumi2-17/+74
This patch changes generic_cont_expand(), in order to share the code with fatfs. - Use vmtruncate() if ->prepare_write() returns a error. Even if ->prepare_write() returns an error, it may already have added some blocks. So, this truncates blocks outside of ->i_size by vmtruncate(). - Add generic_cont_expand_simple(). The generic_cont_expand_simple() assumes that ->prepare_write() can handle the block boundary. With this, we don't need to care the extra byte. And for expanding a file size by truncate(), fatfs uses the added generic_cont_expand_simple(). Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-09[PATCH] fat: support ->direct_IO()OGAWA Hirofumi3-15/+87
This patch add to support of ->direct_IO() for mostly read. The user of this seems to want to use for streaming read. So, current direct I/O has limitation, it can only overwrite. (For write operation, mainly we need to handle the hole etc..) Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-09[PATCH] fat: s/EXPORT_SYMBOL/EXPORT_SYMBOL_GPL/OGAWA Hirofumi5-17/+17
All EXPORT_SYMBOL of fatfs is only for vfat/msdos. _GPL would be proper. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-09[PATCH] fat: add the read/writepages()OGAWA Hirofumi1-1/+16
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-09[PATCH] fat: use sb_find_get_block() instead of sb_getblk()OGAWA Hirofumi1-2/+2
We don't need to allocate buffer for checking the buffer is uptodate. This use sb_find_get_block() instead, and if it returns NULL it's not uptodate. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-09[PATCH] fat: move fat_clusters_flush() to write_super()OGAWA Hirofumi3-6/+14
It is overkill to update the FS_INFO whenever modifying prev_free/free_clusters, because those are just a hint. So, this patch uses ->write_super() for updating FS_INFO instead. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-09[PATCH] slob: introduce the SLOB allocatorMatt Mackall1-0/+4
configurable replacement for slab allocator This adds a CONFIG_SLAB option under CONFIG_EMBEDDED. When CONFIG_SLAB is disabled, the kernel falls back to using the 'SLOB' allocator. SLOB is a traditional K&R/UNIX allocator with a SLAB emulation layer, similar to the original Linux kmalloc allocator that SLAB replaced. It's signicantly smaller code and is more memory efficient. But like all similar allocators, it scales poorly and suffers from fragmentation more than SLAB, so it's only appropriate for small systems. It's been tested extensively in the Linux-tiny tree. I've also stress-tested it with make -j 8 compiles on a 3G SMP+PREEMPT box (not recommended). Here's a comparison for otherwise identical builds, showing SLOB saving nearly half a megabyte of RAM: $ size vmlinux* text data bss dec hex filename 3336372 529360 190812 4056544 3de5e0 vmlinux-slab 3323208 527948 190684 4041840 3dac70 vmlinux-slob $ size mm/{slab,slob}.o text data bss dec hex filename 13221 752 48 14021 36c5 mm/slab.o 1896 52 8 1956 7a4 mm/slob.o /proc/meminfo: SLAB SLOB delta MemTotal: 27964 kB 27980 kB +16 kB MemFree: 24596 kB 25092 kB +496 kB Buffers: 36 kB 36 kB 0 kB Cached: 1188 kB 1188 kB 0 kB SwapCached: 0 kB 0 kB 0 kB Active: 608 kB 600 kB -8 kB Inactive: 808 kB 812 kB +4 kB HighTotal: 0 kB 0 kB 0 kB HighFree: 0 kB 0 kB 0 kB LowTotal: 27964 kB 27980 kB +16 kB LowFree: 24596 kB 25092 kB +496 kB SwapTotal: 0 kB 0 kB 0 kB SwapFree: 0 kB 0 kB 0 kB Dirty: 4 kB 12 kB +8 kB Writeback: 0 kB 0 kB 0 kB Mapped: 560 kB 556 kB -4 kB Slab: 1756 kB 0 kB -1756 kB CommitLimit: 13980 kB 13988 kB +8 kB Committed_AS: 4208 kB 4208 kB 0 kB PageTables: 28 kB 28 kB 0 kB VmallocTotal: 1007312 kB 1007312 kB 0 kB VmallocUsed: 48 kB 48 kB 0 kB VmallocChunk: 1007264 kB 1007264 kB 0 kB (this work has been sponsored in part by CELF) From: Ingo Molnar <mingo@elte.hu> Fix 32-bitness bugs in mm/slob.c. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-09[PATCH] RCU signal handlingIngo Molnar1-2/+2
RCU tasklist_lock and RCU signal handling: send signals RCU-read-locked instead of tasklist_lock read-locked. This is a scalability improvement on SMP and a preemption-latency improvement under PREEMPT_RCU. Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: William Irwin <wli@holomorphy.com> Cc: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-09[PATCH] frv: suppress configuration of certain features for FRVDavid Howells1-1/+1
Suppress configuration of certain features for the FRV arch as they can't be built for FRV at the moment: (*) RTC (*) HISAX_* (*) PARPORT_PC (*) VGA_CONSOLE (*) BINFMT_ELF Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-09[PATCH] Fold numa_maps into mempolicies.cChristoph Lameter1-122/+5
First discussed at http://marc.theaimsgroup.com/?t=113149255100001&r=1&w=2 - Use the check_range() in mempolicy.c to gather statistics. - Improve the numa_maps code in general and fix some comments. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-09[PATCH] drop-pagecacheAndrew Morton2-1/+69
Add /proc/sys/vm/drop_caches. When written to, this will cause the kernel to discard as much pagecache and/or reclaimable slab objects as it can. THis operation requires root permissions. It won't drop dirty data, so the user should run `sync' first. Caveats: a) Holds inode_lock for exorbitant amounts of time. b) Needs to be taught about NUMA nodes: propagate these all the way through so the discarding can be controlled on a per-node basis. This is a debugging feature: useful for getting consistent results between filesystem benchmarks. We could possibly put it under a config option, but it's less than 300 bytes. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-07Merge git://git.linux-nfs.org/pub/linux/nfs-2.6Linus Torvalds35-1016/+1678
2006-01-07[PATCH] fs/ufs: debug mode compilation failureEvgeniy1-1/+1
This patch should fix compilation failure of fs/ufs/dir.c with defined UFS_DIR_DEBUG Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06NFSv4: Fix an Oops in nfs_do_expire_all_delegationsTrond Myklebust1-4/+2
If the loop errors, we need to exit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv4: Allow entries in the idmap cache to expireTrond Myklebust3-0/+33
If someone changes the uid/gid mapping in userland, then we do eventually want those changes to be propagated to the kernel. Currently the kernel assumes that it may cache entries forever. Add an expiration time + garbage collector for idmap entries. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFS: get rid of some needless code obfuscation in xdr_encode_sattr().Trond Myklebust1-11/+10
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFS: Send valid mode bits to the serverTrond Myklebust3-2/+5
inode->i_mode contains a lot more than just the mode bits. Make sure that we mask away this extra stuff in SETATTR calls to the server. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06SUNRPC: get rid of cl_chattyChuck Lever5-7/+1
Clean up: Every ULP that uses the in-kernel RPC client, except the NLM client, sets cl_chatty. There's no reason why NLM shouldn't set it, so just get rid of cl_chatty and always be verbose. Test-plan: Compile with CONFIG_NFS enabled. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06SUNRPC: new interface to force an RPC rebindChuck Lever1-2/+2
We'd like to hide fields in rpc_xprt and rpc_clnt from upper layer protocols. Start by creating an API to force RPC rebind, replacing logic that simply sets cl_port to zero. Test-plan: Destructive testing (unplugging the network temporarily). Connectathon with UDP and TCP. NFSv2/3 and NFSv4 mounting should be carefully checked. Probably need to rig a server where certain services aren't running, or that returns an error for some typical operation. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv3: try get_root user-supplied security_flavorJ. Bruce Fields1-7/+19
Thanks to Ed Keizer for bug and root cause. He says: "... we could only mount the top-level Solaris share. We could not mount deeper into the tree. Investigation showed that Solaris allows UNIX authenticated FSINFO only on the top level of the share. This is a problem because we share/export our home directories one level higher than we mount them. I.e. we share the partition and not the individual home directories. This prevented access to home directories." We still may need to try auth_sys for the case where the client doesn't have appropriate credentials. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NLM: fix parsing of sm notify procedureJ. Bruce Fields1-1/+3
The procedure that decodes statd sm_notify call seems to be skipping a few arguments. How did this ever work? >From folks at Polyserve. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NLM: Further cancel fixesJ. Bruce Fields2-6/+16
If the server receives an NLM cancel call and finds no waiting lock to cancel, then chances are the lock has already been applied, and the client just hadn't yet processed the NLM granted callback before it sent the cancel. The Open Group text, for example, perimts a server to return either success (LCK_GRANTED) or failure (LCK_DENIED) in this case. But returning an error seems more helpful; the client may be able to use it to recognize that a race has occurred and to recover from the race. So, modify the relevant functions to return an error in this case. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NLM: clean up nlmsvc_delete_blockJ. Bruce Fields1-2/+1
The fl_next check here is superfluous (and possibly a layering violation). Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NLM: don't unlock on cancel requestsJ. Bruce Fields2-16/+2
Currently when lockd gets an NLM_CANCEL request, it also does an unlock for the same range. This is incorrect. The Open Group documentation says that "This procedure cancels an *outstanding* blocked lock request." (Emphasis mine.) Also, consider a client that holds a lock on the first byte of a file, and requests a lock on the entire file. If the client cancels that request (perhaps because the requesting process is signalled), the server shouldn't apply perform an unlock on the entire file, since that will also remove the previous lock that the client was already granted. Or consider a lock request that actually *downgraded* an exclusive lock to a shared lock. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NLM: Clean up nlmsvc_grant_reply lockingJ. Bruce Fields1-4/+3
Slightly simpler logic here makes it more trivial to verify that the up's and down's are balanced here. Break out an assignment from a conditional while we're at it. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv4: Allow user to set the port used by the NFSv4 callback channelTrond Myklebust5-3/+113
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFS: Clean up weak cache consistency codeTrond Myklebust1-20/+40
...and ensure that nfs_update_inode() respects wcc Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv4: Ensure DELEGRETURN returns attributesTrond Myklebust3-17/+35
Upon return of a write delegation, the server will almost always bump the change attribute. Ensure that we pick up that change so that we don't invalidate our data cache unnecessarily. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv4: Ensure change attribute returned by GETATTR callback conforms to specTrond Myklebust3-1/+5
According to RFC3530 we're supposed to cache the change attribute at the time the client receives a write delegation. If the inode is clean, a CB_GETATTR callback by the server to the client is supposed to return the cached change attribute. If, OTOH, the inode is dirty, the client should bump the cached change attribute by 1. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFS: Make directIO aware of compound pages...Trond Myklebust1-3/+4
...and avoid calling set_page_dirty on them Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFS: Make stat() return updated mtimes after a write()Trond Myklebust2-11/+14
The SuS states that a call to write() will cause mtime to be updated on the file. In order to satisfy that requirement, we need to flush out any cached writes in nfs_getattr(). Speed things up slightly by not committing the writes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv4: Ensure that we return the delegation on the target of a rename too.Trond Myklebust1-1/+3
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFS: support large reads and writes on the wireChuck Lever5-29/+40
Most NFS server implementations allow up to 64KB reads and writes on the wire. The Solaris NFS server allows up to a megabyte, for instance. Now the Linux NFS client supports transfer sizes up to 1MB, too. This will help reduce protocol and context switch overhead on read/write intensive NFS workloads, and support larger atomic read and write operations on servers that support them. Test-plan: Connectathon and iozone on mount point with wsize=rsize>32768 over TCP. Tests with NFS over UDP to verify the maximum RPC payload size cap. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFS: make "inode number mismatch" message more usefulChuck Lever1-8/+9
To help NFS users and server developers, make the "inode number mismatch" message display more useful information. Test-plan: None. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFS: get rid of useless kernel log messageChuck Lever1-2/+1
nfs_statfs() generates a log message when GETATTR returns an error. This is usually a useless message. Make it a dprintk. Test plan: None Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFS: Fix error recovery code in fs/nfs/inode.c:__init_nfs()Chuck Lever1-2/+2
Red Hat found a problem in the error recovery logic in __init_nfs. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFS: use generic_write_checks() to sanity check direct writesChuck Lever1-24/+19
Replace ad hoc write parameter sanity checking in nfs_file_direct_write() with a call to generic_write_checks(). This should make the proper checks modulo the O_LARGEFILE flag, and should catch NFSv2-specific limitations by virtue of i_sb->s_maxbytes. Test plan: Posix compliance testing with both NFSv2 and NFSv3. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv4: Remove requirement for machine creds for the "setclientid" operationTrond Myklebust4-49/+52
Use a cred from the nfs4_client->cl_state_owners list. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv4: Remove requirement for machine creds for the "renew" operationTrond Myklebust4-25/+41
In RFC3530, the RENEW operation is allowed to use either the same principal, RPC security flavour and (if RPCSEC_GSS), the same mechanism and service that was used for SETCLIENTID_CONFIRM OR Any principal, RPC security flavour and service combination that currently has an OPEN file on the server. Choose the latter since that doesn't require us to keep credentials for the same principal for the entire duration of the mount. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv4: Send RENEW requests to the server only when we're holding stateTrond Myklebust5-2/+75
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFS: Convert instances of kernel_thread() to kthread()Trond Myklebust1-30/+16
Convert private implementations in NFSv4 state recovery and delegation code to use kthreads. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv4: State recovery cleanupTrond Myklebust3-25/+28
Use wait_on_bit() when waiting for state recovery to complete. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv4: OPEN/LOCK/LOCKU/CLOSE will automatically renew the NFSv4 leaseTrond Myklebust1-3/+31
Cut down on the number of unnecessary RENEW requests on the wire. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06SUNRPC: Ensure that SIGKILL will always terminate a synchronous RPC call.Trond Myklebust1-2/+2
...and make sure that the "intr" flag also enables SIGHUP and SIGTERM to interrupt RPC calls too (as per the Solaris implementation). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv4: Make DELEGRETURN an interruptible operation.Trond Myklebust1-8/+60
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv4: Convert LOCK rpc call into an asynchronous RPC callTrond Myklebust2-75/+175
In order to allow users to interrupt/cancel it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv4: locking XDR cleanupTrond Myklebust2-168/+155
Get rid of some unnecessary intermediate structures Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv4: Make open recovery track O_RDWR, O_RDONLY and O_WRONLY correctlyTrond Myklebust2-131/+156
When recovering from a delegation recall or a network partition, we need to replay open(O_RDWR), open(O_RDONLY) and open(O_WRONLY) separately. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06NFSv4: Make nfs4_state track O_RDWR, O_RDONLY and O_WRONLY separatelyTrond Myklebust3-28/+42
A closer reading of RFC3530 reveals that OPEN_DOWNGRADE must always specify a access modes that have been the argument of a previous OPEN operation. IOW: doing OPEN(O_RDWR) and then OPEN_DOWNGRADE(O_WRONLY) is forbidden unless the user called OPEN(O_WRONLY) In order to fix that, we really need to track the three possible open states separately. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>