path: root/fs
Age | Commit message | Author | Files | Lines
2008-12-23 | NFS: add "[no]resvport" mount option | Chuck Lever | 2 | -5/+18
The standard default security setting for NFS is AUTH_SYS. An NFS client connects to NFS servers via a privileged source port and a fixed standard destination port (2049). The client sends raw uid and gid numbers to identify users making NFS requests, and the server assumes an appropriate authority on the client has vetted these values because the source port is privileged. On Linux, by default in-kernel RPC services use a privileged port in the range between 650 and 1023 to avoid using source ports of well-known IP services. Using such a small range limits the number of NFS mount points and the number of unique NFS servers to which a client can connect concurrently. An NFS client can use unprivileged source ports to expand the range of source port numbers, allowing more concurrent server connections and more NFS mount points. Servers must explicitly allow NFS connections from unprivileged ports for this to work. In the past, bumping the value of the sunrpc.max_resvport sysctl on the client would permit the NFS client to use unprivileged ports. Bumping this setting also changes the maximum port number used by other in-kernel RPC services, some of which still required a port number less than 1023. This is exacerbated by the way source port numbers are chosen by the Linux RPC client, which starts at the top of the range and works downwards. It means that bumping the maximum means all RPC services requesting a source port will likely get an unprivileged port instead of a privileged one. Changing this setting affects all NFS mount points on a client. A sysadmin could not selectively choose which mount points would use non-privileged ports and which could not. Lastly, this mechanism of expanding the limit on the number of NFS mount points was entirely undocumented. To address the need for the NFS client to use a large range of source ports without interfering with the activity of other in-kernel RPC services, we introduce a new NFS mount option.
This option explicitly tells only the NFS client to use a non-privileged source port when communicating with the NFS server for one specific mount point. This new mount option is called "resvport," like the similar NFS mount option on FreeBSD and Mac OS X. A sister patch for nfs-utils will be submitted that documents this new option in nfs(5). The default setting for this new mount option requires the NFS client to use a privileged port, as before. Explicitly specifying the "noresvport" mount option allows the NFS client to use an unprivileged source port for this mount point when connecting to the NFS server port. This mount option is supported only for text-based NFS mounts. [ Sidebar: it is widely known that security mechanisms based on the use of privileged source ports are ineffective. However, the NFS client can combine the use of unprivileged ports with the use of secure authentication mechanisms, such as Kerberos. This allows a large number of connections and mount points while ensuring a useful level of security. Eventually we may change the default setting for this option depending on the security flavor used for the mount. For example, if the mount is using only AUTH_SYS, then the default setting will be "resvport;" if the mount is using a strong security flavor such as krb5, the default setting will be "noresvport." ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [Trond.Myklebust@netapp.com: Fixed a bug whereby nfs4_init_client() was being called with incorrect arguments.] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
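The top-down port selection described in this commit message can be modelled in user space. The following is a minimal sketch, not the kernel's code: `pick_source_port`, `is_privileged` and the `in_use` table are hypothetical names, and only the 650 to 1023 range comes from the message itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Range quoted in the commit message; PORT_MAX is the top of the port space. */
enum {
    RESVPORT_MIN = 650,
    RESVPORT_MAX = 1023,
    PORT_MAX     = 65535
};

/* A port is privileged when it is 1023 or lower. */
static bool is_privileged(unsigned port)
{
    return port <= RESVPORT_MAX;
}

/*
 * Hypothetical helper: scan [min, max] from the top downwards and return
 * the first port that is not in use, or 0 if the whole range is taken.
 * in_use[p] is true when port p is already bound; the table must have
 * at least max + 1 entries.
 */
static unsigned pick_source_port(const bool *in_use, unsigned min, unsigned max)
{
    assert(in_use != 0 && min >= 1 && min <= max && max <= PORT_MAX);
    for (unsigned p = max; p >= min; p--) {
        if (!in_use[p])
            return p;
    }
    return 0;
}
```

With the default range the first caller gets port 1023. If the maximum is raised globally, say to 2000, the same top-down scan hands out port 2000 first, so every RPC service receives an unprivileged port. That is the side effect a per-mount option avoids.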
2008-12-23 | NFS: move nfs_server flag initialization | Chuck Lever | 1 | -8/+8
Make it possible for the NFSv4 mount setup logic to pass mount option flags down the stack to nfs_create_rpc_client(). This is immediately useful if we want NFS mount options to modulate settings of the underlying RPC transport, but it may be useful at some later point if other parts of the NFSv4 mount initialization logic want to know what the mount options are. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 | NFS: expand flags passed to nfs_create_rpc_client() | Chuck Lever | 1 | -4/+8
The nfs_create_rpc_client() function sets up an RPC client for an NFS mount point. Add an option that allows it to set up an RPC transport from an unprivileged port. Instead of having nfs_create_rpc_client()'s callers retain local knowledge about how to set up an RPC client, create a couple of flag arguments to control the use of RPC_CLNT_CREATE flags. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 | NFS: introduce nfs_mount_info struct for calling nfs_mount() | Chuck Lever | 4 | -40/+49
Clean up: convert nfs_mount() to take a single data structure argument to make it simpler to add more arguments. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 | NFS: Move declaration of nfs_mount() to fs/nfs/internal.h | Chuck Lever | 2 | -0/+6
Clean up: The nfs_mount() function is not to be used outside of the NFS client. Move its public declaration to fs/nfs/internal.h. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 | NFS: rename nfs_path variable | Chuck Lever | 1 | -5/+5
Clean up: I'm about to move the declaration of nfs_mount into fs/nfs/internal.h and include it in fs/nfs/nfsroot.c. There's a conflicting definition of nfs_path in fs/nfs/internal.h and fs/nfs/nfsroot.c, so rename the private one. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 | lockd: convert reclaimer thread to kthread interface | Jeff Layton | 1 | -6/+15
My understanding is that there is a push to turn the kernel_thread interface into a non-exported symbol and move all kernel threads to use the kthread API. This patch changes lockd to use kthread_run to spawn the reclaimer thread. I've made the assumption here that the extra module references taken when we spawn this thread are unnecessary and removed them. I've also added a KERN_ERR printk that pops if the thread can't be spawned to warn the admin that the locks won't be reclaimed. In the future, it would be nice to be able to notify userspace that locks have been lost (probably by implementing SIGLOST), and adding some good policies about how long we should reattempt to reclaim the locks. Finally, I removed a comment about memory leaks that I believe is obsolete and added a new one to clarify the result of sending a SIGKILL to the reclaimer thread. As best I can tell, doing so doesn't actually cause a memory leak. I consider this patch 2.6.29 material. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 | LOCKD: Make lockd_up() and lockd_down() exported GPL-only | Trond Myklebust | 1 | -3/+3
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 | SUNRPC: nfsacl_encode/nfsacl_decode should be exported as GPL-only | Trond Myklebust | 1 | -2/+2
Again, this has never been intended as a public abi for out-of-tree modules. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 | nfs: remove redundant tests on reading new pages | Wu Fengguang | 1 | -6/+0
aops->readpages() and its NFS helper readpage_async_filler() will only be called to do readahead I/O for newly allocated pages. So it's not necessary to test for the always 0 dirty/uptodate page flags. The removal of nfs_wb_page() call also fixes a readahead bug: the NFS readahead has been synchronous since 2.6.23, because that call will clear PG_readahead, which is the reminder for asynchronous readahead. More background: the PG_readahead page flag is shared with PG_reclaim, one for read path and the other for write path. clear_page_dirty_for_io() unconditionally clears PG_readahead to prevent possible readahead residuals, assuming itself to be always called in the write path. However, NFS is the one and only exception in that it _always_ calls clear_page_dirty_for_io() in the read path, i.e. for readpages()/readpage(). Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Wu Fengguang <wfg@linux.intel.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 | dlm: fs/dlm/ast.c: fix warning | Andrew Morton | 1 | -22/+17
fs/dlm/ast.c: In function 'dlm_astd': fs/dlm/ast.c:64: warning: 'bastmode' may be used uninitialized in this function Cleans code up. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 | dlm: add new debugfs entry | David Teigland | 2 | -50/+247
The new debugfs entry dumps all rsb and lkb structures, and includes a lot more information than has been available before. This includes the new timestamps added by a previous patch for debugging callback issues. Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 | dlm: add time stamp of blocking callback | David Teigland | 2 | -0/+3
Record the time the latest blocking callback was queued for a lock. This will be used for debugging in combination with lock queue timestamp changes in the previous patch. Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 | dlm: change lock time stamping | David Teigland | 4 | -19/+19
Use ktime instead of jiffies for timestamping lkb's. Also stamp the time on every lkb whenever it's added to a resource queue, instead of just stamping locks subject to timeouts. This will allow us to use timestamps more widely for debugging all locks. Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 | dlm: improve how bast mode handling | David Teigland | 5 | -15/+17
The lkb bastmode value is set in the context of processing the lock, and read by the dlm_astd thread. Because it's accessed in these two separate contexts, the writing/reading ought to be done under a lock. This is simple to do by setting it and reading it when the lkb is added to and removed from dlm_astd's callback list which is properly locked. Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 | dlm: remove extra blocking callback check | David Teigland | 1 | -6/+1
Just before delivering a blocking callback (bast), the dlm_astd thread checks again that the granted mode of the lkb actually blocks the mode requested by the bast. The idea behind this was originally that the granted mode may have changed since the bast was queued, making the callback now unnecessary. Reasons for removing this extra check are:
- dlm_astd doesn't lock the rsb before reading the lkb grmode, so it's not technically safe (this removes the long standing FIXME)
- after running some tests, it doesn't appear the check ever actually eliminates a bast
- delivering an unnecessary blocking callback isn't a bad thing and can happen anyway
Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 | dlm: replace schedule with cond_resched | Steven Whitehouse | 1 | -1/+1
This is a one-liner to use cond_resched() rather than schedule() in the ast delivery loop. It should not be necessary to schedule every time, so this will save some cpu time while continuing to allow scheduling when required. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 | dlm: remove kmap/kunmap | Steven Whitehouse | 1 | -7/+0
The pages used in lowcomms are not highmem, so kmap is not necessary. Cc: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 | dlm: trivial annotation of be16 value | Harvey Harrison | 1 | -9/+9
fs/dlm/dir.c:419:14: warning: incorrect type in assignment (different base types) fs/dlm/dir.c:419:14: expected unsigned short [unsigned] [addressable] [assigned] [usertype] be_namelen fs/dlm/dir.c:419:14: got restricted __be16 [usertype] <noident> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 | dlm: fix up memory allocation flags | Steven Whitehouse | 3 | -4/+5
Use ls_allocation for memory allocations, which a cluster fs sets to GFP_NOFS. Use GFP_NOFS for allocations when no lockspace struct is available. Taking dlm locks needs to avoid calling back into the cluster fs because write-out can require taking dlm locks. Cc: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 | UBIFS: avoid unnecessary calculations | Artem Bityutskiy | 1 | -1/+2
Do not calculate min_idx_lebs, because it is available in c->min_idx_lebs Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-23 | UBIFS: re-calculate min_idx_size after the commit | Artem Bityutskiy | 1 | -0/+2
When we commit, but before we try to write anything to the flash media, @c->min_idx_size is inaccurate, because we do not re-calculate it after the commit. Do not forget to do this. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-23 | UBIFS: use nicer 64-bit math | Artem Bityutskiy | 6 | -35/+30
Instead of using do_div(), use better primitives from linux/math64.h. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
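The difference between the two calling conventions can be illustrated in user space. This is a rough sketch only: `sketch_do_div` and `sketch_div_u64` are stand-ins written for this example (the kernel's do_div() is a macro and div_u64() comes from linux/math64.h), and the two `lebs_*` helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for do_div(): divides *n in place and returns the remainder. */
static uint32_t sketch_do_div(uint64_t *n, uint32_t base)
{
    assert(n != 0 && base != 0);
    uint32_t rem = (uint32_t)(*n % base);
    *n /= base;
    return rem;
}

/* Stand-in for div_u64(): simply returns the quotient. */
static uint64_t sketch_div_u64(uint64_t dividend, uint32_t divisor)
{
    assert(divisor != 0);
    return dividend / divisor;
}

/* Old style: needs a temporary because the dividend is overwritten. */
static uint64_t lebs_old_style(uint64_t bytes, uint32_t leb_size)
{
    uint64_t tmp = bytes;
    sketch_do_div(&tmp, leb_size);  /* remainder is discarded */
    return tmp;
}

/* New style: the quotient is the return value, so it reads as an expression. */
static uint64_t lebs_new_style(uint64_t bytes, uint32_t leb_size)
{
    return sketch_div_u64(bytes, leb_size);
}
```

Both helpers compute the same quotient. The second form avoids the scratch variable and the easy mistake of treating the remainder returned by the in-place form as the result.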
2008-12-23 | UBIFS: fix available blocks count | Artem Bityutskiy | 3 | -9/+13
Take into account that 2 eraseblocks are never available because they are reserved for the index. This gives a more realistic count of FS blocks. To avoid future confusions like this, introduce a constant. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-23 | UBIFS: various comment improvements and fixes | Artem Bityutskiy | 2 | -22/+24
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-23 | UBIFS: improve budgeting dump | Artem Bityutskiy | 3 | -3/+16
Dump available space calculated by budgeting subsystem. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-23 | UBIFS: fix tnc dumping | Artem Bityutskiy | 1 | -1/+1
debugfs tnc dumping was broken because of an obvious typo. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-23 | UBIFS: use PAGE_CACHE_MASK correctly | Artem Bityutskiy | 1 | -2/+2
It has high bits set, not low bits set as the UBIFS code assumed. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
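A mask with the high bits set selects the page-aligned part of a position, and its complement selects the offset inside the page. The sketch below shows this with a 4 KiB page size, which is an assumption for illustration; the `SKETCH_*` macros and `split_pos` are hypothetical names, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)
/* High bits set, low SKETCH_PAGE_SHIFT bits clear. */
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/*
 * Split a byte position into the start of its page and the offset
 * within that page.
 */
static void split_pos(uint64_t pos, uint64_t *page_start, uint64_t *offset)
{
    assert(page_start != NULL && offset != NULL);
    *page_start = pos & SKETCH_PAGE_MASK;   /* keep the high bits */
    *offset     = pos & ~SKETCH_PAGE_MASK;  /* keep the low bits */
}
```

Code that assumes the mask has the low bits set would swap these two results, which is the kind of mistake the commit message describes.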
2008-12-23 | [XFS] handle unaligned data in xfs_bmbt_disk_get_all | Christoph Hellwig | 1 | -1/+2
In libxfs xfs_bmbt_disk_get_all needs to handle unaligned data and thus has been updated to use get_unaligned_be64. In kernelspace we don't strictly need it as the routine is only used for tracing and xfsidbg, but let's keep the two implementations in sync. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
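An unaligned big-endian load can be written portably by assembling the value one byte at a time. The function below is a user-space sketch under that assumption; `sketch_get_unaligned_be64` is a hypothetical stand-in and is not the kernel's get_unaligned_be64 implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read a 64-bit big-endian value from a pointer with no alignment
 * guarantee. Byte-wise access never performs a misaligned word load.
 */
static uint64_t sketch_get_unaligned_be64(const void *p)
{
    assert(p != NULL);
    const unsigned char *b = p;
    uint64_t v = 0;
    for (size_t i = 0; i < 8; i++)
        v = (v << 8) | b[i];  /* most significant byte comes first */
    return v;
}
```

Casting the pointer to `uint64_t *` and dereferencing it instead would be undefined behaviour in C when the address is misaligned, and can fault on some architectures.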
2008-12-22 | [XFS] avoid memory allocations in xfs_fs_vcmn_err | Christoph Hellwig | 4 | -26/+40
xfs_fs_vcmn_err can be called under a spinlock, but does a sleeping memory allocation to create a buffer for its internal sprintf. Fortunately it's the only caller of icmn_err, so we can merge the two and have one single static buffer and spinlock protecting it. While we're at it make sure we have proper __attribute__ format annotations so that the compiler can detect mismatched format strings. Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-22 | [XFS] Fix speculative allocation beyond eof | Lachlan McIlroy | 1 | -21/+7
Speculative allocation beyond eof doesn't work properly. It was broken some time ago after a code cleanup that moved what is now xfs_iomap_eof_align_last_fsb() and xfs_iomap_eof_want_preallocate() out of xfs_iomap_write_delay() into separate functions. The code used to use the current file size in various checks but got changed to be max(file_size, i_new_size). Since i_new_size is the result of 'offset + count', the check for '(offset + count) <= isize' in xfs_iomap_eof_want_preallocate() will always be true, i.e. if 'offset + count' is > ip->i_size then isize will be i_new_size and equal to 'offset + count'. This change fixes all the places that used to use the current file size. Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
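The always-true comparison can be reproduced with a small model. This is a simplified sketch of the reasoning in the message, not the XFS code: both function names are hypothetical, and it assumes i_new_size is exactly 'offset + count' as the message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static uint64_t max_u64(uint64_t a, uint64_t b)
{
    return a > b ? a : b;
}

/* Broken variant: compares against max(i_size, i_new_size). */
static bool write_within_size_broken(uint64_t i_size, uint64_t offset, uint64_t count)
{
    assert(count <= UINT64_MAX - offset);      /* no wrap-around in the sum */
    uint64_t i_new_size = offset + count;
    uint64_t isize = max_u64(i_size, i_new_size);
    return offset + count <= isize;            /* isize >= offset + count, so always true */
}

/* Fixed variant: compares against the current file size only. */
static bool write_within_size_fixed(uint64_t i_size, uint64_t offset, uint64_t count)
{
    assert(count <= UINT64_MAX - offset);
    return offset + count <= i_size;
}
```

For a 4096-byte file and a 4096-byte write at offset 4096, the broken variant reports that the write stays within the file, while the fixed variant reports that it extends past the end.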
2008-12-22 | [XFS] Remove XFS_BUF_SHUT() and friends | Lachlan McIlroy | 3 | -20/+1
Code does nothing so remove it. Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-22 | [XFS] Use the incore inode size in xfs_file_readdir() | Lachlan McIlroy | 1 | -1/+1
We should be using the incore inode size here not the linux inode size. The incore inode size is always up to date for directories whereas the linux inode size is not updated for directories. We've hit assertions in xfs_bmap() and traced it back to the linux inode size being zero but the incore size being correct. Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-22 | sched: fix warning in fs/proc/base.c | Ingo Molnar | 1 | -2/+2
Stephen Rothwell reported this new (harmless) build warning on platforms that define u64 to long: fs/proc/base.c: In function 'proc_pid_schedstat': fs/proc/base.c:352: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u64' asm-generic/int-l64.h platforms strike again: that file should be eliminated. Fix it by casting the parameters to long long. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
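The same class of warning appears in user space when a 64-bit typedef is `unsigned long` on one platform and `unsigned long long` on another. The sketch below shows the cast-based fix with standard C only; `format_runtime` is a hypothetical helper, and `uint64_t` stands in for the kernel's u64.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit counter with "%llu". The explicit cast makes the
 * argument type match the format on every platform, whichever
 * underlying type uint64_t happens to be.
 */
static int format_runtime(char *buf, size_t len, uint64_t runtime_ns)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%llu", (unsigned long long)runtime_ns);
}
```

The cast does not change the value, because `unsigned long long` is at least 64 bits wide.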
2008-12-22 | [XFS] Fix merge conflict in fs/xfs/xfs_rename.c | Lachlan McIlroy | 16 | -27/+75
Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: fs/xfs/xfs_rename.c Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-20 | fs/9p: change simple_strtol to simple_strtoul | Julia Lawall | 1 | -1/+1
Since v9ses->uid is unsigned, it would seem better to use simple_strtoul than simple_strtol. A simplified version of the semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@r2@
long e;
position p;
@@
e = simple_strtol@p(...)

@@
position p != r2.p;
type T;
T e;
@@
e =
- simple_strtol@p
+ simple_strtoul
  (...)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
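The reason an unsigned parser suits an unsigned uid can be shown with the C library's strtoul(), used here as a user-space analogue of simple_strtoul. This is only a sketch: `parse_uid` is a hypothetical helper, and the validation it does is an assumption added for the example, not something the kernel function performs.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Parse a decimal uid into an unsigned int. Returns false when the
 * string is empty, has trailing characters, or is out of range.
 */
static bool parse_uid(const char *s, unsigned int *uid)
{
    assert(s != NULL && uid != NULL);
    char *end;
    errno = 0;
    unsigned long v = strtoul(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE || v > UINT_MAX)
        return false;
    *uid = (unsigned int)v;
    return true;
}
```

A signed parse would saturate at LONG_MAX on platforms where long is 32 bits, so large uid values in the upper half of the unsigned range could not be represented.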
2008-12-20 | 9p: convert d_iname references to d_name.name | Wu Fengguang | 3 | -6/+10
d_iname is rubbish for long file names. Use d_name.name in printks instead. Signed-off-by: Wu Fengguang <wfg@linux.intel.com> Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-12-20 | 9p: Remove potentially bad parameter from function entry debug print. | Duane Griffin | 1 | -1/+2
Signed-off-by: Duane Griffin <duaneg@dghda.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-12-20 | security: pass mount flags to security_sb_kern_mount() | James Morris | 1 | -1/+1
Pass mount flags to security_sb_kern_mount(), so security modules can determine if a mount operation is being performed by the kernel. Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2008-12-19 | Merge branches 'tracing/ftrace', 'tracing/ring-buffer' and 'tracing/urgent' into tracing/core | Ingo Molnar | 13 | -20/+63
Conflicts: include/linux/ftrace.h
2008-12-18 | schedstat: consolidate per-task cpu runtime stats | Ken Chen | 1 | -1/+1
Impact: simplify code When we turn on CONFIG_SCHEDSTATS, per-task cpu runtime is accumulated twice. Once in task->se.sum_exec_runtime and once in sched_info.cpu_time. These two stats are exactly the same. Given that task->se.sum_exec_runtime is always accumulated by the core scheduler, sched_info can reuse that data instead of duplicating the accounting. Signed-off-by: Ken Chen <kenchen@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-18 | Merge branch 'linux-2.6' into next | Paul Mackerras | 3 | -4/+11
2008-12-18 | Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 | Linus Torvalds | 2 | -3/+9
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: Add JBD2 compat feature bit. ocfs2: Always update xattr search when creating bucket.
2008-12-18 | cifs: fix buffer overrun in parse_DFS_referrals | Jeff Layton | 1 | -1/+2
While testing a kernel with memory poisoning enabled, I saw some warnings about the redzone getting clobbered when chasing DFS referrals. The buffer allocation for the unicode converted version of the searchName is too small and needs to take null termination into account. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
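The sizing rule behind this kind of fix can be sketched as follows. The helper is hypothetical and assumes every input character converts to exactly one 2-byte UTF-16 code unit; the actual allocation code in cifs is not shown in this log and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Bytes needed to hold the UTF-16 form of name, including the
 * 2-byte terminating null. Forgetting the "+ 1" leaves no room for
 * the terminator and the converter writes past the end of the buffer.
 */
static size_t utf16_buf_size(const char *name)
{
    assert(name != NULL);
    return (strlen(name) + 1) * 2;
}
```

For a three-character name this gives 8 bytes: 6 for the characters and 2 for the terminator.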
2008-12-17 | ocfs2: Add JBD2 compat feature bit. | Joel Becker | 1 | -1/+7
Define the OCFS2_FEATURE_COMPAT_JBD2 bit in the filesystem header. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-12-17 | ocfs2: Always update xattr search when creating bucket. | Tao Ma | 1 | -2/+2
When we create xattr bucket during the process of xattr set, we always need to update the ocfs2_xattr_search since even if the bucket size is the same as block size, the offset will change because of the removal of the ocfs2_xattr_block header. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-12-16 | jfs: ensure symlinks are NUL-terminated | Dave Kleikamp | 1 | -1/+7
This is an alternate fix for a bug reported and fixed by Duane Griffin. Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Reported-by: Duane Griffin <duaneg@dghda.com>
2008-12-16 | proc: enclose desc variable of show_stat() in CONFIG_SPARSE_IRQ | KOSAKI Motohiro | 1 | -5/+2
Impact: restructure code to fix compiler warning. Commit 240d367b4e6c6e3c5075e034db14dba60a6f5fa7 moved the desc usage point into #ifdef CONFIG_SPARSE_IRQ. Eliminate the desc variable, otherwise the following warning happens: fs/proc/stat.c: In function 'show_stat': fs/proc/stat.c:31: warning: unused variable 'desc' [ akpm: cleaned up the patch to remove #ifdef ] Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-16 | powerpc: Remove `have_of' global variable | Anton Vorontsov | 1 | -2/+1
The `have_of' variable is a relic from the arch/ppc time, it isn't useful nowadays. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-16 | Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 | David S. Miller | 14 | -18/+57
Conflicts: drivers/net/e1000e/ich8lan.c