path: root/fs/nfs/nfs4state.c
Age | Commit message | Author | Files | Lines
2012-03-06 | NFSv4: Simplify the struct nfs4_stateid | Trond Myklebust | 1 | -2/+2
Replace the union with the common struct stateid4 as defined in both RFC3530 and RFC5661. This makes it easier to access the sequence id, which will again make implementing support for parallel OPEN calls easier. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
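A minimal user-space sketch of the layout this commit describes, assuming the stateid4 definition from RFC 3530 and RFC 5661 (a 32-bit sequence id followed by 12 opaque bytes). The names `example_stateid`, `example_stateid_copy` and `example_stateid_match_other` are hypothetical and are not the kernel's identifiers; the kernel stores the sequence id in wire byte order, which this sketch ignores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_STATEID_OTHER_SIZE 12  /* size of the opaque part of stateid4 */

/* Illustrative layout only; field names are assumptions, not the kernel's. */
struct example_stateid {
    uint32_t seqid;                                   /* sequence id, directly accessible */
    unsigned char other[EXAMPLE_STATEID_OTHER_SIZE];  /* opaque, server-chosen part */
};

/* Copy a complete stateid (hypothetical helper). */
static void example_stateid_copy(struct example_stateid *dst,
                                 const struct example_stateid *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof(*dst));
}

/* Compare only the opaque part, ignoring the sequence id. */
static int example_stateid_match_other(const struct example_stateid *a,
                                       const struct example_stateid *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a->other, b->other, EXAMPLE_STATEID_OTHER_SIZE) == 0;
}
```

With a plain struct, code that only cares about the sequence id can read `seqid` without going through a union member.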
2012-03-06 | NFSv4: Add helpers for basic copying of stateids | Trond Myklebust | 1 | -3/+3
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-06 | NFSv4: Rename nfs4_copy_stateid() | Trond Myklebust | 1 | -1/+1
It is really a function for selecting the correct stateid to use in a read or write situation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-06 | NFS: Properly handle the case where the delegation is revoked | Trond Myklebust | 1 | -2/+27
If we know that the delegation stateid is bad or revoked, we need to remove that delegation as soon as possible, and then mark all the stateids that relied on that delegation for recovery. We cannot use the delegation as part of the recovery process. Also note that NFSv4.1 uses a different error code (NFS4ERR_DELEG_REVOKED) to indicate that the delegation was revoked. Finally, ensure that setlk() and setattr() can both recover safely from a revoked delegation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
2012-03-04 | Merge commit 'nfs-for-3.3-4' into nfs-for-next | Trond Myklebust | 1 | -0/+2
Conflicts: fs/nfs/nfs4proc.c Back-merge of the upstream kernel in order to fix a conflict with the slotid type conversion and implementation id patches...
2012-03-03 | SUNRPC: Use RCU to dereference the rpc_clnt.cl_xprt field | Trond Myklebust | 1 | -8/+17
A migration event will replace the rpc_xprt used by an rpc_clnt. To ensure this can be done safely, all references to cl_xprt must now use a form of rcu_dereference(). Special care is taken with rpc_peeraddr2str(), which returns a pointer to memory whose lifetime is the same as the rpc_xprt. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [ cel: fix lockdep splats and layering violations ] [ cel: forward ported to 3.4 ] [ cel: remove rpc_max_reqs(), add rpc_net_ns() ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-17 | NFSv4.1 set highest_used_slotid to NFS4_NO_SLOT | Andy Adamson | 1 | -1/+1
Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-10 | NFSv4: Ensure we throw out bad delegation stateids on NFS4ERR_BAD_STATEID | Trond Myklebust | 1 | -0/+2
To ensure that we don't just reuse the bad delegation when we attempt to recover the nfs4_state that received the bad stateid error. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
2012-02-07 | NFS: start printks w/ NFS: even if __func__ shown | Weston Andros Adamson | 1 | -6/+6
This patch addresses printks that have some context to show that they are from fs/nfs/, but for the sake of consistency now start with NFS: Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-07 | NFS: printks in fs/nfs/ should start with NFS: | Weston Andros Adamson | 1 | -1/+1
Messages like "Got error -10052 from the server on DESTROY_SESSION. Session has been destroyed regardless" can be confusing to users who aren't very familiar with NFS. NOTE: This patch ignores any printks() that start by printing __func__ - that will be in a separate patch. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-01 | NFSv4: Avoid thundering herd issues with nfs_release_seqid | Trond Myklebust | 1 | -6/+15
Store a pointer to the rpc_task in struct nfs_seqid so that we can wake up only that request that is able to grab the lock after we've released it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-01 | SUNRPC: Fix potential races in xprt_lock_write_next() | Trond Myklebust | 1 | -9/+8
We have to ensure that the wake up from the waitqueue and the assignment of xprt->snd_task are atomic. We can do this by assigning the snd_task while under the waitqueue spinlock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-01 | NFS: Move struct nfs_unique_id into struct nfs_seqid_counter | Trond Myklebust | 1 | -5/+5
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-01 | NFSv4: Move contents of struct rpc_sequence into struct nfs_seqid_counter | Trond Myklebust | 1 | -13/+23
Clean up. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-01 | NFSv4: Replace lock_owner->ld_id with an ida based allocator | Trond Myklebust | 1 | -66/+8
Again, we're unlikely to ever need more than 2^31 simultaneous lock owners, so let's replace the custom allocator. Now that there are no more users, we can also get rid of the custom allocator code. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-01 | NFSv4: Replace state_owner->so_owner_id with an ida based allocator | Trond Myklebust | 1 | -5/+12
We're unlikely to ever need more than 2^31 simultaneous open owners, so let's replace the custom allocator with the generic ida allocator. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-01 | NFSv4: Clean up nfs4_get_state_owner | Trond Myklebust | 1 | -13/+11
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-05 | NFS: Cache state owners after files are closed | Chuck Lever | 1 | -9/+80
Servers have a finite amount of memory to store NFSv4 open and lock owners. Moreover, servers may have a difficult time determining when they can reap their state owner table, thanks to gray areas in the NFSv4 protocol specification. Thus clients should be careful to reuse state owners when possible. Currently Linux is not too careful. When a user has closed all her files on one mount point, the state owner's reference count goes to zero, and it is released. The next OPEN allocates a new one. A workload that serially opens and closes files can run through a large number of open owners this way. When a state owner's reference count goes to zero, slap it onto a free list for that nfs_server, with an expiry time. Garbage collect before looking for a state owner. This makes state owners for active users available for re-use. Now that there can be unused state owners remaining at umount time, purge the state owner free list when a server is destroyed. Also be sure not to reclaim unused state owners during state recovery. This change has benefits for the client as well. For some workloads, this approach drops the number of OPEN_CONFIRM calls from the same as the number of OPEN calls, down to just one. This reduces wire traffic and thus open(2) latency. Before this patch, untarring a kernel source tarball shows the OPEN_CONFIRM call counter steadily increasing through the test. With the patch, the OPEN_CONFIRM count remains at 1 throughout the entire untar. As long as the expiry time is kept short, I don't think garbage collection should be terribly expensive, although it does bounce the clp->cl_lock around a bit. [ At some point we should rationalize the use of the nfs_server ->destroy method. ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [Trond: Fixed a garbage collection race and a few efficiency issues] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
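The caching scheme described above can be sketched in user space as follows. This is a simplified illustration, not the kernel code: the types `ex_owner` and `ex_owner_cache`, the integer `id` key, and the explicit `now` parameter (seconds) are all assumptions, and locking is omitted. An owner whose reference count drops to zero goes onto a free list with an expiry time; a lookup first garbage-collects expired entries and then tries to reuse a cached owner.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a state owner. */
struct ex_owner {
    int id;                 /* lookup key (assumed) */
    int refcount;
    long expires;           /* time after which a cached owner may be reaped */
    int expired;            /* set by the garbage collector instead of freeing */
    struct ex_owner *next;  /* free-list link */
};

struct ex_owner_cache {
    struct ex_owner *free_list;  /* unused owners kept for re-use */
    long lifetime;               /* how long an unused owner is kept */
};

/* Drop a reference; on the last put, cache the owner instead of releasing it. */
static void ex_owner_put(struct ex_owner_cache *c, struct ex_owner *o, long now)
{
    assert(c != NULL && o != NULL && o->refcount > 0);
    if (--o->refcount > 0)
        return;
    o->expires = now + c->lifetime;
    o->next = c->free_list;
    c->free_list = o;
}

/* Remove every cached owner whose expiry time has passed. */
static void ex_owner_gc(struct ex_owner_cache *c, long now)
{
    struct ex_owner **pp = &c->free_list;

    while (*pp != NULL) {
        struct ex_owner *o = *pp;
        if (o->expires <= now) {
            *pp = o->next;
            o->next = NULL;
            o->expired = 1;  /* a real implementation would free it here */
        } else {
            pp = &o->next;
        }
    }
}

/* Garbage collect, then re-use a cached owner if one with this id remains. */
static struct ex_owner *ex_owner_get(struct ex_owner_cache *c, int id, long now)
{
    struct ex_owner **pp;

    assert(c != NULL);
    ex_owner_gc(c, now);
    for (pp = &c->free_list; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            struct ex_owner *o = *pp;
            *pp = o->next;
            o->next = NULL;
            o->refcount = 1;
            return o;
        }
    }
    return NULL;  /* caller would allocate a fresh owner */
}
```

In this sketch a short `lifetime` keeps the free list small, which matches the commit's remark that garbage collection should stay cheap as long as the expiry time is short.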
2012-01-05 | NFS: Clean up nfs4_find_state_owners_locked() | Chuck Lever | 1 | -12/+3
There's no longer a need to check the so_server field in the state owner, because nowadays the RB tree we search for state owners contains owners for only that server. Make nfs4_find_state_owners_locked() use the same tree searching logic as nfs4_insert_state_owner_locked(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-12-10 | NFSv4: Ensure correct locking when accessing the 'lock_states' list | Trond Myklebust | 1 | -0/+4
There are currently 2 places in the state recovery code, where we do not take sufficient precautions before accessing the state->lock_states. In both cases, we should be holding the state->state_lock. Reported-by: Pascal Bouchareine <pascal@gandi.net> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-12-02 | NFSv4.1: Ensure that we handle _all_ SEQUENCE status bits. | Trond Myklebust | 1 | -4/+4
Currently, the code assumes that the SEQUENCE status bits are mutually exclusive. They are not... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org [>= 2.6.34]
2011-12-02 | NFSv4: Don't error if we handled it in nfs4_recovery_handle_error | Trond Myklebust | 1 | -8/+13
If we handled an error condition, then nfs4_recovery_handle_error should return '0' so that the state recovery thread can continue. Also ensure that nfs4_check_lease() continues to abort if we haven't got any credentials by having it return ENOKEY (which is not handled). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-08-24 | NFSv4: renewd needs to be able to handle the NFS4ERR_CB_PATH_DOWN error | Trond Myklebust | 1 | -0/+6
The NFSv4 spec does not specify that the server must repeat that error, so in order to avoid having the delegations revoked, we should handle it immediately. Also note that NFS4ERR_CB_PATH_DOWN does in fact renew the lease... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-25 | Merge branch 'master' into devel and apply fixup from Stephen Rothwell: | Stephen Rothwell | 1 | -6/+6
vfs/nfs: fixup for nfs_open_context change Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-20 | nfs4_closedata doesn't need to mess with struct path | Al Viro | 1 | -6/+6
instead of path_get()/path_put(), we can just use nfs_sb_{,de}active() to pin the superblock down. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-12 | NFS: use scope from exchange_id to skip reclaim | Weston Andros Adamson | 1 | -1/+8
Reclaim can be skipped if the "eir_server_scope" from the exchange_id proc differs from previous calls. Also, in the future server_scope will be useful for determining whether client trunking is available. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-05-28 | NFSv4.1: Fix the handling of NFS4ERR_SEQ_MISORDERED errors | Trond Myklebust | 1 | -1/+5
Currently, the call to nfs4_schedule_session_recovery() will actually just result in a test of the lease when what we really want is to force a session reset. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2011-04-24 | NFSv4: Ensure that clientid and session establishment can time out | Trond Myklebust | 1 | -0/+1
The following patch ensures that we do not get permanently trapped in the RPC layer when trying to establish a new client id or session. This again ensures that the state manager can finish in a timely fashion when the last filesystem to reference the nfs_client exits. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-24 | NFSv4.1: Don't loop forever in nfs4_proc_create_session | Trond Myklebust | 1 | -15/+31
If a server for some reason keeps sending NFS4ERR_DELAY errors, we can end up looping forever inside nfs4_proc_create_session, and so the usual mechanisms for detecting if the nfs_client is dead don't work. Fix this by ensuring that we loop inside the nfs4_state_manager thread instead. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-16 | NFSv4.1: Ensure state manager thread dies on last umount | Trond Myklebust | 1 | -2/+2
Currently, the state manager may continue to try recovering state forever even after the last filesystem to reference that nfs_client has umounted. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2011-03-29 | fs: don't use igrab() while holding i_lock | Dave Chinner | 1 | -1/+2
Fix the incorrect use of igrab() inside the i_lock in NFS and Ceph. If we are already holding the i_lock, we have a reference to the inode so we can safely use ihold() to gain an extra reference. This avoids hangs due to lock recursion on the i_lock now that the inode_lock is gone and igrab() uses the i_lock itself. Signed-off-by: Dave Chinner <dchinner@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: Ryan Mallon <ryan@bluewatersys.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-11 | NFSv4.1: filelayout async error handler | Andy Adamson | 1 | -0/+1
Use our own async error handler. Mark the layout as failed and retry i/o through the MDS on specified errors. Update the mds_offset in nfs_readpage_retry so that a failed short-read retry to a DS gets correctly resent through the MDS. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 | NFSv4.1: new flag for lease time check | Andy Adamson | 1 | -0/+5
Data servers cannot send nfs4_proc_get_lease_time, but still need to set up state renewal. Add the NFS_CS_CHECK_LEASE_TIME bit to indicate if the lease time can be checked. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 | NFSv4: nfs4_state_mark_reclaim_nograce() should be static | Trond Myklebust | 1 | -2/+2
There are no more external users of nfs4_state_mark_reclaim_nograce() or nfs4_state_mark_reclaim_reboot(), so mark them as static. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 | NFSv4/4.1: Fix nfs4_schedule_state_recovery abuses | Trond Myklebust | 1 | -6/+19
nfs4_schedule_state_recovery() should only be used when we need to force the state manager to check the lease. If we just want to start the state manager in order to handle a state recovery situation, we should be using nfs4_schedule_state_manager(). This patch fixes the abuses of nfs4_schedule_state_recovery() by replacing its use with a set of helper functions that do the right thing. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-25 | NFS do not find client in NFSv4 pg_authenticate | Andy Adamson | 1 | -6/+0
The information required to find the nfs_client corresponding to the incoming back channel request is contained in the NFS layer. Perform minimal checking in the RPC layer pg_authenticate method, and push more detailed checking into the NFS layer where the nfs_client can be found. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 | NFS: Move cl_state_owners and related fields to the nfs_server struct | Chuck Lever | 1 | -71/+180
NFSv4 migration needs to reassociate state owners from the source to the destination nfs_server data structures. To make that easier, move the cl_state_owners field to the nfs_server struct. cl_openowner_id and cl_lockowner_id accompany this move, as they are used in conjunction with cl_state_owners. The cl_lock field in the parent nfs_client continues to protect all three of these fields. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 | pnfs: layout roc code | Fred Isaman | 1 | -2/+5
A layout can request return-on-close. How this interacts with the forgetful model of never sending LAYOUTRETURNS is a bit ambiguous. We forget any layouts marked roc, and wait for them to be completely forgotten before continuing with the close. In addition, to compensate for races with any inflight LAYOUTGETs, and the fact that we do not get any layout stateid back from the server, we set the barrier to the worst case scenario of current_seqid + number of outstanding LAYOUTGETS. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 | NFS add session back channel draining | Andy Adamson | 1 | -7/+22
Currently session draining only drains the fore channel. The back channel processing must also be drained. Use the back channel highest_slot_used to indicate that a callback is being processed by the callback thread. Move the session complete to be per channel. When the session is draining, wait for any current back channel processing to complete and stop all new back channel processing by returning NFS4ERR_DELAY to the back channel client. Drain the back channel, then the fore channel. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 | NFS associate sessionid with callback connection | Andy Adamson | 1 | -0/+6
The sessions based callback service is started prior to the CREATE_SESSION call so that it can handle CB_NULL requests which can be sent before the CREATE_SESSION call returns and the session ID is known. Set the callback sessionid after a successful CREATE_SESSION. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-26 | Merge branch 'nfs-for-2.6.37' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 | Linus Torvalds | 1 | -0/+2
* 'nfs-for-2.6.37' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  net/sunrpc: Use static const char arrays
  nfs4: fix channel attribute sanity-checks
  NFSv4.1: Use more sensible names for 'initialize_mountpoint'
  NFSv4.1: pnfs: filelayout: add driver's LAYOUTGET and GETDEVICEINFO infrastructure
  NFSv4.1: pnfs: add LAYOUTGET and GETDEVICEINFO infrastructure
  NFS: client needs to maintain list of inodes with active layouts
  NFS: create and destroy inode's layout cache
  NFSv4.1: pnfs: filelayout: introduce minimal file layout driver
  NFSv4.1: pnfs: full mount/umount infrastructure
  NFS: set layout driver
  NFS: ask for layouttypes during v4 fsinfo call
  NFS: change stateid to be a union
  NFSv4.1: pnfsd, pnfs: protocol level pnfs constants
  SUNRPC: define xdr_decode_opaque_fixed
  NFSD: remove duplicate NFS4_STATEID_SIZE
2010-10-26 | Merge branch 'nfs-for-2.6.37' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 | Linus Torvalds | 1 | -6/+34
* 'nfs-for-2.6.37' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (67 commits)
  SUNRPC: Cleanup duplicate assignment in rpcauth_refreshcred
  nfs: fix unchecked value
  Ask for time_delta during fsinfo probe
  Revalidate caches on lock
  SUNRPC: After calling xprt_release(), we must restart from call_reserve
  NFSv4: Fix up the 'dircount' hint in encode_readdir
  NFSv4: Clean up nfs4_decode_dirent
  NFSv4: nfs4_decode_dirent must clear entry->fattr->valid
  NFSv4: Fix a regression in decode_getfattr
  NFSv4: Fix up decode_attr_filehandle() to handle the case of empty fh pointer
  NFS: Ensure we check all allocation return values in new readdir code
  NFS: Readdir plus in v4
  NFS: introduce generic decode_getattr function
  NFS: check xdr_decode for errors
  NFS: nfs_readdir_filler catch all errors
  NFS: readdir with vmapped pages
  NFS: remove page size checking code
  NFS: decode_dirent should use an xdr_stream
  SUNRPC: Add a helper function xdr_inline_peek
  NFS: remove readdir plus limit
  ...
2010-10-25 | NFS: client needs to maintain list of inodes with active layouts | Andy Adamson | 1 | -0/+2
In particular, server reboot will invalidate all layouts. Note that in order to have an active layout, we must get a successful response from the server. To avoid adding that machinery, this patch just includes a stub that fakes up a successful return. Since the layout is never referenced for io, this is not a problem. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Dean Hildebrand <dhildebz@umich.edu> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-23 | nfs: include ratelimit.h, fix nfs4state build error | Randy Dunlap | 1 | -0/+1
nfs4state.c uses interfaces from ratelimit.h. It needs to include that header file to fix build errors:
  fs/nfs/nfs4state.c:1195: warning: type defaults to 'int' in declaration of 'DEFINE_RATELIMIT_STATE'
  fs/nfs/nfs4state.c:1195: warning: parameter names (without types) in function declaration
  fs/nfs/nfs4state.c:1195: error: invalid storage class for function 'DEFINE_RATELIMIT_STATE'
  fs/nfs/nfs4state.c:1195: error: implicit declaration of function '__ratelimit'
  fs/nfs/nfs4state.c:1195: error: '_rs' undeclared (first use in this function)
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: linux-nfs@vger.kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-23 | NFSv4: The state manager must ignore EKEYEXPIRED. | Trond Myklebust | 1 | -1/+22
Otherwise, we cannot recover state correctly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-20 | NFSv4: Don't call nfs4_reclaim_complete() on receiving NFS4ERR_STALE_CLIENTID | Trond Myklebust | 1 | -5/+11
If the server sends us an NFS4ERR_STALE_CLIENTID while the state management thread is busy reclaiming state, we do want to treat all state that wasn't reclaimed before the STALE_CLIENTID as if a network partition occurred (see the edge conditions described in RFC3530 and RFC5661). What we do not want to do is to send an nfs4_reclaim_complete(), since we haven't yet even started reclaiming state after the server rebooted. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2010-10-05 | fs/locks.c: prepare for BKL removal | Arnd Bergmann | 1 | -5/+5
This prepares the removal of the big kernel lock from the file locking code. We still use the BKL as long as fs/lockd uses it and ceph might sleep, but we can flip the definition to a private spinlock as soon as that's done. All users outside of fs/lockd get converted to use lock_flocks() instead of lock_kernel() where appropriate. Based on an earlier patch from Matthew Wilcox to use a spinlock. He has attempted this a few times before; the earliest patch, from over 10 years ago, turned it into a semaphore, which ended up being slower than the BKL and was subsequently reverted. Someone should do some serious performance testing when this becomes a spinlock, since this has caused problems before. Using a spinlock should be at least as good as the BKL in theory, but who knows... Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Matthew Wilcox <willy@linux.intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Miklos Szeredi <mszeredi@suse.cz> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Sage Weil <sage@newdream.net> Cc: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org
2010-07-30 | NFSv4: Ensure the lockowners are labelled using the fl_owner and/or fl_pid | Trond Myklebust | 1 | -10/+35
flock locks want to be labelled using the process pid, while posix locks want to be labelled using the fl_owner. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
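The labelling rule can be sketched in user space like this. The types `ex_file_lock` and `ex_lockowner_key` and both functions are hypothetical simplifications; only the field names `fl_owner` and `fl_pid` come from the commit title. The sketch picks the pid as the label for flock-style locks and the owner pointer for POSIX locks, then compares labels to decide whether two locks share a lock owner.

```c
#include <assert.h>
#include <stddef.h>

enum ex_lock_type { EX_LOCK_POSIX, EX_LOCK_FLOCK };

/* Hypothetical, simplified view of a file lock request. */
struct ex_file_lock {
    enum ex_lock_type type;
    const void *fl_owner;  /* owner cookie, meaningful for POSIX locks */
    int fl_pid;            /* process id, meaningful for flock locks */
};

/* Label used to find or create a lock owner (assumed structure). */
struct ex_lockowner_key {
    enum ex_lock_type type;
    const void *owner;
    int pid;
};

/* flock locks are labelled by pid, POSIX locks by fl_owner. */
static struct ex_lockowner_key ex_lockowner_key_for(const struct ex_file_lock *fl)
{
    struct ex_lockowner_key key = { EX_LOCK_POSIX, NULL, 0 };

    assert(fl != NULL);
    key.type = fl->type;
    if (fl->type == EX_LOCK_FLOCK)
        key.pid = fl->fl_pid;
    else
        key.owner = fl->fl_owner;
    return key;
}

/* Two locks share a lock owner when their labels are identical. */
static int ex_lockowner_key_equal(const struct ex_lockowner_key *a,
                                  const struct ex_lockowner_key *b)
{
    assert(a != NULL && b != NULL);
    return a->type == b->type && a->owner == b->owner && a->pid == b->pid;
}
```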
2010-07-30 | NFSv4: Add support for the RELEASE_LOCKOWNER operation | Trond Myklebust | 1 | -0/+2
This is needed by NFSv4.0 servers in order to keep the number of locking stateids at a manageable level. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-24 | NFSv4: Clean up struct nfs4_state_owner | Trond Myklebust | 1 | -8/+6
The 'so_delegations' list appears to be unused. Also eliminate so_client. If we already have so_server, we can get to the nfs_client structure. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>