path: root/fs/nfs
Age | Commit message | Author | Files | Lines
2018-06-01 | fs: Teach path_connected to handle nfs filesystems with multiple roots. | Eric W. Biederman | 1 | -0/+2
commit 95dd77580ccd66a0da96e6d4696945b8cea39431 upstream. On nfsv2 and nfsv3 the nfs server can export subsets of the same filesystem and report the same filesystem identifier, so that the nfs client can know they are the same filesystem. The subsets can be from disjoint directory trees. The nfsv2 and nfsv3 filesystems provide no way to find the common root of all directory trees exported from the server with the same filesystem identifier. The practical result is that, for nfs, s_root in struct super_block is not necessarily the root of the filesystem. The nfs mount code sets s_root to the root of the first subset of the nfs filesystem that the kernel mounts. This affects the dcache invalidation code in generic_shutdown_super, currently called shrink_dcache_for_umount, and that code has for years gone through an additional list of dentries that might be dentry trees that need to be freed to accommodate nfs. When I wrote path_connected I did not realize nfs was so special, and its heuristic for avoiding calling is_subdir can fail. The practical case where this fails is when there is a move of a directory from the subtree exposed by one nfs mount to the subtree exposed by another nfs mount. This move can happen either locally or remotely, with the remote case requiring that the moved directory be cached before the move and that after the move someone walks the path to where the moved directory now exists and in so doing causes the already cached directory to be moved in the dcache through the magic of d_splice_alias. If someone whose working directory is in the moved directory or a subdirectory now starts calling .. from the initial mount of nfs (where s_root == mnt_root), then path_connected as a heuristic will not bother with the is_subdir check. As s_root really is not the root of the nfs filesystem this heuristic is wrong, and the path may actually not be connected and path_connected can fail.
The is_subdir function might be cheap enough that we can call it unconditionally. Verifying that will take some benchmarking and the result may not be the same on all kernels this fix needs to be backported to. So I am avoiding that for now. Filesystems with snapshots such as nilfs and btrfs do something similar. But as the directory tree of the snapshots are disjoint from one another and from the main directory tree rename won't move things between them and this problem will not occur. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Fixes: 397d425dc26d ("vfs: Test for and handle paths that are unreachable from their mnt_root") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [bwh: Backported to 3.2: - Add the super_block::s_iflags field - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
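The heuristic discussed in this commit can be modelled in user space. The following is only a sketch under stated assumptions: `Dentry` is a toy class, and the `multi_root` argument is a hypothetical stand-in for whatever marker the new super_block::s_iflags field carries; none of these names come from the kernel source.

```python
class Dentry:
    """Toy directory entry: a name and a parent pointer (hypothetical model)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def is_subdir(dentry, root):
    """Walk parent pointers; True if root is dentry itself or one of its ancestors."""
    while dentry is not None:
        if dentry is root:
            return True
        dentry = dentry.parent
    return False


def path_connected(mnt_root, s_root, dentry, multi_root):
    """Model of the heuristic: skip the walk only when s_root can be trusted.

    multi_root is a hypothetical flag meaning "this filesystem may have
    several disjoint roots", as nfsv2/nfsv3 exports can.
    """
    if mnt_root is s_root and not multi_root:
        return True  # fast path: assume the mount covers the whole filesystem
    return is_subdir(dentry, mnt_root)
```

With two disjoint export roots and a directory living under the second one, the fast path wrongly reports the path as connected to the first mount, while the flagged variant falls through to the ancestor walk and reports that it is not.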
2018-02-13 | nfs: Fix ugly referral attributes | Chuck Lever | 1 | -10/+8
commit c05cefcc72416a37eba5a2b35f0704ed758a9145 upstream. Before traversing a referral and performing a mount, the mounted-on directory looks strange: dr-xr-xr-x. 2 4294967294 4294967294 0 Dec 31 1969 dir.0 nfs4_get_referral is wiping out any cached attributes with what was returned via GETATTR(fs_locations), but the bit mask for that operation does not request any file attributes. Retrieve owner and timestamp information so that the memcpy in nfs4_get_referral fills in more attributes. Changes since v1: - Don't request attributes that the client unconditionally replaces - Request only MOUNTED_ON_FILEID or FILEID attribute, not both - encode_fs_locations() doesn't use the third bitmask word Fixes: 6b97fd3da1ea ("NFSv4: Follow a referral") Suggested-by: Pradeep Thomas <pradeepthomas@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2017-11-11 | mount: copy the port field into the cloned nfs_server structure. | Steve Dickson | 1 | -0/+1
commit 89a6814d9b665b196aa3a102f96b6dc7e8cb669e upstream. Doing this copy eliminates the "port=0" entry in the /proc/mounts entries Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=69241 Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2017-06-05 | NFSv4: fix getacl ERANGE for some ACL buffer sizes | Weston Andros Adamson | 1 | -6/+2
commit ed92d8c137b7794c2c2aa14479298b9885967607 upstream. We're not taking into account the space needed for the (variable length) attr bitmap, with the result that we'd sometimes get a spurious ERANGE when the ACL data got close to the end of a page. Just add in an extra page to make sure. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2017-06-05 | NFSv4: Fix range checking in __nfs4_get_acl_uncached and __nfs4_proc_set_acl | Trond Myklebust | 1 | -9/+11
commit 21f498c2f73bd6150d82931f09965826dca0b5f2 upstream. Ensure that the user supplied buffer size doesn't cause us to overflow the 'pages' array. Also fix up some confusion between the use of PAGE_SIZE and PAGE_CACHE_SIZE when calculating buffer sizes. We're not using the page cache for anything here. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
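The kind of range check this commit describes can be illustrated with a small model. This is a sketch only: `PAGE_SIZE` and `MAX_PAGES` are assumed values chosen for the example, and raising `ValueError` merely stands in for the kernel returning an error code.

```python
PAGE_SIZE = 4096   # assumed page size for this example
MAX_PAGES = 16     # hypothetical capacity of the fixed 'pages' array


def pages_needed(buflen):
    """Round a byte count up to whole pages (what DIV_ROUND_UP does in C)."""
    return (buflen + PAGE_SIZE - 1) // PAGE_SIZE


def check_acl_buffer(buflen):
    """Return the page count, or raise if it would overrun the 'pages' array."""
    npages = pages_needed(buflen)
    if npages > MAX_PAGES:
        # The kernel would return an error here instead of raising.
        raise ValueError("user-supplied buffer needs too many pages")
    return npages
```

The point of the check is that the page count is derived from a user-supplied size, so it has to be compared against the array capacity before any page is allocated or indexed.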
2017-06-05 | NFSv4: Fix the underestimation of delegation XDR space reservation | Trond Myklebust | 1 | -0/+2
commit 5a1f6d9e9b803003271b40b67786ff46fa4eda01 upstream. Account for the "space_limit" field in struct open_write_delegation4. Fixes: 2cebf82883f4 ("NFSv4: Fix the underestimate of NFSv4 open request size") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2017-03-16 | nfs_write_end(): fix handling of short copies | Al Viro | 1 | -1/+1
commit c0cf3ef5e0f47e385920450b245d22bead93e7ad upstream. What matters when deciding if we should make a page uptodate is not how much we _wanted_ to copy, but how much we actually have copied. As it is, on architectures that do not zero tail on short copy we can leave uninitialized data in page marked uptodate. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2017-02-23 | NFSv4: Open state recovery must account for file permission changes | Trond Myklebust | 1 | -0/+3
commit 304020fe48c6c7fff8b5a38f382b54404f0f79d3 upstream. If the file permissions change on the server, then we may not be able to recover open state. If so, we need to ensure that we mark the file descriptor appropriately. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2016-11-20 | NFSv4.1: Fix the CREATE_SESSION slot number accounting | Trond Myklebust | 1 | -4/+11
commit b519d408ea32040b1c7e10b155a3ee9a36660947 upstream. Ensure that we conform to the algorithm described in RFC5661, section 18.36.4 for when to bump the sequence id. In essence we do it for all cases except when the RPC call timed out, or in case of the server returning NFS4ERR_DELAY or NFS4ERR_STALE_CLIENTID. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [bwh: Backported to 3.2: - Add the 'out' label - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
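The rule stated in this commit message (bump the sequence id in every case except an RPC timeout, NFS4ERR_DELAY, or NFS4ERR_STALE_CLIENTID) fits in a few lines. The sketch below assumes a calling convention where `status` is 0 on success and a negative errno or negative NFSv4 status otherwise; the function name is hypothetical, while the two numeric codes are the values assigned in RFC 5661.

```python
import errno

NFS4ERR_DELAY = 10008            # RFC 5661 status code
NFS4ERR_STALE_CLIENTID = 10022   # RFC 5661 status code


def should_bump_seqid(status):
    """Decide whether the CREATE_SESSION sequence id advances.

    status: 0 on success, otherwise a negative errno or negative NFSv4
    status (an assumed convention for this example).
    """
    no_bump = {-errno.ETIMEDOUT, -NFS4ERR_DELAY, -NFS4ERR_STALE_CLIENTID}
    return status not in no_bump
```

Any other outcome, including errors the server actually processed, advances the sequence id, because the server has consumed the slot.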
2016-11-20 | nfs: don't create zero-length requests | Benjamin Coddington | 1 | -1/+4
commit 149a4fddd0a72d526abbeac0c8deaab03559836a upstream. NFS doesn't expect requests with wb_bytes set to zero and may make unexpected decisions about how to handle that request at the page IO layer. Skip request creation if we won't have any wb_bytes in the request. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2016-11-20 | NFS: Don't drop CB requests with invalid principals | Chuck Lever | 1 | -1/+5
commit a4e187d83d88eeaba6252aac0a2ffe5eaa73a818 upstream. Before commit 778be232a207 ("NFS do not find client in NFSv4 pg_authenticate"), the Linux callback server replied with RPC_AUTH_ERROR / RPC_AUTH_BADCRED, instead of dropping the CB request. Let's restore that behavior so the server has a chance to do something useful about it, and provide a warning that helps admins correct the problem. Fixes: 778be232a207 ("NFS do not find client in NFSv4 ...") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2016-08-23 | NFS: Fix another OPEN_DOWNGRADE bug | Trond Myklebust | 1 | -3/+2
commit e547f2628327fec6afd2e03b46f113f614cca05b upstream. Olga Kornievskaia reports that the following test fails to trigger an OPEN_DOWNGRADE on the wire, and only triggers the final CLOSE. fd0 = open(foo, RDRW) -- should be open on the wire for "both" fd1 = open(foo, RDONLY) -- should be open on the wire for "read" close(fd0) -- should trigger an open_downgrade read(fd1) close(fd1) The issue is that we're missing a check for whether or not the current state transitioned from an O_RDWR state as opposed to having transitioned from a combination of O_RDONLY and O_WRONLY. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: cd9288ffaea4 ("NFSv4: Fix another bug in the close/open_downgrade code") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
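The reported test sequence can be replayed with ordinary system calls. The script below is a sketch: the helper name is hypothetical, and on a local filesystem it only exercises the open/close ordering. Observing the OPEN_DOWNGRADE itself would require running it against a file on an NFSv4 mount while capturing traffic.

```python
import os
import tempfile


def run_sequence(path):
    """Replay the open/open/close/read/close ordering from the report."""
    fd0 = os.open(path, os.O_RDWR)     # open on the wire for "both"
    fd1 = os.open(path, os.O_RDONLY)   # open on the wire for "read"
    os.close(fd0)                      # on NFSv4 this should trigger OPEN_DOWNGRADE
    data = os.read(fd1, 16)
    os.close(fd1)                      # final CLOSE
    return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        target = os.path.join(tmpdir, "foo")
        with open(target, "wb") as f:
            f.write(b"hello")
        print(run_sequence(target))
```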
2016-02-13 | NFS: Fix attribute cache revalidation | Trond Myklebust | 1 | -15/+39
commit ade14a7df796d4e86bd9d181193c883a57b13db0 upstream. If a NFSv4 client uses the cache_consistency_bitmask in order to request only information about the change attribute, timestamps and size, then it has not revalidated all attributes, and hence the attribute timeout timestamp should not be updated. Reported-by: Donald Buczek <buczek@molgen.mpg.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2015-12-30 | nfs: if we have no valid attrs, then don't declare the attribute cache valid | Jeff Layton | 1 | -1/+5
commit c812012f9ca7cf89c9e1a1cd512e6c3b5be04b85 upstream. If we pass in an empty nfs_fattr struct to nfs_update_inode, it will (correctly) not update any of the attributes, but it then clears the NFS_INO_INVALID_ATTR flag, which indicates that the attributes are up to date. Don't clear the flag if the fattr struct has no valid attrs to apply. Reviewed-by: Steve French <steve.french@primarydata.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2015-11-17 | Failing to send a CLOSE if file is opened WRONLY and server reboots on a 4.x mount | Olga Kornievskaia | 1 | -1/+1
commit a41cbe86df3afbc82311a1640e20858c0cd7e065 upstream. A test case is as the description says: open(foobar, O_WRONLY); sleep() --> reboot the server close(foobar) The bug is because in nfs4state.c in nfs4_reclaim_open_state(), a few lines before going to restart, there is clear_bit(NFS4CLNT_RECLAIM_NOGRACE, &state->flags). NFS4CLNT_RECLAIM_NOGRACE is a flag for the client state, not for open owner states. The value of NFS4CLNT_RECLAIM_NOGRACE is 4, which is also the value of NFS_O_WRONLY_STATE in nfs4_state->flags. So clearing it wipes out the open state, and when we go to close it, “call_close” doesn’t get set because the state flag is not set, and CLOSE doesn’t go on the wire. Signed-off-by: Olga Kornievskaia <aglo@umich.edu> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
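The collision described here is easy to reproduce in a toy model. The two bit numbers are the values quoted in the commit message; everything else (the helper names and the single-condition "will a CLOSE be sent" check) is a simplification assumed for illustration.

```python
NFS4CLNT_RECLAIM_NOGRACE = 4   # bit number meant for the client flags word
NFS_O_WRONLY_STATE = 4         # bit number in nfs4_state->flags


def set_bit(nr, word):
    return word | (1 << nr)


def clear_bit(nr, word):
    return word & ~(1 << nr)


def bit_is_set(nr, word):
    return bool(word & (1 << nr))


def reclaim_buggy(state_flags):
    """Mimic the stray clear_bit() applied to the per-state flags word."""
    return clear_bit(NFS4CLNT_RECLAIM_NOGRACE, state_flags)


def will_send_close(state_flags):
    """Simplified: CLOSE goes out only if the write-only open is still recorded."""
    return bit_is_set(NFS_O_WRONLY_STATE, state_flags)
```

Because both constants name bit 4, clearing the "client" flag on the wrong word silently erases the record of the write-only open, which is why the later close() never reaches the wire.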
2015-10-13 | NFSv4: don't set SETATTR for O_RDONLY|O_EXCL | NeilBrown | 1 | -1/+1
commit efcbc04e16dfa95fef76309f89710dd1d99a5453 upstream. It is unusual to combine the open flags O_RDONLY and O_EXCL, but it appears that libre-office does just that. [pid 3250] stat("/home/USER/.config", {st_mode=S_IFDIR|0700, st_size=8192, ...}) = 0 [pid 3250] open("/home/USER/.config/libreoffice/4-suse/user/extensions/buildid", O_RDONLY|O_EXCL <unfinished ...> NFSv4 takes O_EXCL as a sign that a setattr command should be sent, probably to reset the timestamps. When it was an O_RDONLY open, the SETATTR command does not identify any actual attributes to change. If no delegation was provided to the open, the SETATTR uses the all-zeros stateid and the request is accepted (at least by the Linux NFS server - no harm, no foul). If a read-delegation was provided, this is used in the SETATTR request, and a Netapp filer will justifiably claim NFS4ERR_BAD_STATEID, which the Linux client takes as a sign to retry - indefinitely. So only treat O_EXCL specially if O_CREAT was also given. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [bwh: Backported to 3.2: we only check open_flags, not createmode as well] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
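The rule this commit settles on, "only treat O_EXCL specially if O_CREAT was also given", amounts to a two-flag test. A minimal sketch, with a hypothetical function name and Python's `os` flag constants standing in for the kernel's open_flags:

```python
import os


def open_wants_setattr(open_flags):
    """True only when O_EXCL accompanies O_CREAT, i.e. an exclusive create."""
    return bool(open_flags & os.O_EXCL) and bool(open_flags & os.O_CREAT)
```

With this check, the unusual O_RDONLY|O_EXCL combination no longer leads to a SETATTR, so the retry loop against a read delegation cannot start.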
2015-08-12 | NFS: Fix size of NFSACL SETACL operations | Chuck Lever | 1 | -1/+1
commit d683cc49daf7c5afca8cd9654aaa1bf63cdf2ad9 upstream. When encoding the NFSACL SETACL operation, reserve just the estimated size of the ACL rather than a fixed maximum. This eliminates needless zero padding on the wire that the server ignores. Fixes: ee5dc7732bd5 ('NFS: Fix "kernel BUG at fs/nfs/nfs3xdr.c:1338!"') Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2015-08-12 | fixing infinite OPEN loop in 4.0 stateid recovery | Olga Kornievskaia | 1 | -0/+2
commit e8d975e73e5fa05f983fbf2723120edcf68e0b38 upstream. Problem: When an operation like WRITE receives a BAD_STATEID, even though recovery code clears the RECLAIM_NOGRACE recovery flag before recovering the open state, because of clearing delegation state for the associated inode, nfs_inode_find_state_and_recover() gets called and it marks the same state with the RECLAIM_NOGRACE flag again. As a result, when we restart looking over the open states, we end up in an infinite loop instead of breaking out in the next test of state flags. Solution: after returning from the recover_open() function, clear the RECLAIM_NOGRACE flag that was set by the call to nfs_inode_find_state_and_recover(). Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2015-05-10 | NFSv4: Minor cleanups for nfs4_handle_exception and nfs4_async_handle_error | Trond Myklebust | 1 | -5/+5
commit 14977489ffdb80d4caf5a184ba41b23b02fbacd9 upstream. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [bwh: This is not merely a cleanup but also fixes a regression introduced by commit 3114ea7a24d3 ("NFSv4: Return the delegation if the server returns NFS4ERR_OPENMODE"), backported in 3.2.14] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2015-05-10 | NFSv4: Don't call put_rpccred() under the rcu_read_lock() | Trond Myklebust | 1 | -1/+1
commit 7c0af9ffb7bb4e5355470fa60b3eb711ddf226fa upstream. put_rpccred() can sleep. Fixes: 8f649c3762547 ("NFSv4: Fix the locking in nfs_inode_reclaim_delegation()") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2015-05-10 | NFSv4.1: Fix a kfree() of uninitialised pointers in decode_cb_sequence_args | Trond Myklebust | 1 | -1/+3
commit d8ba1f971497c19cf80da1ea5391a46a5f9fbd41 upstream. If the call to decode_rc_list() fails due to a memory allocation error, then we need to truncate the array size to ensure that we only call kfree() on those pointers that were allocated. Reported-by: David Ramos <daramos@stanford.edu> Fixes: 4aece6a19cf7f ("nfs41: cb_sequence xdr implementation") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2015-05-10 | nfs: don't call blocking operations while !TASK_RUNNING | Jeff Layton | 1 | -3/+5
commit 6ffa30d3f734d4f6b478081dfc09592021028f90 upstream. Bruce reported seeing this warning pop when mounting using v4.1: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 1121 at kernel/sched/core.c:7300 __might_sleep+0xbd/0xd0() do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff810ff58f>] prepare_to_wait+0x2f/0x90 Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm snd_timer ppdev joydev snd virtio_console virtio_balloon pcspkr serio_raw parport_pc parport pvpanic floppy soundcore i2c_piix4 virtio_blk virtio_net qxl drm_kms_helper ttm drm virtio_pci virtio_ring ata_generic virtio pata_acpi CPU: 1 PID: 1121 Comm: nfsv4.1-svc Not tainted 3.19.0-rc4+ #25 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014 0000000000000000 000000004e5e3f73 ffff8800b998fb48 ffffffff8186ac78 0000000000000000 ffff8800b998fba0 ffff8800b998fb88 ffffffff810ac9da ffff8800b998fb68 ffffffff81c923e7 00000000000004d9 0000000000000000 Call Trace: [<ffffffff8186ac78>] dump_stack+0x4c/0x65 [<ffffffff810ac9da>] warn_slowpath_common+0x8a/0xc0 [<ffffffff810aca65>] warn_slowpath_fmt+0x55/0x70 [<ffffffff810ff58f>] ? prepare_to_wait+0x2f/0x90 [<ffffffff810ff58f>] ? prepare_to_wait+0x2f/0x90 [<ffffffff810dd2ad>] __might_sleep+0xbd/0xd0 [<ffffffff8124c973>] kmem_cache_alloc_trace+0x243/0x430 [<ffffffff810d941e>] ? 
groups_alloc+0x3e/0x130 [<ffffffff810d941e>] groups_alloc+0x3e/0x130 [<ffffffffa0301b1e>] svcauth_unix_accept+0x16e/0x290 [sunrpc] [<ffffffffa0300571>] svc_authenticate+0xe1/0xf0 [sunrpc] [<ffffffffa02fc564>] svc_process_common+0x244/0x6a0 [sunrpc] [<ffffffffa02fd044>] bc_svc_process+0x1c4/0x260 [sunrpc] [<ffffffffa03d5478>] nfs41_callback_svc+0x128/0x1f0 [nfsv4] [<ffffffff810ff970>] ? wait_woken+0xc0/0xc0 [<ffffffffa03d5350>] ? nfs4_callback_svc+0x60/0x60 [nfsv4] [<ffffffff810d45bf>] kthread+0x11f/0x140 [<ffffffff810ea815>] ? local_clock+0x15/0x30 [<ffffffff810d44a0>] ? kthread_create_on_node+0x250/0x250 [<ffffffff81874bfc>] ret_from_fork+0x7c/0xb0 [<ffffffff810d44a0>] ? kthread_create_on_node+0x250/0x250 ---[ end trace 675220a11e30f4f2 ]--- nfs41_callback_svc does most of its work while in TASK_INTERRUPTIBLE, which is just wrong. Fix that by finishing the wait immediately if we've found that the list has something on it. Also, we don't expect this kthread to accept signals, so we should be using a TASK_UNINTERRUPTIBLE sleep instead. That, however, opens us up to hung task warnings from the watchdog, so have the schedule_timeout wake up every 60s if there's no callback activity. Reported-by: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2015-01-01 | move d_rcu from overlapping d_child to overlapping d_alias | Al Viro | 1 | -1/+1
commit 946e51f2bf37f1656916eb75bd0742ba33983c28 upstream. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [bwh: Backported to 3.2: - Apply name changes in all the different places we use d_alias and d_child - Move the WARN_ON() in __d_free() to d_free() as we don't have dentry_free()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-12-14 | NFSv4.1: Fix an NFSv4.1 state renewal regression | Andy Adamson | 2 | -3/+11
commit d1f456b0b9545f1606a54cd17c20775f159bd2ce upstream. Commit 2f60ea6b8ced ("NFSv4: The NFSv4.0 client must send RENEW calls if it holds a delegation") set the NFS4_RENEW_TIMEOUT flag in nfs4_renew_state, and does not put an nfs41_proc_async_sequence call, the NFSv4.1 lease renewal heartbeat call, on the wire to renew the NFSv4.1 state if the flag was not set. The NFS4_RENEW_TIMEOUT flag is set when "now" is after the last renewal (cl_last_renewal) plus the lease time divided by 3. This is arbitrary and sometimes does the following: In normal operation, the only way a future state renewal call is put on the wire is via a call to nfs4_schedule_state_renewal, which schedules a nfs4_renew_state workqueue task. nfs4_renew_state determines if the NFS4_RENEW_TIMEOUT should be set, and then calls nfs41_proc_async_sequence, which only gets sent if the NFS4_RENEW_TIMEOUT flag is set. Then the nfs41_proc_async_sequence rpc_release function schedules another state renewal via nfs4_schedule_state_renewal. Without this change we can get into a state where an application stops accessing the NFSv4.1 share and state renewal calls stop due to the NFS4_RENEW_TIMEOUT flag _not_ being set. The only way to recover from this situation is with a clientid re-establishment, once the application resumes and the server has timed out the lease and so returns NFS4ERR_BAD_SESSION on the subsequent SEQUENCE operation. An example application: open, lock, write a file. sleep for 6 * lease (could be less) unlock, close. In the above example with NFSv4.1 delegations enabled, without this change, there are no OP_SEQUENCE state renewal calls during the sleep, and the clientid is recovered due to lease expiration on the close. This issue does not occur with NFSv4.1 delegations disabled, nor with NFSv4.0, with or without delegations enabled.
Signed-off-by: Andy Adamson <andros@netapp.com> Link: http://lkml.kernel.org/r/1411486536-23401-1-git-send-email-andros@netapp.com Fixes: 2f60ea6b8ced (NFSv4: The NFSv4.0 client must send RENEW calls...) Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
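The timing condition quoted in this message, "now is after the last renewal plus the lease time divided by 3", can be written as a one-line predicate. This is a sketch with a hypothetical function name and plain numbers of seconds; the kernel works in jiffies and uses its own time comparison helpers.

```python
def renew_timeout_due(now, last_renewal, lease_time):
    """True once 'now' is strictly past last_renewal + lease_time / 3."""
    return now > last_renewal + lease_time / 3
```

For a 90-second lease renewed at t=0, the predicate is false at t=29 and t=30 and true from t=31 onward, so any renewal task that fires before the one-third mark would, under the old logic, send nothing on the wire.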
2014-12-14 | NFSv4: fix open/lock state recovery error handling | Trond Myklebust | 1 | -10/+6
commit df817ba35736db2d62b07de6f050a4db53492ad8 upstream. The current open/lock state recovery unfortunately does not handle errors such as NFS4ERR_CONN_NOT_BOUND_TO_SESSION correctly. Instead of looping, it just proceeds as if the state manager is finished recovering. This patch ensures that we loop back, handle higher priority errors and complete the open/lock state recovery. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-11-05 | NFSv4: Fix another bug in the close/open_downgrade code | Trond Myklebust | 1 | -15/+15
commit cd9288ffaea4359d5cfe2b8d264911506aed26a4 upstream. James Drews reports another bug whereby the NFS client is now sending an OPEN_DOWNGRADE in a situation where it should really have sent a CLOSE: the client is opening the file for O_RDWR, but then trying to do a downgrade to O_RDONLY, which is not allowed by the NFSv4 spec. Reported-by: James Drews <drews@engr.wisc.edu> Link: http://lkml.kernel.org/r/541AD7E5.8020409@engr.wisc.edu Fixes: aee7af356e15 (NFSv4: Fix problems with close in the presence...) Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-09-14 | NFSv4: Fix problems with close in the presence of a delegation | Trond Myklebust | 1 | -5/+12
commit aee7af356e151494d5014f57b33460b162f181b5 upstream. In the presence of delegations, we can no longer assume that the state->n_rdwr, state->n_rdonly, state->n_wronly reflect the open stateid share mode, and so we need to calculate the initial value for calldata->arg.fmode using the state->flags. Reported-by: James Drews <drews@engr.wisc.edu> Fixes: 88069f77e1ac5 (NFSv41: Fix a potential state leakage when...) Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-04-02 | nfs: fix do_div() warning by instead using sector_div() | Helge Deller | 1 | -1/+1
commit 3873d064b8538686bbbd4b858dc8a07db1f7f43a upstream. When compiling a 32bit kernel with CONFIG_LBDAF=n the compiler complains like shown below. Fix this warning by instead using sector_div() which is provided by the kernel.h header file. fs/nfs/blocklayout/extents.c: In function ‘normalize’: include/asm-generic/div64.h:43:28: warning: comparison of distinct pointer types lacks a cast [enabled by default] fs/nfs/blocklayout/extents.c:47:13: note: in expansion of macro ‘do_div’ nfs/blocklayout/extents.c:47:2: warning: right shift count >= width of type [enabled by default] fs/nfs/blocklayout/extents.c:47:2: warning: passing argument 1 of ‘__div64_32’ from incompatible pointer type [enabled by default] include/asm-generic/div64.h:35:17: note: expected ‘uint64_t *’ but argument is of type ‘sector_t *’ extern uint32_t __div64_32(uint64_t *dividend, uint32_t divisor); Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-04-02 | nfs4.1: properly handle ENOTSUP in SECINFO_NO_NAME | Weston Andros Adamson | 1 | -2/+2
commit 78b19bae0813bd6f921ca58490196abd101297bd upstream. Don't check for -NFS4ERR_NOTSUPP, it's already been mapped to -ENOTSUPP by nfs4_stat_to_errno. This allows the client to mount v4.1 servers that don't support SECINFO_NO_NAME by falling back to the "guess and check" method of nfs4_find_root_sec. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
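The mismatch fixed here is a comparison against an error code that has already been translated. The model below is a sketch: `stat_to_errno` is a toy stand-in for nfs4_stat_to_errno that maps only the one status relevant to this commit, and `find_root_sec` with its string results is hypothetical. The numeric values are the NFSv4 status for NFS4ERR_NOTSUPP and the kernel-internal ENOTSUPP.

```python
NFS4ERR_NOTSUPP = 10004   # NFSv4 protocol status
ENOTSUPP = 524            # Linux kernel-internal errno


def stat_to_errno(stat):
    """Toy stand-in for nfs4_stat_to_errno: protocol status to negative errno."""
    table = {NFS4ERR_NOTSUPP: -ENOTSUPP}
    return table.get(stat, -stat)


def find_root_sec(status_from_wire):
    """Pick a strategy after SECINFO_NO_NAME (hypothetical result strings)."""
    err = stat_to_errno(status_from_wire)
    if err == -ENOTSUPP:  # fixed check; the old code compared with -NFS4ERR_NOTSUPP
        return "guess-and-check"
    return "error" if err else "secinfo"
```

Comparing `err` with `-NFS4ERR_NOTSUPP` can never match once the mapping has run, which is why the fallback path was unreachable before the fix.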
2014-04-02 | NFSv4: OPEN must handle the NFS4ERR_IO return code correctly | Trond Myklebust | 1 | -16/+31
commit c7848f69ec4a8c03732cde5c949bd2aa711a9f4b upstream. decode_op_hdr() cannot distinguish between an XDR decoding error and the perfectly valid errorcode NFS4ERR_IO. This is normally not a problem, but for the particular case of OPEN, we need to be able to increment the NFSv4 open sequence id when the server returns a valid response. Reported-by: J Bruce Fields <bfields@fieldses.org> Link: http://lkml.kernel.org/r/20131204210356.GA19452@fieldses.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-01-03 | NFSv4: Update list of irrecoverable errors on DELEGRETURN | Trond Myklebust | 1 | -2/+8
commit c97cf606e43b85a6cf158b810375dd77312024db upstream. If the DELEGRETURN errors out with something like NFS4ERR_BAD_STATEID then there is no recovery possible. Just quit without returning an error. Also, note that the client must not assume that the NFSv4 lease has been renewed when it sees an error on DELEGRETURN. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-01-03 | NFSv4 wait on recovery for async session errors | Andy Adamson | 1 | -2/+1
commit 4a82fd7c4e78a1b7a224f9ae8bb7e1fd95f670e0 upstream. When the state manager is processing the NFS4CLNT_DELEGRETURN flag, session draining is off, but DELEGRETURN can still get a session error. The async handler calls nfs4_schedule_session_recovery returns -EAGAIN, and the DELEGRETURN done then restarts the RPC task in the prepare state. With the state manager still processing the NFS4CLNT_DELEGRETURN flag with session draining off, these DELEGRETURNs will cycle with errors filling up the session slots. This prevents OPEN reclaims (from nfs_delegation_claim_opens) required by the NFS4CLNT_DELEGRETURN state manager processing from completing, hanging the state manager in the __rpc_wait_for_completion_task in nfs4_run_open_task as seen in this kernel thread dump: kernel: 4.12.32.53-ma D 0000000000000000 0 3393 2 0x00000000 kernel: ffff88013995fb60 0000000000000046 ffff880138cc5400 ffff88013a9df140 kernel: ffff8800000265c0 ffffffff8116eef0 ffff88013fc10080 0000000300000001 kernel: ffff88013a4ad058 ffff88013995ffd8 000000000000fbc8 ffff88013a4ad058 kernel: Call Trace: kernel: [<ffffffff8116eef0>] ? cache_alloc_refill+0x1c0/0x240 kernel: [<ffffffffa0358110>] ? rpc_wait_bit_killable+0x0/0xa0 [sunrpc] kernel: [<ffffffffa0358152>] rpc_wait_bit_killable+0x42/0xa0 [sunrpc] kernel: [<ffffffff8152914f>] __wait_on_bit+0x5f/0x90 kernel: [<ffffffffa0358110>] ? rpc_wait_bit_killable+0x0/0xa0 [sunrpc] kernel: [<ffffffff815291f8>] out_of_line_wait_on_bit+0x78/0x90 kernel: [<ffffffff8109b520>] ? wake_bit_function+0x0/0x50 kernel: [<ffffffffa035810d>] __rpc_wait_for_completion_task+0x2d/0x30 [sunrpc] kernel: [<ffffffffa040d44c>] nfs4_run_open_task+0x11c/0x160 [nfs] kernel: [<ffffffffa04114e7>] nfs4_open_recover_helper+0x87/0x120 [nfs] kernel: [<ffffffffa0411646>] nfs4_open_recover+0xc6/0x150 [nfs] kernel: [<ffffffffa040cc6f>] ? 
nfs4_open_recoverdata_alloc+0x2f/0x60 [nfs] kernel: [<ffffffffa0414e1a>] nfs4_open_delegation_recall+0x6a/0xa0 [nfs] kernel: [<ffffffffa0424020>] nfs_end_delegation_return+0x120/0x2e0 [nfs] kernel: [<ffffffff8109580f>] ? queue_work+0x1f/0x30 kernel: [<ffffffffa0424347>] nfs_client_return_marked_delegations+0xd7/0x110 [nfs] kernel: [<ffffffffa04225d8>] nfs4_run_state_manager+0x548/0x620 [nfs] kernel: [<ffffffffa0422090>] ? nfs4_run_state_manager+0x0/0x620 [nfs] kernel: [<ffffffff8109b0f6>] kthread+0x96/0xa0 kernel: [<ffffffff8100c20a>] child_rip+0xa/0x20 kernel: [<ffffffff8109b060>] ? kthread+0x0/0xa0 kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20 The state manager can not therefore process the DELEGRETURN session errors. Change the async handler to wait for recovery on session errors. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [bwh: Backported to 3.2: - Adjust context - There's no restart_call label] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2014-01-03 | NFSv4: Fix a use-after-free situation in _nfs4_proc_getlk() | Trond Myklebust | 1 | -0/+1
commit a6f951ddbdfb7bd87d31a44f61abe202ed6ce57f upstream. In nfs4_proc_getlk(), when some error causes a retry of the call to _nfs4_proc_getlk(), we can end up with Oopses of the form BUG: unable to handle kernel NULL pointer dereference at 0000000000000134 IP: [<ffffffff8165270e>] _raw_spin_lock+0xe/0x30 <snip> Call Trace: [<ffffffff812f287d>] _atomic_dec_and_lock+0x4d/0x70 [<ffffffffa053c4f2>] nfs4_put_lock_state+0x32/0xb0 [nfsv4] [<ffffffffa053c585>] nfs4_fl_release_lock+0x15/0x20 [nfsv4] [<ffffffffa0522c06>] _nfs4_proc_getlk.isra.40+0x146/0x170 [nfsv4] [<ffffffffa052ad99>] nfs4_proc_lock+0x399/0x5a0 [nfsv4] The problem is that we don't clear the request->fl_ops after the first try and so when we retry, nfs4_set_lock_state() exits early without setting the lock stateid. Regression introduced by commit 70cc6487a4e08b8698c0e2ec935fb48d10490162 (locks: make ->lock release private data before returning in GETLK case) Reported-by: Weston Andros Adamson <dros@netapp.com> Reported-by: Jorge Mora <mora@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-09-10 | NFSv4.1: integer overflow in decode_cb_sequence_args() | Dan Carpenter | 1 | -3/+3
commit 0439f31c35d1da0b28988b308ea455e38e6a350d upstream. This seems like it could overflow on 32 bits. Use kmalloc_array() which has overflow protection built in. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
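The overflow protection mentioned above can be illustrated in userspace. This is a minimal sketch, assuming a hypothetical helper named alloc_array(); it is not the kernel's kmalloc_array() implementation, only the same idea of refusing a count * size product that would wrap around.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace analogue of an overflow-checked array allocation.
 * Returns NULL instead of allocating a too-small buffer when n * size
 * cannot be represented in a size_t.
 */
static void *alloc_array(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;        /* the multiplication would wrap around */
    return malloc(n * size);
}
```

Without the check, a large element count taken from the wire could wrap the product to a small value on a 32-bit size_t, and later writes would run past the end of the allocation.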
2013-04-10 | NFS: nfs_getaclargs.acl_len is a size_t | Chuck Lever | 1 | -1/+2
commit 56d08fef2369d5ca9ad2e1fc697f5379fd8af751 upstream.

Squelch compiler warnings:

 fs/nfs/nfs4proc.c: In function ‘__nfs4_get_acl_uncached’:
 fs/nfs/nfs4proc.c:3811:14: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
 fs/nfs/nfs4proc.c:3818:15: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]

Introduced by commit bf118a34 "NFSv4: include bitmap in nfsv4 get acl data", Dec 7, 2011.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-10 | NFSv4.1: Fix a race in pNFS layoutcommit | Trond Myklebust | 2 | -15/+18
commit a073dbff359f4741013ae4b8395f5364c5e00b48 upstream. We need to clear the NFS_LSEG_LAYOUTCOMMIT bits atomically with the NFS_INO_LAYOUTCOMMIT bit, otherwise we may end up with situations where the two are out of sync. The first half of the problem is to ensure that pnfs_layoutcommit_inode clears the NFS_LSEG_LAYOUTCOMMIT bit through pnfs_list_write_lseg. We still need to keep the reference to those segments until the RPC call is finished, so in order to make it clear _where_ those references come from, we add a helper pnfs_list_write_lseg_done() that cleans up after pnfs_list_write_lseg. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Benny Halevy <bhalevy@tonian.com> [bwh: Backported to 3.2: s/pnfs_put_lseg/put_lseg/] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-04-10 | pnfs-block: removing DM device maybe cause oops when call dev_remove | fanchaoting | 1 | -2/+2
commit 4376c94618c26225e69e17b7c91169c45a90b292 upstream. When the pNFS block layout uses device mapper, a later umount may cause an oops. We allocate "1 + sizeof(bl_umount_request)" bytes for msg->data, but this buffer may overflow when we do "memcpy(&dataptr [sizeof(bl_msg)], &bl_umount_request, sizeof(bl_umount_request))", because the size of bl_msg is more than 1 byte. Signed-off-by: fanchaoting<fanchaoting@cn.fujitsu.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [bwh: Backported to 3.2: - In dev_remove(), msg is a structure not a pointer to it] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
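The sizing problem described above can be sketched in userspace as follows. The struct layouts and the helper build_umount_msg() are hypothetical stand-ins, not the kernel's bl_msg or bl_umount_request definitions; the point is only that the buffer must be sized from sizeof(header), because the payload is copied at that offset.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the message header and the umount payload. */
struct msg_header { uint8_t type; uint16_t totallen; };
struct umount_request { uint32_t major; uint32_t minor; };

/*
 * Build "header followed by payload" in one heap buffer.
 * The length uses sizeof(struct msg_header), not a literal 1, so the
 * second memcpy() below always stays inside the allocation.
 */
static unsigned char *build_umount_msg(const struct umount_request *req,
                                       size_t *len_out)
{
    size_t len = sizeof(struct msg_header) + sizeof(*req);
    unsigned char *buf = calloc(1, len);
    struct msg_header hdr = { .type = 1, .totallen = (uint16_t)sizeof(*req) };

    if (buf == NULL)
        return NULL;
    memcpy(buf, &hdr, sizeof(hdr));
    memcpy(&buf[sizeof(hdr)], req, sizeof(*req));
    *len_out = len;
    return buf;
}
```

With a header larger than one byte, a buffer of 1 + sizeof(payload) bytes would be too short by sizeof(header) - 1 bytes for the same two copies.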
2013-03-20 | NFS: Don't allow NFS silly-renamed files to be deleted, no signal | Trond Myklebust | 1 | -7/+13
commit 5a7a613a47a715711b3f2d3322a0eac21d459166 upstream.

Commit 73ca100 broke the code that prevents the client from deleting a silly renamed dentry. This affected "delete on last close" semantics as after that commit, nothing prevented removal of silly-renamed files. As a result, a process holding a file open could easily get an ESTALE on the file in a directory where some other process issued 'rm -rf some_dir_containing_the_file' twice. Before the commit, any attempt at unlinking silly renamed files would fail inside may_delete() with -EBUSY because of the DCACHE_NFSFS_RENAMED flag.

The following testcase demonstrates the problem:

 tail -f /nfsmnt/dir/file &
 rm -rf /nfsmnt/dir
 rm -rf /nfsmnt/dir
 # second removal does not fail, 'tail' process receives ESTALE

The problem with the above commit is that it unhashes the old and new dentries from the lookup path, even in the normal case when a signal is not encountered and it would have been safe to call d_move. Unfortunately the old dentry has the special DCACHE_NFSFS_RENAMED flag set on it. Unhashing has the side-effect that future lookups call d_alloc(), allocating a new dentry without the special flag for any silly-renamed files. As a result, subsequent calls to unlink silly renamed files do not fail but allow the removal to go through. This will result in ESTALE errors for any other process doing operations on the file.

To fix this, go back to using d_move on success. For the signal case, it's unclear what we may safely do beyond d_drop.

Reported-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-03-06 | umount oops when remove blocklayoutdriver first | fanchaoting | 2 | -0/+2
commit 5a12cca697aca5dfba42a7d4c3356acc0445a2b0 upstream.

When the pNFS client uses the block layout, the blocklayoutdriver module can be removed while the filesystem is still mounted. A later umount can then cause an oops in unset_pnfs_layoutdriver, because nfss->pnfs_curr_ld->clear_layoutdriver is invalid.

To reproduce it:

 modprobe blocklayoutdriver
 mount -t nfs4 -o minorversion=1 pnfsip:/ /mnt/
 rmmod blocklayoutdriver
 umount /mnt

The following oops is then reported:

CPU 0
Pid: 17023, comm: umount.nfs4 Tainted: GF O 3.7.0-rc6-pnfs #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
RIP: 0010:[<ffffffffa04cfe6d>] [<ffffffffa04cfe6d>] unset_pnfs_layoutdriver+0x1d/0x70 [nfsv4]
RSP: 0018:ffff8800022d9e48 EFLAGS: 00010286
RAX: ffffffffa04a1b00 RBX: ffff88000b013800 RCX: 0000000000000001
RDX: ffffffff81ae8ee0 RSI: ffff880001ee94b8 RDI: ffff88000b013800
RBP: ffff8800022d9e58 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880001ee9400
R13: ffff8800105978c0 R14: 00007fff25846c08 R15: 0000000001bba550
FS: 00007f45ae7f0700(0000) GS:ffff880012c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffffa04a1b38 CR3: 0000000002c0c000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount.nfs4 (pid: 17023, threadinfo ffff8800022d8000, task ffff880006e48aa0)
Stack:
 ffff8800105978c0 ffff88000b013800 ffff8800022d9e78 ffffffffa04cd0ce
 ffff8800022d9e78 ffff88000b013800 ffff8800022d9ea8 ffffffffa04755a7
 ffff8800022d9ea8 ffff880002f96400 ffff88000b013800 ffff880002f96400
Call Trace:
 [<ffffffffa04cd0ce>] nfs4_destroy_server+0x1e/0x30 [nfsv4]
 [<ffffffffa04755a7>] nfs_free_server+0xb7/0x150 [nfs]
 [<ffffffffa047d4d5>] nfs_kill_super+0x35/0x40 [nfs]
 [<ffffffff81178d35>] deactivate_locked_super+0x45/0x70
 [<ffffffff8117986a>] deactivate_super+0x4a/0x70
 [<ffffffff81193ee2>] mntput_no_expire+0xd2/0x130
 [<ffffffff81194d62>] sys_umount+0x72/0xe0
 [<ffffffff8154af59>] system_call_fastpath+0x16/0x1b
Code: 06 e1 b8 ea ff ff ff eb 9e 0f 1f 44 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 8b 87 80 03 00 00 48 89 fb 48 85 c0 74 29 <48> 8b 40 38 48 85 c0 74 02 ff d0 48 8b 03 3e ff 48 04 0f 94 c2
RIP [<ffffffffa04cfe6d>] unset_pnfs_layoutdriver+0x1d/0x70 [nfsv4]
 RSP <ffff8800022d9e48>
CR2: ffffffffa04a1b38
---[ end trace 29f75aaedda058bf ]---

Signed-off-by: fanchaoting<fanchaoting@cn.fujitsu.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-03-06 | NFSv4.1: Don't decode skipped layoutgets | Weston Andros Adamson | 1 | -1/+2
commit 085b7a45c63d3da5be155faab9249a5cab224561 upstream. layoutget's prepare hook can call rpc_exit with status = NFS4_OK (0). Because of this, nfs4_proc_layoutget can't depend on a 0 status to mean that the RPC was successfully sent, received and parsed. To fix this, use the result's len member to see if parsing took place. This fixes the following OOPS -- calling xdr_init_decode() with a buffer length 0 doesn't set the stream's 'p' member and ends up using uninitialized memory in filelayout_decode_layout. BUG: unable to handle kernel paging request at 0000000000008050 IP: [<ffffffff81282e78>] memcpy+0x18/0x120 PGD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/pci0000:00/0000:00:11.0/0000:02:01.0/irq CPU 1 Modules linked in: nfs_layout_nfsv41_files nfs lockd fscache auth_rpcgss nfs_acl autofs4 sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log dm_mod ppdev parport_pc parport snd_ens1371 snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc e1000 microcode vmware_balloon i2c_piix4 i2c_core sg shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix mptspi mptscsih mptbase scsi_transport_spi [last unloaded: speedstep_lib] Pid: 1665, comm: flush-0:22 Not tainted 2.6.32-356-test-2 #2 VMware, Inc. 
VMware Virtual Platform/440BX Desktop Reference Platform RIP: 0010:[<ffffffff81282e78>] [<ffffffff81282e78>] memcpy+0x18/0x120 RSP: 0018:ffff88003dfab588 EFLAGS: 00010206 RAX: ffff88003dc42000 RBX: ffff88003dfab610 RCX: 0000000000000009 RDX: 000000003f807ff0 RSI: 0000000000008050 RDI: ffff88003dc42000 RBP: ffff88003dfab5b0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000080 R12: 0000000000000024 R13: ffff88003dc42000 R14: ffff88003f808030 R15: ffff88003dfab6a0 FS: 0000000000000000(0000) GS:ffff880003420000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000000000008050 CR3: 000000003bc92000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process flush-0:22 (pid: 1665, threadinfo ffff88003dfaa000, task ffff880037f77540) Stack: ffffffffa0398ac1 ffff8800397c5940 ffff88003dfab610 ffff88003dfab6a0 <d> ffff88003dfab5d0 ffff88003dfab680 ffffffffa01c150b ffffea0000d82e70 <d> 000000508116713b 0000000000000000 0000000000000000 0000000000000000 Call Trace: [<ffffffffa0398ac1>] ? xdr_inline_decode+0xb1/0x120 [sunrpc] [<ffffffffa01c150b>] filelayout_decode_layout+0xeb/0x350 [nfs_layout_nfsv41_files] [<ffffffffa01c17fc>] filelayout_alloc_lseg+0x8c/0x3c0 [nfs_layout_nfsv41_files] [<ffffffff8150e6ce>] ? __wait_on_bit+0x7e/0x90 Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-02-06 | NFSv4.1: Handle NFS4ERR_DELAY when resetting the NFSv4.1 session | Trond Myklebust | 1 | -2/+12
commit c489ee290bdbbace6bb63ebe6ebd4dd605819495 upstream. NFS4ERR_DELAY is a legal reply when we call DESTROY_SESSION. It usually means that the server is busy handling an unfinished RPC request. Just sleep for a second and then retry. We also need to be able to handle the NFS4ERR_BACK_CHAN_BUSY return value. If the NFS server has outstanding callbacks, we just want to similarly sleep & retry. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
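The "sleep and retry" handling described above can be sketched as a small loop. Everything here is hypothetical: the status codes are local stand-ins for NFS4ERR_DELAY and NFS4ERR_BACK_CHAN_BUSY, and fake_server simulates a server that is busy for a few replies. The backoff is a callback so that a caller can plug in a one-second sleep.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the NFSv4.1 status codes named above. */
enum { ST_OK = 0, ST_DELAY = 1, ST_BACK_CHAN_BUSY = 2, ST_OTHER = 3 };

typedef int (*op_fn)(void *ctx);
typedef void (*backoff_fn)(void *ctx);

/* Simulated server: answers ST_DELAY for the first busy_replies calls. */
struct fake_server { int busy_replies; int calls; int waits; };

static int fake_destroy_session(void *ctx)
{
    struct fake_server *s = ctx;
    s->calls++;
    return (s->busy_replies-- > 0) ? ST_DELAY : ST_OK;
}

static void fake_backoff(void *ctx)
{
    struct fake_server *s = ctx;
    s->waits++;     /* a real caller would sleep here, e.g. for one second */
}

/*
 * Call op() until it returns something other than the two "busy" codes,
 * backing off between attempts. At least one attempt is always made and
 * no more than max_tries attempts are made when max_tries >= 1.
 */
static int retry_while_busy(op_fn op, backoff_fn backoff, void *ctx,
                            int max_tries)
{
    for (int attempt = 1; ; attempt++) {
        int status = op(ctx);

        if (status != ST_DELAY && status != ST_BACK_CHAN_BUSY)
            return status;      /* success or a non-retryable error */
        if (attempt >= max_tries)
            return status;      /* give up, report the busy status */
        backoff(ctx);
    }
}
```

The bound on attempts is a choice made for this sketch so that it terminates; the commit message itself only says to sleep for a second and retry.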
2013-02-06 | NFS: Don't silently fail setattr() requests on mountpoints | Trond Myklebust | 1 | -0/+20
commit ab225417825963b6dc66be7ea80f94ac1378dfdf upstream. Ensure that any setattr and getattr requests for junctions and/or mountpoints are sent to the server. Ever since commit 0ec26fd0698 (vfs: automount should ignore LOOKUP_FOLLOW), we have silently dropped any setattr requests to a server-side mountpoint. For referrals, we have silently dropped both getattr and setattr requests. This patch restores the original behaviour for setattr on mountpoints, and tries to do the same for referrals, provided that we have a filehandle... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-01-16 | nfs: fix null checking in nfs_get_option_str() | Xi Wang | 1 | -1/+1
commit e25fbe380c4e3c09afa98bcdcd9d3921443adab8 upstream.

The following null pointer check is broken.

 *option = match_strdup(args);
 return !option;

The pointer `option' must be non-null, and thus `!option' is always false. Use `!*option' instead.

The bug was introduced in commit c5cb09b6f8 ("Cleanup: Factor out some cut-and-paste code.").

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
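The difference between the two checks can be reproduced in userspace. This is a minimal sketch with hypothetical helpers: dup_string() stands in for match_strdup(), and a NULL input simulates an allocation failure. It is not the kernel's nfs_get_option_str().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for match_strdup(): NULL input simulates failure. */
static char *dup_string(const char *s)
{
    size_t n;
    char *p;

    if (s == NULL)
        return NULL;
    n = strlen(s) + 1;
    p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Broken variant: tests the out-parameter itself, which is never NULL. */
static int get_option_str_broken(const char *arg, char **option)
{
    *option = dup_string(arg);
    return !option;         /* always 0, so failures go unreported */
}

/* Fixed variant: tests the pointer that was just stored. */
static int get_option_str(const char *arg, char **option)
{
    *option = dup_string(arg);
    return !*option;        /* 1 when the duplication failed */
}
```

Both functions return nonzero to signal failure; only the second one can ever do so.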
2013-01-03 | NFS: avoid NULL dereference in nfs_destroy_server | NeilBrown | 1 | -2/+1
commit f259613a1e4b44a0cf85a5dafd931be96ee7c9e5 upstream. In rare circumstances, nfs_clone_server() of a v2 or v3 server can get an error between setting server->destroy (to nfs_destroy_server), and calling nfs_start_lockd (which will set server->nlm_host). If this happens, nfs_clone_server will call nfs_free_server which will call nfs_destroy_server and thence nlmclnt_done(NULL). This causes the NULL to be dereferenced. So add a guard to only call nlmclnt_done() if ->nlm_host is not NULL. The other guards there are irrelevant as nlm_host can only be non-NULL if one of these flags is set - so remove those tests. (Thanks to Trond for this suggestion). This is suitable for any stable kernel since 2.6.25. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2013-01-03 | NFS: Add sequence_priviliged_ops for nfs4_proc_sequence() | Bryan Schumaker | 1 | -4/+17
commit 6bdb5f213c4344324f600dde885f25768fbd14db upstream. If I mount an NFS v4.1 server to a single client multiple times and then run xfstests over each mountpoint I usually get the client into a state where recovery deadlocks. The server informs the client of a cb_path_down sequence error, the client then does a bind_connection_to_session and checks the status of the lease. I found that bind_connection_to_session sets the NFS4_SESSION_DRAINING flag on the client, but this flag is never unset before nfs4_check_lease() reaches nfs4_proc_sequence(). This causes the client to deadlock, halting all NFS activity to the server. nfs4_proc_sequence() is only called by the state manager, so I can change it to run in privileged mode to bypass the NFS4_SESSION_DRAINING check and avoid the deadlock. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16 | nfs: Show original device name verbatim in /proc/*/mount{s,info} | Ben Hutchings | 4 | -9/+20
commit 97a54868262da1629a3e65121e65b8e8c4419d9f upstream. Since commit c7f404b ('vfs: new superblock methods to override /proc/*/mount{s,info}'), nfs_path() is used to generate the mounted device name reported back to userland. nfs_path() always generates a trailing slash when the given dentry is the root of an NFS mount, but userland may expect the original device name to be returned verbatim (as it used to be). Make this canonicalisation optional and change the callers accordingly. [jrnieder@gmail.com: use flag instead of bool argument] Reported-and-tested-by: Chris Hiestand <chiestand@salk.edu> Reference: http://bugs.debian.org/669314 Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [bwh: Backported to 3.2: - Adjust context - nfs_show_devname() still takes a pointer to struct vfsmount]
2012-11-16 | nfsv3: Make v3 mounts fail with ETIMEDOUTs instead EIO on mountd timeouts | Scott Mayhew | 1 | -1/+1
commit acce94e68a0f346115fd41cdc298197d2d5a59ad upstream. In a very busy v3 environment, rpc.mountd can respond to the NULL procedure but not the MNT procedure in a timely manner, causing the MNT procedure to time out. The problem is that the mount system call returns EIO, which causes the mount to fail, instead of ETIMEDOUT, which would cause the mount to be retried. This patch passes the RPC_TASK_SOFT|RPC_TASK_TIMEOUT flags to the rpc_call_sync() call in nfs_mount(), which causes ETIMEDOUT to be returned on timed out connections. Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16 | NFS: fix bug in legacy DNS resolver. | NeilBrown | 1 | -2/+3
commit 8d96b10639fb402357b75b055b1e82a65ff95050 upstream. The DNS resolver's use of the sunrpc cache involves a 'ttl' number (relative) rather than a timeout (absolute). This confused me when I wrote commit c5b29f885afe890f953f7f23424045cdad31d3e4 "sunrpc: use seconds since boot in expiry cache" and I managed to break it. The effect is that any TTL is interpreted as 0, and nothing useful gets into the cache. This patch removes the use of get_expiry() - which really expects an expiry time - and uses get_uint() instead, treating the int correctly as a ttl. This fixes a regression that has been present since 2.6.37, causing certain NFS accesses in certain environments to incorrectly fail. Reported-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
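The relative-versus-absolute confusion can be shown with a few lines of arithmetic. This is a sketch with hypothetical helper names and a plain seconds counter; it illustrates the general mistake of reading a TTL as an expiry time and does not reproduce the exact computation inside get_expiry().

```c
/* Seconds on some monotonic clock, e.g. seconds since boot. */
typedef long long seconds_t;

/* Correct handling: a TTL is relative, so add it to the current time. */
static seconds_t expiry_from_ttl(seconds_t now, unsigned int ttl)
{
    return now + (seconds_t)ttl;
}

/* A cache entry is usable only while the current time is before its expiry. */
static int entry_is_valid(seconds_t now, seconds_t expiry)
{
    return now < expiry;
}
```

With now = 1000 and a TTL of 60, the correct expiry is 1060 and the entry is valid. Storing the raw value 60 as if it were an absolute expiry leaves an entry that is already stale at time 1000, so nothing useful stays in the cache.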
2012-11-16 | NFSv4: nfs4_locku_done must release the sequence id | Trond Myklebust | 1 | -0/+1
commit 2b1bc308f492589f7d49012ed24561534ea2be8c upstream. If the state recovery machinery is triggered by the call to nfs4_async_handle_error() then we can deadlock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2012-11-16 | NFSv4.1: We must release the sequence id when we fail to get a session slot | Trond Myklebust | 1 | -13/+23
commit 2240a9e2d013d8269ea425b73e1d7a54c7bc141f upstream. If we do not release the sequence id in cases where we fail to get a session slot, then we can deadlock if we hit a recovery scenario. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [bwh: Backported to 3.2: - Adjust context - nfs4_setup_sequence() has an additional 'cache_reply' parameter] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>